(This article originally appeared at Beyond the Box Score, where I am now a regular contributor)
2010 marks the end of the “aughts” decade for Major League Baseball. I thought I would take the opportunity to analyze the last 10 years by visualizing team data. I used Tableau Public to create the visualization and pulled team data from ESPN.com (on-field statistics) and USA Today (team payroll).
The data is presented through three dashboards. The first shows the relationship between run differential (RunDiff) and OPS differential (OPSDiff), as well as the cost per win for teams. The second compares expected wins and actual wins in a scatter plot. The size of each team’s bubble represents the absolute difference between its actual and expected wins. Teams lying above the trend line were less lucky than their counterparts below the trend line. The final tab presents the relevant data in table form and can be sorted and filtered along a number of dimensions.
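For readers who want to reproduce the "luck" measure, here is a minimal sketch. It assumes expected wins come from Bill James's Pythagorean expectation with an exponent of 2; the article does not say which formula the source data uses, so treat that choice, the function names, and the sample numbers as illustrative assumptions.

```python
def pythagorean_expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Expected wins from runs scored and allowed.

    Assumption: Pythagorean expectation with exponent 2, which may not be
    the formula behind the expected-wins figures in the visualization.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


def luck(actual_wins, expected_wins):
    """Positive means the team won more games than its run totals suggest."""
    return actual_wins - expected_wins


# Hypothetical team: 800 runs scored, 700 allowed, 90 actual wins.
expected = pythagorean_expected_wins(800, 700)
bubble_size = abs(luck(90, expected))  # the bubble size is the absolute gap
print(round(expected, 1), round(bubble_size, 1))
```

The absolute value mirrors how the bubbles are sized: a team that wins five more games than expected gets the same bubble as one that wins five fewer.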
The first visualization lists all 30 teams and provides their RunDiff, OPSDiff, wins, and cost per win for 2001-2010. The default view lists the averages per team over the past 10 years, but you can select a single year or range of years to examine averages over that time frame. The visualization also allows users to filter by whether teams made the playoffs, were division winners or wild card qualifiers, won a championship, or were in the AL or NL. The height of the bars corresponds to a team’s wins (or average wins over a range of years). The color of the bars corresponds to a team’s cost per win: the darker green the bar, the more costly a win was for that team. Total wins (or the average for a range of years) is listed at the end of each bar. In order to create the bar graph, I normalized the run and OPS differential data (added the absolute value of each score + 20) to make sure there were no negative values. For the decade, run differential explained about 88% of the variation in wins, and OPS differential explained about 89% of the variation in run differential.
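The two calculations behind those numbers can be sketched in a few lines. This is an illustration under stated assumptions, not the workbook's actual logic: cost per win is taken to be payroll divided by wins, "explained X% of the variation" is read as the R² of a simple linear fit, and all names and sample figures are made up.

```python
def cost_per_win(payroll_dollars, wins):
    """Payroll divided by wins (assumed definition of cost per win)."""
    return payroll_dollars / wins


def r_squared(xs, ys):
    """Share of the variation in ys explained by a simple linear fit on xs.

    For one predictor this equals the squared Pearson correlation.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


# Hypothetical figures, not taken from the article's data set.
print(cost_per_win(90_000_000, 90))  # a $90M payroll and 90 wins

run_diff = [-120, -40, 0, 55, 130]
wins = [68, 77, 82, 88, 97]
print(round(r_squared(run_diff, wins), 3))
```

Run on the real team-season table, `r_squared(run_diff, wins)` would be the figure to compare against the roughly 88% quoted above.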
The visualization illustrates the tight correlation between RunDiff and OPSDiff: the respective bars for each team are generally equidistant from the center line, creating an inverted V shape when sorted by RunDiff. In terms of average wins over the decade, there are few surprises, as the Yankees, Red Sox, Cardinals, Angels, and Braves make up the top 5. However, St. Louis did a much better job of winning efficiently, paying less per win than the other winningest teams (less than $1M per win).
The viz also illustrates the success of small-market teams such as Oakland and Minnesota, which both averaged roughly 88 wins while spending the 3rd and 4th least per win, respectively. If you filter the visualization for teams that averaged over 85 wins during the decade, it really drives home how impressive those two teams’ front offices have been at assembling winning ball clubs with lower payrolls. No other team that averaged more than 85 wins paid less than $975K per win. Oakland looks even more impressive when you isolate the data for years in which teams qualified for the playoffs. Oakland averaged 98.5 wins in the seasons it made the playoffs, and did so while spending only $478K per win.