[Fair warning: what follows is quite lengthy]

Well, it is performance review time at work and this reminded me of a post I’ve been meaning to write for a while.

An issue that has always interested me is how organizations measure individual performance.  Organizations have finite resources and therefore must deploy those resources in the most efficient manner, maximizing their value.  Given the large percentage of resources invested in personnel, organizations have a huge incentive to get those investments right.  However, calculating an accurate ROI on employees is probably one of the hardest things to do.  To explore why–and how it might be done better–I turn to the world of sports, baseball in particular.

There has long been a debate within the Sabermetric community (and between purists and Sabermetricians) regarding the statistical relevance of “clutch”: the ability of a player to elevate their performance in key situations in a manner significantly different from their performance in normal situations.  Early research by such pioneers as Bill James found that the attribute of clutch didn’t exist–much like the idea of a “hot hand” in basketball, the appearance of a clutch performance (e.g. a usually mediocre batter managing to hit .500 in a playoff series) was nothing more than a statistical artifact.  If you look at any 5-7 game stretch during the 162-game regular season, you will find plenty of stretches where even average hitters go 4 for 8 or 8 for 16, just as you will find stretches where they go 2 for 8 or 0 for 16.  Over a long enough time period, these streaks even out and players regress to their mean performance: a .250 hitter will end up hitting close to .250, despite occasional swings to the far left and right sides of the bell curve.
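
To put rough numbers on that claim, here is a minimal sketch that treats each at-bat as an independent trial weighted by the hitter's batting average. That independence is my simplifying assumption (real at-bats depend on the pitcher, the park, and plenty else), and the function name `prob_at_least` is made up for illustration.

```python
from math import comb


def prob_at_least(hits: int, at_bats: int, avg: float) -> float:
    """Probability of getting at least `hits` hits in `at_bats` at-bats.

    Assumes every at-bat is an independent trial with success
    probability `avg` (a simplification, not a real model of hitting).
    """
    return sum(
        comb(at_bats, k) * avg**k * (1 - avg) ** (at_bats - k)
        for k in range(hits, at_bats + 1)
    )


# A .250 hitter going 4 for 8 or better in a given 8 at-bat stretch
hot_streak = prob_at_least(4, 8, 0.250)
print(f"4+ hits in 8 at-bats: {hot_streak:.3f}")  # about 0.114
```

Under those assumptions, a .250 hitter goes 4 for 8 or better in roughly one of every nine such stretches, so seeing it happen in a playoff series tells us very little on its own.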

More recent studies have looked to expand upon earlier research and refine how we search for clutch performances.  A common way to do this is to look not at single games that were more important (e.g. post-season play), but at particular moments that alter the probability of a team winning that particular game.  This approach has been termed “leverage”:

the swing in the possible change in win probability. If there is a game with one team leading by ten runs, the possible changes in win probability, whether the event is a home run or a double play, will be very close to negligible. That is, there won’t be much swing in any direction.

But, in a late and close game, the change in win probability among the various events will have rather wild swings. With a runner on first, two outs, down by one, and in the bottom of the ninth, the game can hinge on one swing of the bat—a home run and an out will both end the game, but with vastly different outcomes for the teams involved.
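
As a rough illustration of that definition, the sketch below measures the swing as the gap between the best and worst win probability a single plate appearance could leave the batting team with. The blowout win probabilities are numbers I invented for the example, and the function name `win_prob_swing` is mine. Real leverage calculations draw on historical win-probability data and are more involved than this.

```python
from typing import Mapping


def win_prob_swing(win_prob_after: Mapping[str, float]) -> float:
    """Gap between the best and worst win probability after one event.

    `win_prob_after` maps each possible event to the batting team's
    win probability once that event has happened.
    """
    return max(win_prob_after.values()) - min(win_prob_after.values())


# Hypothetical blowout: leading by ten runs, so nothing moves the needle much.
blowout = {"home run": 0.99, "double play": 0.98}

# Bottom of the ninth, two outs, runner on first, down by one:
# a home run wins the game outright and an out loses it.
late_and_close = {"home run": 1.0, "out": 0.0}

print(f"blowout swing: {win_prob_swing(blowout):.2f}")
print(f"late-and-close swing: {win_prob_swing(late_and_close):.2f}")
```

The blowout swing is tiny while the late-and-close swing covers the whole range from a certain loss to a certain win, which is the contrast the quote is describing.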

From left to right: A-Rod and Jeter

An excellent example of this debate over which players are clutch and which are not is the case of Alex Rodriguez. Long touted as one of the greatest hitters of all time, A-Rod has consistently been criticized as a non-clutch performer, accused of accumulating the bulk of his stats in low-leverage (i.e. low-pressure) situations. One could not listen to NY sports radio over the past five years without hearing fans decry the ‘superficial’ performance of the Yankee slugger while praising the clutchness of his statistically less impressive teammate, Derek Jeter. Part of the reason for this view is the outsized effect of high-impact events on what we remember, along with the framing of the debate by the media. But when we look at both players’ statistics in leveraged situations, a much different picture emerges:

[As of 2008] Mr. Rodriguez has hit for the clutch throughout his remarkable, surefire Hall of Fame career. His career OPS in high-leverage situations is .975. In medium-leverage, it’s .960. And in low-leverage, it’s .972. That’s consistent with the American League as a whole during his career, when each year batters in high-leverage situations hit somewhere between 1% worse and 6% better than they did in low-leverage situations.
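
Here is a quick check of the arithmetic behind that comparison, using only the OPS figures quoted above. The helper name `relative_difference` is my own, and treating the league range as 1% worse to 6% better in OPS terms is my reading of the quote, which does not say which statistic it is measuring.

```python
def relative_difference(high: float, low: float) -> float:
    """How much better (positive) or worse (negative) `high` is than `low`."""
    return (high - low) / low


# Career OPS figures quoted above (as of 2008)
high_leverage_ops = 0.975
low_leverage_ops = 0.972

diff = relative_difference(high_leverage_ops, low_leverage_ops)
print(f"high vs. low leverage: {diff:+.1%}")  # about +0.3%

# Assumed reading of the quoted league range: 1% worse to 6% better
within_league_range = -0.01 <= diff <= 0.06
print(f"within league range: {within_league_range}")
```

By this measure A-Rod's high-leverage OPS is about 0.3% better than his low-leverage OPS, which sits comfortably inside that range.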

Additionally, if you look at the playoffs (where most look for clutch performance), A-Rod has accumulated the following statistics (including this year): .299 BA/.393 OBP/.958 OPS. Here are Jeter’s: .309 BA/.379 OBP/.858 OPS. In series where his team is playing for a spot in the World Series, A-Rod’s stats are even better, while Jeter’s go south.

The goal here is not to cheerlead for Alex Rodriguez, but rather to make two points that apply to any organization or field when thinking about the value of personnel:

  1. When evaluating performance we should pay attention to results over the long term, not an arbitrary chunk of time. Anyone can ‘get hot’, but that doesn’t mean the individual willed themselves to a better performance or that current success is a predictor of future success. Discrete outcomes are quite dependent on factors outside of an individual’s control as well as the randomness of performance. This works both ways–for successes and for shortcomings.
  2. Single, high-impact events can significantly skew our perspective. Objectively evaluating the quality of individuals is difficult since we tend to use anecdotes as mental shortcuts. Big wins are important, but they shouldn’t blind us to the overall performance of an individual. Those big moments could easily be the result of random chance rather than talent. Now, we all benefit from luck from time to time. However, much time and treasure has been wasted on individuals who ‘came up big’ once while management waits for them to rise to the occasion yet again. This applies equally to ‘hot’ baseball prospects, CEOs, coaches, sales personnel, programmers, managers, etc.

So what should we do? How do we better measure human capital performance? I think partly we should take a page from baseball. New measurements have emerged over time to better evaluate player performance. Taking an analytical approach to measuring performance is step one. It shouldn’t be the only measurement, but without appropriate data organizations are simply basing their evaluations on subjectivity and intuition–you need a balance. With new measurements comes a new mindset; the prism through which we view the world changes so that instead of looking for those ‘big moments’ we now look for consistent performance over the long term. A healthy appreciation for randomness, chance, and methods for accurately measuring performance are all good first steps towards more accurately evaluating human capital.

For those who made it this far, thanks for sticking with me on this one.  I would love to hear your thoughts.