Author Archives: David Sparks

The Arbitrarian: lim(n→today) p(Arbitrarian|n)=0

David Sparks is the Arbitrarian. His groundbreaking statistics column ran weekly here at HP. But now he’s moving on to working for an NBA team, while I’m stuck making Crazy Pills jokes with the rest of the animals. Sigh. Bastard. In all seriousness, David has been a joy to work with, and a great member of HP. We hope you’ve enjoyed his work as much as we have, and we wish him nothing but the best of luck in his future endeavors. He’s got a bright future in a league that discovers more and more every day the value of metrics. This is his farewell post. Do us a solid, and let David know how much you liked his stuff in the comments. Cheers, David. Once we’ve stopped crying over David and playing Peter Gabriel, we’ll work on getting a new stats columnist. Keep your eyes open.

At least for the time being, the Arbitrarian is going on hiatus. I have been honored with an offer to intern for a very successful professional basketball franchise, and so my statistical work will be for that team, no longer pro bono publico.

It has been an honor and a privilege to work with this group of excellent writers at HP, and it has been endlessly entertaining to engage with you readers in the process. I am sure that I have learned more than I have taught in the interaction–thank you for your patience and willingness to share your insight–I have appreciated every comment, response, and e-mail I have received.

Special thanks goes to my co-bloggers here, who have been nothing but supportive; the fine thinkers at APBRmetrics (to whom I owe much); and most of all my wife, who has encouraged me in all my endeavors, even this one.

I hope you will join me in continuing to follow Hardwood Paroxysm and its peerless coverage of our favorite league, and I truly hope that you think about the world with a different, more Arbitrarian, mindset.

The Arbitrarian: Operationalizing interestingness

David Sparks is the Arbitrarian. His stats column runs weekly here on Hardwood Paroxysm.

Which teams were the most interesting last year, and which might be the most interesting this coming year? Obviously, there are innumerable ways to conceptualize “interestingness” in basketball–amount of interpersonal drama, perhaps, or exciting style of play–but today I’m going to present a series of different takes on interestingness, and apply these measures to the NBA. There’s no way we can cover all possible understandings of what makes a basketball team interesting, but hopefully I can offer several reasonable-sounding operationalizations…

Predicting game outcomes

First, since several of the following definitions of interestingness rely on it, I want to define a very simplistic and abstracted way of predicting game outcomes. Imagine that you have two coins, each of which has a 50% chance of landing on heads, and a 50% chance of landing on tails. What are the odds of the various possible outcomes from flipping both simultaneously?

This is pretty straightforward, where we have two coins, A and B, and each can land Heads up or Tails up:


p(A=H&B=H) = 0.5*0.5 = 0.25
p(A=H&B=T) = 0.5*0.5 = 0.25
p(A=T&B=H) = 0.5*0.5 = 0.25
p(A=T&B=T) = 0.5*0.5 = 0.25

So, each H/T combination of two-coin outcomes has an equal chance of occurring, 25%.

What if A has a 60% chance of landing on Heads, and B has a 40% chance? The probabilities are somewhat different:


p(H,H) = 0.6*0.4 = 0.24
p(H,T) = 0.6*0.6 = 0.36
p(T,H) = 0.4*0.4 = 0.16
p(T,T) = 0.4*0.6 = 0.24

Here the probabilities are not all the same. Now, imagine that we say that a coin landing on Heads is the winner and a coin landing on Tails is the loser, and if they both land with the same face up, the two coins tie. Using the 60/40 probabilities above, we know that there is a 36% chance that A wins, a 16% chance that B wins, and a 48% chance that they tie. (Make sure this makes sense to you.)

Now, imagine that we disregard the ties (i.e. if they tie, we make them re-flip until there is no tie). What is the probability that A wins? Just take the probability from above (36%) and divide it by the universe of allowed outcomes (p(A wins) + p(B wins) = (36% + 16%) = 52%). You find 0.36/0.52 = 0.6923077. In other words, if you pair a 60% winning coin against a 40% winning coin, there is about a 69% chance that A lands on heads and B lands on tails, if you re-flip any ties.

How does this apply to basketball? Well, take Denver and Chicago from the 07-08 season. Denver’s winning percentage was about 60% and Chicago’s was about 40%. Assume that their winning percentage is analogous to the weights assigned to the coins above, that is, it’s the odds that they’ll win on any given night. Just as above, if we know that on one particular night one of these two teams won and the other lost, the chance that it was Denver that did the winning is about 69%, and the odds that it was Chicago who won are about 31%.

Now let’s pretend that a head-to-head game is just like the coin tossing face-off above–that is, it doesn’t matter how well the two teams match up, or who’s injured, or who’s on a hot streak, or whose defense will smother the other team’s best scorer, or what have you. Imagine if, on game night, the owners of the two teams met at center court and flipped coins weighted according to their season-long winning percentage, and re-flipped ties… that’s how I would predict the chances of each team winning.

Now, in the Denver v. Chicago case, both teams have a chance of winning (69% and 31%, respectively), but Denver is more likely to win, because their odds are greater than 50%. So, if you had to predict a single game, you’d go with Denver. However, if the teams met 1,000 times, you would expect Denver to win roughly 690 times, and Chicago to win about 310 times.

And that’s how I’ll be constructing win probabilities for the remainder of the post. As long as neither team is undefeated or has literally no wins, there is always at least a slim chance that the underdog team can win. Applying this algorithm to a Boston v. Miami game in the 07-08 season, you’d get a 94.852% chance of Boston winning, meaning that it’s highly unlikely, but not impossible for Miami to have won that matchup.
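If you’d like to play along at home, here is a minimal sketch of this matchup model in Python. The function name and inputs are my own illustration (season winning percentages as coin weights), not code from the actual analysis:

def win_prob(pa, pb):
    # Each team's season win% is the weight of its coin; ties are re-flipped,
    # so we normalize by the probability of a decisive outcome.
    p_a_wins = pa * (1 - pb)  # A heads, B tails
    p_b_wins = (1 - pa) * pb  # A tails, B heads
    return p_a_wins / (p_a_wins + p_b_wins)

print(win_prob(0.60, 0.40))    # Denver v. Chicago: ~0.692
print(win_prob(0.805, 0.183))  # Boston v. Miami, 07-08 records: ~0.94852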

Interestingness as unpredictability

If we know each team’s winning percentage, we can come up with predictions of a winner for each game in a season. If, for each game, we call the team with a greater than 50% chance of winning (essentially, the team with the better win%) the predicted winner, and compare these predictions to actual 2007-08 regular season outcomes, we find that we predict correctly 69.7% of the time–better than half, at least, and probably better than many of ESPN’s experts.¹
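For the curious, that accuracy figure is just a tally of how often the better-record team actually won. A sketch, assuming hypothetical data structures of my own devising:

def prediction_accuracy(games, win_pct):
    # games: list of (team_a, team_b, winner) tuples; win_pct: team -> season win%
    correct = 0
    for a, b, winner in games:
        predicted = a if win_pct[a] > win_pct[b] else b
        correct += (predicted == winner)
    return correct / len(games)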

One possible definition of interestingness is unpredictability–that is, if we know the outcome of the game beforehand, the game itself is likely to be less interesting (this is why people who miss the live broadcast, but plan to watch it later on TiVo, don’t want to be told the score). So, which teams were the most and least predictable?


UTA2008 0.610
SAC2008 0.622
ATL2008 0.634
DAL2008 0.634
IND2008 0.634
NJN2008 0.634
PHI2008 0.646
WAS2008 0.659
CLE2008 0.671
POR2008 0.671
CHA2008 0.683
CHI2008 0.683
HOU2008 0.683
DEN2008 0.695
GSW2008 0.695
MIL2008 0.695
NYK2008 0.695
ORL2008 0.695
MIN2008 0.707
NOH2008 0.707
LAL2008 0.720
DET2008 0.732
LAC2008 0.732
SAS2008 0.732
TOR2008 0.732
PHO2008 0.744
SEA2008 0.756
MEM2008 0.780
BOS2008 0.805
MIA2008 0.817

Note that Miami, having the worst record in the league, would have been predicted to win none of its games, while Boston was predicted to win all of its games, because it always had the better record in its matchups. Miami game outcomes were correctly predicted 81.7% of the time, which is (1-win%), and Boston outcomes were correctly predicted 80.5% of the time, which was their winning percentage. In general, teams with more extreme winning percentages are easier to predict, because in almost every matchup they are the clear favorite (or the clear underdog). Close-to-.500 teams are the hardest to predict correctly (in general). Thus, it is instructive to contrast predictability with record, as I do in the graph below:

The huge outlier is Toronto, oddly enough. Despite their exactly 0.500 record, Toronto’s game outcomes were correctly predicted almost 3/4 of the time–surprisingly high. The biggest outlier at the other end of the spectrum is Utah, who despite having one of the best records in the league (which should lead to ease of prediction), were the hardest to predict correctly.

Interestingness as upsets

Unpredictability just means that a team lost games it “should have” won, and won games it shouldn’t have. However, for a fan, losing games that should be won adds more to angst than interest. Which teams did the best relative to their opposition–making their fans happy by winning games they were predicted to win, and upsetting opponents who should have beaten them?

To determine this, for each game played by each team, I estimate the team’s probability of winning using the above methodology. Then, depending on the actual outcome, I assign that game a binary value: 1 for a win, 0 for a loss. To compute an “upset factor,” I subtract the predicted probability of winning from this binary won/lost variable.

Thus, if a team has a 72% chance of winning (it is substantially better than its opponent), and wins, the upset factor for that game is (1-0.72) = 0.28. Had they lost, the upset factor would be (0 – 0.72) = -0.72. A team with very little chance of defeating its opponent, say 6% (like Miami’s odds against Boston), would get 0.94 if they won, but just -0.06 if they lost. Thus teams are rewarded for winning (and punished for losing), but in proportion to their projected odds of winning.
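In code, the game-level calculation is a one-liner (my own sketch; the season figures in the table below are just the sum of these over each team’s games):

def upset_factor(p_win, won):
    # Actual outcome (1 for a win, 0 for a loss) minus predicted win probability.
    return (1.0 if won else 0.0) - p_win

upset_factor(0.72, True)   #  0.28: favorite wins, small reward
upset_factor(0.72, False)  # -0.72: favorite loses, large penalty
upset_factor(0.06, True)   #  0.94: big upset, large reward
upset_factor(0.06, False)  # -0.06: expected loss, small penalty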

Over the course of the season, the best teams will beat most of their opponents, and so should generally have positive cumulative upset factor sums. The worst teams will lose more often, and so should generally have negative cumulative upset factors. However, some teams will defy their probabilities, and outperform (or underperform) expectations, and thus a bad team which manages to be an occasional “Giant Killer” may have a season-sum upset index that defies its record. How does this look for 07-08?


IND2008 -1.481
NYK2008 -1.389
MIL2008 -1.351
ATL2008 -1.343
MIA2008 -1.207
CHA2008 -1.116
NJN2008 -1.030
CHI2008 -0.897
MEM2008 -0.786
MIN2008 -0.734
TOR2008 -0.610
SEA2008 -0.554
WAS2008 -0.495
PHI2008 -0.429
LAC2008 -0.379
ORL2008 -0.213
BOS2008 -0.005
DET2008 0.000
CLE2008 0.035
SAC2008 0.410
POR2008 0.902
DEN2008 0.903
GSW2008 0.964
UTA2008 1.293
DAL2008 1.375
PHO2008 1.464
HOU2008 1.551
NOH2008 1.615
LAL2008 1.654
SAS2008 1.851

As you can see, the “most upsetting” teams are some of the league’s best, which in this context means that they beat the teams they were expected to beat, and did not lose much to teams they should have beaten. In this respect, the Celtics are at a disadvantage, since based on their record, they should not have lost to anyone, and so every loss counts heavily against them.

One possible interpretation (and I stress “possible”) of these numbers is that the Spurs actually played 1.851 wins better than their record of 56 wins would indicate, given their opposition. The Celtics’ and Pistons’ actual records fairly accurately capture their ability given their opposition, and the Pacers’ 36-46 record is actually about a game-and-a-half too good, given how they lost to teams they should have defeated, and failed to upset many better teams.

Interestingness as potential

One final means of defining interest, as we head into the 08-09 season, is potential. Every player, at any given time, has a certain level of productivity, and this level of productivity varies in generally predictable ways: usually it takes several years in the league to climb to peak productivity, which is maintained for several more years, before a decline sets in. Typically, players are their most productive in the middle of their careers–rarely do they peak in their rookie year, and even more rarely do they leave the league at the top of their game.

It is possible, then, to think of a player’s potential as their current productivity, given their age or experience in the league. Extremely valuable players, if they are very young, have more “potential” than extremely valuable players in their late-20s. This is part of the reason there is always so much excitement about rookies and rising stars–any amount of success they find early on is likely only to increase as they come into their prime.

At the other end of the spectrum, players in the middle of their careers, who have still not managed to become highly valuable, have very little potential. Of course, all of this varies. Tim Duncan, even at his relatively advanced age, is still likely to be valuable in the near future, even if he doesn’t necessarily improve. However, a General Manager might be more inclined to sign Chris Paul to a long-term contract than Jason Kidd, even if the two had been equally productive last year–Paul just has more potential, given the success he has found, and given his age.

Thus, we can estimate, for every player, some index of potential, essentially by dividing value (measured in MVP) by age. (Technically, I divide MVP/age at the per-game level, and multiply by the minutes-weighted mean age in the league (nearly 27), and then multiply this by 82, to estimate the trend of that player’s value.) When applied to the 07-08 season, we find the following estimates of potential:


(I’ve also thrown in the top-500 best-potential seasons from my dataset, which only includes 1986-2008, and so misses out on some really excellent rookie seasons. Apparently LeBron has lots of potential.)
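For concreteness, here is the potential formula as I read the parenthetical above, sketched in Python with illustrative numbers (the MVP-per-game values are made up):

LEAGUE_MEAN_AGE = 27  # approximate minutes-weighted mean age in the league

def potential(mvp_per_game, age):
    # Per-game value divided by age, re-scaled to a league-average age,
    # then projected over an 82-game season.
    return (mvp_per_game / age) * LEAGUE_MEAN_AGE * 82

# Two hypothetical players with identical per-game value:
potential(0.15, 23)  # ~14.4 -- the younger player rates higher
potential(0.15, 30)  # ~11.1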

Now, incorporating all the offseason moves, and using a magical formula that lets me convert MVP to team wins (Pythagorean 5.25), here are my projections (based only on this estimate of potential) for team success (in wins) at some future time:


MEM 13.8
SAC 21.7
NJN 23.1
DEN 23.6
PHO 27.7
GSW 28.0
MIL 28.6
OKC 28.9
CHI 31.6
MIA 33.3
TOR 33.5
POR 35.1
SAS 35.9
NYK 38.0
LAC 38.2
ORL 38.4
DET 38.8
CHA 46.7
BOS 48.0
ATL 48.7
MIN 50.7
WAS 52.2
CLE 53.8
PHI 56.0
IND 56.6
UTA 57.1
NOH 58.5
DAL 58.5
HOU 61.8
LAL 63.1

Notice all the hedging I did–it’s unclear whether these estimates should apply to next season, or several seasons down the road. I doubt, for example, that Phoenix and San Antonio will fall so far in 08-09, but you could imagine that, playing with these same rosters four or five years from now, the then-senior citizens on those teams would not fare so well. Also note that this doesn’t include anyone with no NBA stats–meaning that I haven’t incorporated the doubtless boon brought by Oden, Rose, Beasley, et al. That said, I can see the Lakers, Hornets, Rockets, Jazz, and 76ers being very interesting in the near future, and so this may not be all crazy.

Conclusion

I’d be very interested to hear if you like these conceptualizations and measures of interestingness, and especially if you think the measure of Potential has any merit at all. How would you measure interesting, if you had to use statistics? Does your impression of teams on the rise and teams on the decline mesh with the team success projections listed above? Let me know in the comments.

¹ Keep in mind that this prediction methodology is extremely simplified. It doesn’t take home court advantage into account, nor any interaction effects between the two teams. Obviously, adding in both of these would make the model more accurate, but if I had the time and ability to predict outcomes perfectly, I wouldn’t be sharing that knowledge with you, I would be gambling. So, please accept this approximation for the abstraction that it is.

The Arbitrarian: Team Depth, and Correlates Thereof

David Sparks is the Arbitrarian. His stats column runs weekly here at HP. This week he discusses depth and its impact.

The survey responses to last week’s post were so interesting, I decided to do an immediate follow-up (if you haven’t read it, you may want to do so before continuing here). Last week, we focused on team rotation size, as measured by minutes played. Today, we will look at a very similar, but somewhat more interesting concept: team depth.

Depth and rotation are not necessarily the same. Since there must be five players on the court per team at all times, the theoretical minimum for rotation size is five, which you would see if a team played only five players, all game, every game. However, depth concerns not playing time, but production, and it is easy to imagine one of those five players contributing more than 20% of the team’s total production, while one or more of the others produces less than their share. (There is a metric, called the Valuable Contributions Ratio, which I use to measure players’ productive contributions relative to their floor time.)

If each player produced in proportion to their allocation of minutes, it would make no difference which players were on the floor, but obviously this is not the case. Rather, better players produce a greater proportion of their team’s production than their proportion of a team’s minutes played. This implies, of course, that a team’s rotation size will likely not be the same as its productive depth, and further, that depth will likely be smaller than rotation.

In fact, depth can be calculated in exactly the same way as rotation (see last week’s column), except that instead of using minutes as the variable of interest, we use Model-Estimated Value (MEV), a productivity metric.
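To make the distinction concrete, here is a sketch (with made-up numbers) of rotation and depth computed for the same hypothetical game, using the inverse-Herfindahl machinery described in last week’s column:

def effective_n(values):
    # Inverse Herfindahl Index: the "effective number" of contributors.
    total = sum(values)
    return 1.0 / sum((v / total) ** 2 for v in values)

# Five players play the full game, but production (MEV) is unequal:
minutes = [48, 48, 48, 48, 48]
mev     = [35, 20, 15, 10, 5]
effective_n(minutes)  # rotation = 5.0
effective_n(mev)      # depth   ~= 3.7 -- smaller than the rotation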

So many theories

Last week, I invited readers to speculate about the relationship between rotation size and team success. You submitted countless interesting ideas in response to this question, and made many other suggestions about ways to assess rotation consistency, variations in rotation size by coach, and differences between regular-season and playoff play, among others. I hope, in time, to investigate some of these great ideas.

For now, let us turn to the relationship between rotation size and success. In response to my question, the plurality of respondents said that wins and rotation size would positively correlate, many noting that deeper rotations would probably enhance a team’s chances in the playoffs.

Others suggested that the relationship would be negative, due to the fact that poorer teams needed to give more playing time to younger, weaker players, to aid in their development.

A large minority of answers indicated that there should be no consistent relationship. Several of these claimed that rotation size was too idiosyncratic: a function of the coach, playing style, and available personnel, and successful teams could make any sort of rotation a winner.

Several others predicted a parabolic relationship, in which the smallest rotations would find success on the back of a few stars, the largest rotations would succeed through roster flexibility, and those in the middle, by failing to follow either strategy, would not do well.

I must admit that I was intrigued by all of these arguments, especially the parabolic prediction. My personal hypothesis was that increased rotation size would lead to greater success, due to the positive effects of diversification, as in the stock market. With more diverse contributions, I thought, would come greater insurance that even if one player failed to show up, one or more of his teammates would pick up the slack and ensure victory.

There were a number of other interesting hypotheses: one was that since defense requires a greater exertion of energy and offense requires time to find a rhythm, defense would correlate positively, and offense negatively, with rotation size. Others noted that faster-paced teams may require longer rotations, due to greater energy expended per minute. Several others suggested that the age of the team would vary positively with rotation size, as younger players can typically play a greater number of minutes without hurting productivity.

The empirical evidence

Who was most correct? Well, first I should mention that part of the problem with my question last week was that rotation size was often conflated with depth, which I define as separate concepts. That said, after reviewing the graphical relationships, I must sadly rule out the parabolic hypothesis. The rest of the relationships (between all suggested variables), are depicted in the correlation matrix below:


         rotation   depth  gameage    poss  offeff  defeff  effdif
rotation    1.000   0.412    0.016  -0.069  -0.057  -0.083   0.020
depth       0.412   1.000   -0.007   0.079   0.375  -0.041   0.321
gameage     0.016  -0.007    1.000  -0.085   0.069  -0.143   0.164
poss       -0.069   0.079   -0.085   1.000   0.016   0.016   0.000
offeff     -0.057   0.375    0.069   0.016   1.000   0.160   0.648
defeff     -0.083  -0.041   -0.143   0.016   0.160   1.000  -0.648
effdif      0.020   0.321    0.164   0.000   0.648  -0.648   1.000


Rotation and depth are measured as described previously. Game age is the playing-time-weighted age of the team. Possessions are a measure of pace. Offensive efficiency is a measure of a team’s scoring per possession, while defensive efficiency measures the same thing for their opponents (so better defensive teams have a lower defensive efficiency as constructed here). Efficiency difference is a measure of absolute quality, subtracting defensive from offensive efficiency.

Many of these results (the ones close to zero) indicate no relationship: Rotation size seems to be unrelated to anything but depth. However, depth appears to be positively correlated with offensive efficiency, and thereby, also positively correlated with efficiency differential–apparently teams with greater depth (at the per-game level) see improved efficiency differentials. One problem is that we cannot tell which direction causality moves in. Do deeper teams play better, or do teams who are winning by a lot give bench players increased minutes and thus increased time to produce?

To some extent, the likelihood of the second option can be tempered by the fact that rotation size has no real relationship with efficiency differential, but this question is still not definitively settled.

Expanding our scope

How have rotation sizes and depth changed over time? Which teams, historically, are the deepest? Due to data limitations, to investigate these questions, I must change the way I measure rotations and depth. Instead of assessing these at the per-game level, to make historical comparisons, I will measure at the season level, meaning that from this point on, rotation is best understood as the inverse of the concentration of minutes played over the course of the season, and depth is best understood as the inverse of the concentration of production over the course of the season. In general, these figures will be higher than each team’s mean per-game figures, due to changes in the roster and substitution patterns over the course of a season. However, error ought to be normally distributed, and so I will press forward using these slightly modified metrics, which are interesting enough in their own right.

As you can see in the plot above, both rotations and depth have increased over time. Rotation is denoted in red, and depth in cyan, and both are greater now than they were in the early years of the NBA. There could be any number of reasons for this–expansion, and the dilution of the talent pool, could be responsible; or merely a realization that heavy minute loads may shorten players’ careers. Incidentally, I have scaled the size of each team-year marker to its winning percentage, but the relationship between depth, rotation, and winning is unclear in this depiction.

Below, I plot team winning percentage (jittered) against team depth. The color scale indicates rotation size, going from small (red) to large (blue), so that if you see a blue team amongst several red ones, you know that that team has a relatively large rotation given its depth. I’ve also scaled markers by year, so that more recent teams stand out more.


Fullscreen Version

The first thing I notice is the outliers. The most concentrated teams appear to be several Chamberlain squads, in which he was an absolutely dominant producer, and carried his team more than any other player ever has on a consistent basis.

The least concentrated teams are several more recent, and fairly bad teams, topped by the 2002 Chicago Bulls, who were very deep with potential that had yet to develop into actuality.

As noted above, depth has increased over time, and so it is interesting to note the most concentrated teams in a more modern era (which I mark with the inception of the three-pointer, 1979-present). There are two very shallow Utah teams, led by Malone and Stockton, and supported by almost no one else. The pre-Pippen Bulls show up here, as do the Kobe-only Lakers–teams with one star who did a substantial amount of the producing. We also see the ’87 Celtics, ’04 Timberwolves, and ’08 Hornets, each of which had a couple of extremely good players dominating the contributions to winning, and then filled the rest of the roster out with players who couldn’t hope to match the same level of productivity.

Among the very best teams, there is a decent variety of concentration, although it is interesting to see the ’08 Celtics at the high end of depth among this elite. Their big three may have gotten the headlines, but it was the entire roster that made important contributions. Further down and to the right, we see the ’08 Rockets, who put on the least likely 22-game winning streak in history on the backs of role players, a different one of whom stepped up every night. This team was very successful, given its depth, and it will be interesting to see how this translates to future success.

What does it mean?

The overall trend is a slight but definite negative relationship between team depth and success, but it is unclear what conclusions can be drawn from this. Is this proof that a superstar (or a Big Two, or a Big Three) is key? Does it reflect the fact that it’s easier to field a team of equally poor players than a team of equally excellent players?

Since this graphic is based on season-level data, it may just mean that teams with less volatility in their rotation and minimal personnel turnover are more successful. However, I must admit to being unsure of what to make of these preliminary findings. Should teams dump their midlevel players (in salary and productivity terms), in pursuit of a bimodal roster of two stars and ten inexpensive warm bodies? Obviously, constructing a roster requires more than just collecting players at varying levels of talent–the interaction of their abilities is a key consideration–a team is more than the sum of its parts. I would love to hear your insight, explanations, and questions in the comments. Also, I would appreciate your taking the time to fill out the short survey below.

The Arbitrarian: Empirical Estimates of Team Rotation Size

David Sparks is the Arbitrarian. His statistics column appears every Thursday here at Hardwood Paroxysm. This week he turns his attention towards the metric of minutes and rotations. Enjoy.

What does it mean to describe a basketball team’s rotation? Most commonly, teams are said to have somewhere between an eight and ten-man rotation, implying that N number of players (8, 9, 10, etc.) see significant playing time in a typical game. But what, more specifically, does this mean? What is “significant playing time”? What is a typical game? How can we know, from observation, how many players compose a team’s rotation? Do all teams use the same number of players in their rotations? Does the size of a team’s rotation vary much over the course of the season?

One difficulty with identifying team rotations is that it isn’t as simple as counting the number of players who appear in any given game. There is a subtle difference between identifying the number of players who might be used in a game versus the number of players in a rotation, and that difference mainly has to do with playing time.

As such, it is common to see a threshold of playing minutes employed to identify where the rotation ends. Perhaps the rotation is all players who see more than 10 minutes of playing time in a game… but perhaps the number should be eight minutes. Regardless of the cutoff employed, this method will give an authoritative-sounding answer, but using a minutes cutoff only means that rotation size is a function of the threshold chosen, which is a telltale sign of arbitrariness.

For example, imagine two teams: a team in which five starters play 40 minutes each, and then four additional players play 10 minutes each. Using a 10-minute rotation threshold, we would say that this team has a 9-player rotation. The second team has five starters play 41 minutes each, and four more players play 8.75 minutes each. Again employing the 10-minute cutoff, we note that this team has a 5-player rotation. Certainly there is a difference between the rotations of these two teams–the second team’s minutes are slightly more highly concentrated among the starters–but is the difference equal to a 4-man difference in rotation size? I would submit that it is not. This is a somewhat contrived and extreme example, but I hope it highlights the way in which such arbitrary definitions can be misleading.

Estimating the concentration of playing time

Fortunately, there exists a formalized measure of concentration which we can apply to basketball box score data. The Herfindahl Index is typically applied to the size of firms within an industry, but we can apply it to playing time of players within a game.

Essentially, we eschew choosing an arbitrary minutes threshold in favor of measuring playing time concentration. This avoids the robustness problems of a threshold and gives a continuous (non-exclusively integer) measure of each team’s rotation in a given game, arguably increasing both accuracy and precision.

Below I apply the Herfindahl measure of concentration to three very different games:


In the first game, Indiana played only six players in total. Of those six players, five played more than 40 minutes, and the sixth played 15. The Herfindahl Index of concentration for this game is a very high 0.18.

In the second game, 12 players saw 12 or more minutes, with the most playing time being 23 minutes, which is less than half of a game. As you might expect, the Herfindahl Index is much lower here (the index increases with concentration), just 0.09.

The third game sees Minnesota employ 14 players in their quest to defeat Houston. Based on this alone, you might expect even less concentration than in the Sacramento game. However, the distribution of minutes here is much less uniform than above. Seven players saw over 20 minutes of action, the other seven saw less than ten minutes, and four of these had almost negligible floor time. As a result, the Herfindahl Index is somewhere between the two games above, at 0.12.

Translating concentrations into rotations

Numbers like 0.18, 0.09, and 0.12 are useful in that they make comparisons easy and consistent, but they bear little resemblance to anything like the rotation numbers we might hope to see, which we’re expecting to be in the high single digits, or low double digits.

The conversion from index to rotation, fortunately, is a simple one. Merely take the inverse of the Herfindahl Index, as calculated above, to find the Rotation Index (RX). In our examples above, Herfindahl Indices of 0.18, 0.09, and 0.12 translate to rotations of size 5.55, 11.72, and 8.58, respectively (computed from the unrounded indices). I think that you will agree that these numbers look very much like what you would subjectively conclude after viewing the above minute distributions.

Indiana’s six players, with five dominating, ought to come out somewhere between five and six. Sacramento’s nearly even distribution among 12 players should be very close to 12 (although not exactly, because playing times were not identical across the board). Seven Minnesota players saw substantial playing time, and the remainder, collectively (with 32 minutes between them), played enough to count for about 1.5 players if considered as a unit.
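Here is the whole computation in miniature, using illustrative minutes in the spirit of the Indiana game (not the actual box score):

minutes = [45, 45, 45, 45, 45, 15]   # five heavy-minute starters plus one reserve
shares = [m / sum(minutes) for m in minutes]
h = sum(s ** 2 for s in shares)      # Herfindahl Index ~= 0.18
rx = 1 / h                           # Rotation Index  ~= 5.57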

Typical team rotations

The above are some extreme examples of rotation size, measured at the game level. However, while “a team’s rotation” is best assessed at the game level, it is better represented by a summary over the course of the season than by a single observation. Thus, calculating Rotation Indices for each team, each game, we can come up with an average rotation size over the course of a season, as a better indicator of typical rotation size for a given team.

Below is a listing of typical rotation sizes for each of the 614 team-seasons in my dataset. I include the average number of players per game who saw any playing time for comparison, as well as the standard deviation of rotation size, to give an idea of how variable each team’s rotation was. I have sorted the entries by rotation size, smallest to largest, and highlighted teams from the 2007-08 season, to make them easier to locate.


The first thing to note is that the 1986-87 Celtics top the list, followed by that season’s Philadelphia team. Perhaps this is not surprising: the three top Celtics by minutes played that season were Hall-of-Famers, and that team featured Bill Walton, another HoF-er (though he did not play many minutes). Philadelphia featured two Hall-of-Fame players in Barkley and Erving. Small wonder that these teams gave many minutes to players with so much talent. The 2005 Phoenix team was the most successful of the D’Antoni era, and notice also that all five of the most recent Phoenix squads appear in the smallest 42 of this list (four in the top 21–age and O’Neal must have taken their toll on D’Antoni’s famously short rotation in 2008).

At the other end of the spectrum, we see five recent San Antonio teams among the 64 biggest rotations–perhaps this is a small part of the reason that PHO v SAS is always so compelling: along with their very different playing styles and pace, San Antonio uses one of the league’s largest rotations, while Phoenix goes with a very small one. There are also nine Utah teams among the 42 biggest rotations, which would seem to indicate a deliberate pattern.

Conclusion

It would appear possible to develop a non-arbitrary and unbiased estimate of team rotation sizes using Herfindahl Indices of concentration of playing time. The season-average results appear to correspond well to common subjective assessments of rotation sizes, ranging from as low as just over seven to as high as just under ten. The league-typical rotation size, by this measure, also aligns with our more qualitative expectations: mean Rotation Index from 1986-2008 is 8.174, and the median is 8.038.

Based solely on these results, it is difficult to discern whether smaller or larger rotations correlate with success. Many good teams appear to have small rotations, but many other good teams appear to have large rotations. In the questionnaire below, I ask your predictions as to the relationship between rotation size and winning, and your reasoning. I hope you will respond.

How well do you feel this metric accurately captures rotation size? Do the figures assigned to each team mesh with your own impressions? Please take a moment to answer the survey, and feel free to leave any questions or observations you might have as comments on this post.


The Arbitrarian: Envisioning the Olympics

David Sparks is The Arbitrarian. He profiles his statistical work every Thursday here at Hardwood Paroxysm. David is glad to be back at school, especially with his new Trapper Keeper and abacus. This week he takes a look back at Olympic basketball and the ramifications the numbers supply within.

Admittedly, it’s a little late for Olympic basketball coverage, given that the competition ended sometime around 4:00 am EST on Sunday morning, but Thursday is Arbitrarian day, and so today I’m going to try to tell the story of the US Men’s Olympic basketball team retrospectively, in statistics and graphics.

Predecessors

This most recent iteration of the US men’s basketball team was slated to “redeem” the American program in international competition. After several successive failures to dominate their competition, much was made about the degree to which the rest of the world had caught up to the level of American basketball and/or how the American players, because of [insert arbitrary reason here], would no longer be able to dominate in international competition. Several of the more recent US squads were derided as selfish, fundamentally unsound, and failing to take international competition seriously–the narrative was one of how hubris could lead even the mightiest to fall.

It has been said that during those dark years, the US was “just fielding all-star teams,” and that part of Jerry Colangelo’s plan for a return to dominance was to field carefully constructed teams, with role players and specialists–not just 12 guys who could score. To what extent is this true? How much credit does Colangelo’s craftsmanship deserve? As we like to do here, let’s take this subjective claim, and apply a little bit of rigor to see if it holds up without the patriotic feelings and stirring redemption narrative clouding our judgment. For answers, let us look to an application of the SPI style trichotomy:


(Note: If you turn captions on (second button from left on bottom), each diagram is labeled with its year. Also, hit pause and use the arrows to review each image at your own pace.)

Above is a series of graphics depicting the SPI styles (based on their NBA statistics) of each team fielded by the US in major international competition, from the Dream Team in 1992, to this year’s “Redeem” Team, with the exception of the 1998 World Championship team, which was largely composed of non-NBA players.

What differences can we identify in each team’s composition? Did Colangelo really put together a thoughtfully composed team? It appears to me that this was at least some part of the difference between this year’s team and those recent teams that ended in failure and disappointment. The main thing I notice, in comparing the 2002, ’04, and ’06 teams (although especially the first two) to each of the others, is a relative dearth in the pure perimeter region.

Each of these teams has an eclectic smattering of interior types–some years they appear more offensively-minded than others, and the 2008 Olympic team, interestingly, has only three players classified as such in the SPI scheme. But look first at the 1992 team, which is stacked to the gills with players in the 10 o’clock to 12 o’clock range, meaning that their statistics indicate a focus on perimeter play, or an absence of focus on scoring, relative to the league. Such is the case, to a slightly lesser extent, with each of the other teams up through 2000.

In 2002, the perimeter appears to have become less of a priority, stocked with Andre Miller, Davis, and young Jay Williams–good players, but not the “pure point” types which manned some of the other teams. Further, that team was full of Perimeter Scorer types, three of whom (Reggie Miller, Finley, and Allen) are known more for their shooting than their all-around game.

2004 may have been an even more poorly-constructed team, with essentially no Pure Perimeter players. James and Wade are capable of facilitating, but this is not typically their primary role, and James played relatively few minutes anyway. Instead, that role was left mainly to Marbury and Iverson, who are known to look for their own shot as often as they pass–and this subjective reputation is backed up by the SPI analysis.

The 2006 team was much better–it is obvious that effort was made to compose a team of players of many different types–this is the only year in which there is at least one player from each sextant of the SPI plot. This is not necessarily a good thing for winning, but it indicates that thought was put into how each player would fit together into a whole. Further, two actual perimeter players were included, Paul and Hinrich, and this team performed substantially better than their Marbury- and Iverson-led predecessors.

This year’s team sees a return to past glory, likely due in no small part to a fully-stocked trio of Pure Perimeter players, able to push the ball up court and facilitate any of the able scorers on the team. Interior play was de-emphasized, as the team’s focus would be on a disruptive defensive style aimed at generating turnovers and leading to fast breaks–for this, speed, not size, was key.

In sum, it appears as though part of the credit for the USA’s Olympic success really might belong to Mr. Colangelo. Though it is the players on the floor who do the actual winning and losing, a large part of the results likely stemmed from what happened way before the opening tip.

Now that we have covered the pre-Olympic preparation phase, let us turn our attention to what actually happened in Beijing.

Assessing productivity in these Games

Due to limitations on the ease with which game-by-game data can be collected for the Olympic tournament, I will be discussing productivity (as measured by MEV) rather than value (as measured by MVP)–but here, the story is pretty clear.¹ Below is a list of each athlete, with their SPI factors, points- and MEV-per-game numbers, and Valuable Contributions Ratio. I’ve also included what I call Points Per Points Possible (p4), which divides points scored by the number of points possible on each of their shot attempts (2 for all field goal attempts, plus an extra one on three-point attempts, plus one for each free throw attempt).
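For clarity, here is p4 as a quick Python sketch (the stat line in the example is hypothetical):

def p4(pts, fga, tpa, fta):
    # Points scored divided by the points possible on the player's own attempts:
    # 2 per field goal attempt, +1 per three-point attempt, +1 per free throw attempt.
    return pts / (2 * fga + tpa + fta)

p4(pts=24, fga=15, tpa=4, fta=6)  # 24 of a possible 40 points -> 0.60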


Many of the most productive individuals play professionally in the NBA. These numbers indicate that LeBron James was the most valuable to Team USA, but note that Wade was almost as productive in substantially fewer minutes (his VCR is the highest on the US team). As such, I have to name James the MVP (for the team and the whole tournament), but Wade is the US’s Most Efficient Player, which is exactly what the team needed from its first man off the bench.

How did contributions break down for each team? Below is a series of charts that plot the sources of production for each team, based on tournament-cumulative MEV. Each player is colored according to their SPI type, and players with negative MEV are zeroed out (because it’s hard to depict the area of a negative number):


Click here if you want a whole window full of these pie charts.

Among the best teams in the competition, Argentina was more highly dependent on their top-tier players than were Spain and the US. The two teams most reliant on a single player were China, anchored by Yao Ming, and Iran, led by Haddadi. Croatia appears to have had the most balanced contributions, although this is often a trait of weaker teams, because it is easier to field a team of equally poor players than one of equally excellent players.

What did each player produce individually? The table above gives the summary report of the points-value of each player’s production, in the form of MEV. Below, however, I have the complete breakdown of each player’s counting statistics for the Olympic tournament, as a percentage of the simple sum of these stats for that player. I have tried to arrange the graphs such that adjacent areas make for easy comparison of paired statistics–missed field goals is next to points, assists next to turnovers, offensive and defensive rebounds together, followed by the defensive statistics, etc. Players are sorted by MEV/gp. Coloration is of course derived from SPI type based on Olympic statistics.


Click here if you want a whole window full of these little pie charts.

Seeing these pie charts all together as small multiples allows us to easily compare two or more players. Note, for example, that Dwight Howard and Chris Bosh were almost perfect substitutes for one another: they have almost identical per-game MEVs, and their stat distributions look very similar–the only exception is that Bosh seems to have grabbed relatively more defensive rebounds and turned the ball over slightly more, while Howard did a lot more fouling.

Carlos Delfino’s SPI color identifies him as a very tournament-representative player; that is, his relative distribution of scoring, perimeter, and interior statistics reflect that of all players collectively. The gray color indicates this league-relative neutrality, and he serves as a useful benchmark against which to compare others.

As is evidenced by his orange color and large segment devoted to pts and fgx, a large portion of Bryant’s statistical contributions came from scoring. However, these statistics likely do not give the full picture for Bryant, as his role for most of the duration of the tournament was to shut down the opposition’s best players, not unlike a “Doberman.”

Jason Kidd (very pale blue, about halfway down) is one of few players for whom pts is not the largest segment. Rather, his defensive rebounds and assists took priority, although so too, unfortunately, did his turnovers and personal fouls.

Michael Redd (rusty color, much closer to the bottom of the list) offers an interesting example of the usefulness of such a visualization. The first thing one notices is that his pts sector is matched in size by his fgx sector–he missed almost as many baskets as he scored points. Tip for the uninitiated: this is not a productive way to play basketball.

Another way to look at the data is through parallel coordinate plots, which are useful for depicting the rank of an individual across multiple categories. Below, I present PC plots for each member of team USA, where the vertical axis indicates that individual’s rank in each of 9 metrics, relative to the entire pool of Olympic players. On each plot, for ease of comparison, I draw gray lines for the remainder of the US team, but highlight each player individually in their SPI color.


Click here if you want a whole window full of these parallel coordinate plots.

p4 is Points Per Points Possible, described above; AS:TO is the assist-to-turnover ratio; TR/min is total rebounds per minute; DEF:PF is (BK+ST)/PF, which is just an amateurish way of measuring defensive skill.

Looking at these plots, we can see that Wade performed very well. He is in the top four on the US team in each stat, and it is apparent that he is in the top half across the board among all Olympians. Redd, although he was called upon to provide a shooting spark off the bench, was mostly a dud, with a p4 among the lowest in the competition. Bryant was second lowest on the team, but his shooting efficiency looks to have been better than about a third of the Olympic players, and thus much better than Redd’s. Note that due to a small sample size, some of these ranks will appear odd, namely Redd’s high ranking on the DEF:PF statistic and Kidd’s high p4 rating. Neither of these high rankings is what we would expect from these players, but Redd played relatively few minutes, and Kidd only took shots he couldn’t refuse to take, resulting in good ratings in these areas over a small number of observations.

I would be very interested to hear any more insights you glean from the above displays–feel free to copy any of the charts for your own use, just also please provide a link back to HP.

Olympic style

We’ve seen the NBA styles of the players that make up Team USA, we’ve seen their SPI factors and even their specific statistical breakdown. Now, we turn to a full SPI Spectrum graphic depicting each Olympic competitor, and their type, based solely on their production in the Olympics. Player names are scaled according to their MEV totals, so that the most productive players are the easiest to spot.


Fullscreen Version

Several things stand out to me. First, I am impressed by the degree to which this Olympics-based diagram matches up with the NBA-based diagram, for players who appear in both. Redd, Bryant, Williams, Kirilenko, Howard, Yao and Boozer all played similar styles in these Olympic games as they did in the 07-08 NBA.

Even more enlightening are the differences: Luis Scola played much more of a scoring role for Argentina than he does for the Rockets (understandably so). Dwyane Wade and Chris Paul shifted their focus away from scoring, relative to their NBA style, likely because they were not required on this team to carry their team’s point production. Anthony’s purported focus on rebounding is reflected in his shift from a somewhat perimeter-biased Scorer to an Interior Scoring type. Jason Kidd became an even more extreme Scorer’s Opposite, eschewing shooting opportunities whenever possible.

The most significant shift, however, might be seen in the play of LeBron James. Last season in the NBA, James lined up at about 12 o’clock on the diagram; the style with which he most closely aligned was Perimeter Scorer. In these Olympics, however, James’ style reflects his commitment to doing whatever was needed by the team. His minty-green color and placement at a little before 11 o’clock reflect his Pure Perimeter style, though his relative proximity to the center of the diagram indicates that his fit here is not perfect. Rather than being the primary scorer for this team, as he is accustomed to being in Cleveland, James stepped up the defensive intensity, leading his team in blocks (with eight), and finishing second in the tournament in steals (with 19!), not to mention leading the tournament, by a landslide, in menacing scowls. Further, he was second on the US team in assists (30; Paul had 33), his assist-to-turnover ratio was a respectable 1.76, and he finished in the tournament top ten in total rebounds. To put it in perspective, the role James filled for this US team was similar to that played by Magic Johnson on the showtime Lakers, which is quite a niche, indeed.

Conclusion

In sum, we can see that at least some of the hype is true. There has been some well-placed cynicism regarding the extent to which the “Redeem Team,” and our collective impression thereof, is a product of marketing. I have no doubt that at least some of what we believe about this team and its players is fabricated for the purpose of generating a positive image, and greater sales. However, at least two claims made about this team can be empirically verified, and I have tried to do that here.

The first claim is that this team is different from the failures which came before. Using NBA statistics and the SPI Typology, I am inclined to believe that in construction, this team is different from its three previous iterations, and more similar in design to the Dream Teams of the 1990s.

The second claim is that the players on this team changed their styles to accommodate each other, to better fit together as a team. Comparing SPI positions in the Olympics to SPI positions in the NBA, we can see which players had similar statistical distributions, and which modified their style. Each player on the US team was either accustomed to or able to lead their NBA teams in scoring on any given night, and in Olympic competition, this ability to rely on others to score allowed (at least theoretically) unselfish play. The question was always whether or not this team of able shooters would be able to “put aside their egos” and fill a specific role for this team, which may or may not include a substantial amount of offensive production. By and large, it appears as though the players asked to do so have responded positively. Though several US team members played with styles similar to their NBA styles, this reflected the reported desire of the coaching staff and management of the team (i.e. Michael Redd is supposed to be a shooter). Other players saw drastic shifts in their style of play, especially movement away from a focus on scoring, as a universally capable offense permitted each individual to do less of the shooting than may be required on their NBA squads. Based on this graphical evidence, I am willing to advance a tentative rejection of the null hypothesis that the players did not fill the roles they were asked to. Rather, it appears as though they played as a cohesive unit, maximizing their strengths and possibly sacrificing for the team.

I hope this late coverage was worth waiting for. I would be very interested in hearing your reactions to any of the ideas I’ve put forward, and I would especially like to know if you see any interesting relationships jump out in any of the SPI diagrams. I haven’t even begun here to discuss the interesting similarities between several of the international players and those from our own NBA in these Olympics; I suppose I will leave that to you. As usual, I’d love to hear from you in the comments, and in the survey, and please Buzz this up!

¹ If you are particularly interested in game-by-game contributions and value, I did track a modified version of MVP for team USA throughout the Olympics and pre-Games warmups. You can see the per-game and cumulative results here.

The Arbitrarian: Assigning Credit for Game Outcomes

David Sparks is the Arbitrarian. His column appears every Thursday here at Hardwood Paroxysm. This week’s stattastic column regards an elaboration on the Box Scores measure he discussed previously. Your feedback is welcome in the comments, puny human… I mean… dear friends.

Two weeks ago, we explored a statistical estimator of value, BoxScores, which estimates player contributions to team success at the season level. Aside from the time-honored complaint that it doesn’t account for defense, there are at least two other changes I might wish to make to improve the accuracy of this value estimator.

First is the problem of trades, and more generally, varying team success across the duration of the season. As it stands now, if player A is traded from team X to team Y in the middle of the season, his BoxScores are calculated by finding his PVC of team X’s entire season’s worth of MEV and multiplying that by X’s entire season’s worth of wins; then adding to that the same calculation for team Y to find the player’s season-cumulative BoxScores figure. This is good enough, for an estimate.

However, imagine if both teams are made significantly better by player A. Team X might be on pace for a very successful season up until the trade, and might begin to tank once he leaves. Team Y may have had an inauspicious start, but with the addition of player A, they might turn the season around. If this is the case, player A might be responsible for more success than his BoxScores indicate. Alternatively, similar situations can be envisioned in which much-injured players’ contributions are over- or under-estimated, since BoxScores (using season-level counting statistics) cannot account for game-level success and variations thereof.

Another problem is with comparability, especially comparisons of good players on bad teams to good players on good teams. According to BoxScores, Al Jefferson was less valuable in 2007-08 than was Andris Biedrins. This could be true, but it could be that while Al Jefferson did more every game to help his team win, he could not, (essentially) alone, carry his team enough to get very many wins. The point was made by a commenter on a previous post that a team of Michael Jordan and eleven pre-schoolers would never win an NBA game, though Jordan could be incredibly productive. BoxScores, multiplying productivity by success, would assign Jordan and his eleven weaker teammates the same value: 0. This is certainly an extreme example, but it highlights a possible shortcoming in the BoxScores methodology–wins are discrete, binary events. Either a team wins a game, or it does not. Regardless of whether the score was 101-100, or 130-70, a win counts the same.

The solution, in the form of a more specific metric

The appeal of BoxScores has been (among other things) that it can be applied to every professional basketball player, because season-level box score stats are very widely available. The downside to a more specific, game-level estimator is that the increased accuracy comes at the cost of universality: Game-by-game box score statistics are only available going back to the 1986-87 season. Nevertheless, here I will develop a value estimator that works at the game level, to give us an even more accurate picture of just how much each player contributes.

For each game, we first must calculate each player’s MEV. (See this post for a very detailed description of how this is done.) Then we calculate each player’s Marginal Victories Produced (MVP):

MVP = Player MEV / total MEV sum for both teams

As you can see, in each game, there is a total of 1.00 MVP to be allocated. Each individual’s contribution to the total production in the game is considered their Marginal Victory Production. This way, players on losing teams can be seen as producing valuable contributions–they might be valuable enough to get their team right to the cusp of victory–and this value shows up in MVP (but not in BoxScores).

Here is an example of MVP calculated for a game on April 11, 2008, between the LA Lakers and New Orleans Hornets:

The Lakers won, 107-104. Total MEV for the Lakers was 110.7, and for the Hornets, it was 106.0, so the Lakers’ total MVP allocation was 0.511, versus the Hornets’ 0.489. If we were focusing on wins and losses alone, the Lakers would get 100% of the credit for this game. Arguably, though, the Hornets produced something of value here–they got within four points of winning, and thus MVP is a much more accurate estimator of value.
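To make the arithmetic concrete, here is a minimal sketch of the game-level MVP allocation in Python, using the team MEV totals from the game above. Bryant’s individual MEV is back-calculated from his 0.165 MVP, so treat it as illustrative rather than as his actual figure:

def marginal_victories(player_mev, total_game_mev):
    # Share of the game's total production credited to one player
    return player_mev / total_game_mev

lakers_mev, hornets_mev = 110.7, 106.0
total_mev = lakers_mev + hornets_mev               # 216.7 MEV in this game

print(round(marginal_victories(lakers_mev, total_mev), 3))   # 0.511
print(round(marginal_victories(hornets_mev, total_mev), 3))  # 0.489

bryant_mev = 0.165 * total_mev                     # roughly 35.8 MEV
print(round(marginal_victories(bryant_mev, total_mev), 3))   # 0.165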

One interesting way to think of MVP numbers is to note that a team needs a total of at least 0.5 MVP to win a game.¹ Thus, in the game detailed above, Bryant got his team almost a third of the way to the win (0.165 MVP), while Paul, Chandler, and Stojakovic together got their team two-thirds of the way to a win (0.335 MVP), and so on.

MVP value at the season level

To estimate a player’s value for the duration of a season or career, we need only sum his game-level MVP. One nice property of MVP is that the sum total of MVP is equal to the total number of games played–the “value of each game” is divided among the participants, so that all games are accounted for in their entirety. Further, team season-total MVP can be translated to wins and losses by a method similar to the Pythagorean win projection (more on this sometime in the future). How many marginal victories did your favorite players produce? See below…

The first tab (“07-08 MVP”) lists the total number of MVP for each player last season. I would argue that this is a valid way of identifying the league Most Valuable Player. Just as in the BoxScores rankings, Chris Paul comes out on top, followed by LeBron James and Kobe Bryant. However, the differences in the two estimators can be instructive. According to BoxScores, Al Jefferson is the 74th most valuable player–by MVP, Jefferson is 11th, just behind the player for whom he was traded, Kevin Garnett. Dwyane Wade moves from 188th most valuable (BXS) to 67th (MVP) for his injury-shortened season. Good players on bad teams are not “punished” for having low-quality teammates. Rather, everyone is rewarded based on their contributions to competitiveness, even if that competitiveness doesn’t result in winning every time.

The “86-08 MVP Seasons” tab lists just that–the most valuable seasons from my limited dataset according to MVP. Unsurprisingly, Jordan dominates this list, along with other modern luminaries. Keep in mind that the MVP number is not a number of wins–it’s “Marginal Victories”–but also keep in mind that teams need only 0.5 total MVP in a game to win it. One way, thus, to look at season-total MVP numbers is to say that, for example, Jordan in 87-88 contributed enough MVP to help his team win the equivalent of about 27 (13.52 / 0.5) games. Bear in mind, though, that this is just an interesting shorthand, because summing this figure for each team will not come close to matching that team’s win total. If Jordan had accumulated 0.5 MVP in each of 27 games, and sat out the rest of the season, his team would have won each of those games, and he’d be credited with 13.5 MVP. However, Jordan played for a whole season, and accumulated MVP in pieces (never as many as 0.5 at a time–no player won any game “single-handedly”), so the 27-win estimate is interesting, but not literal.

The final tab, “86-08 MVP Careers,” lists the most valuable players during the period covered by the data set. Thus, many of Larry Bird’s and Magic Johnson’s best years are excluded, as are the first years of Jordan, Olajuwon, Stockton, etc. since they came prior to the 86-87 season. This is important to keep in mind when viewing game-average MVP numbers. Larry Bird falls relatively low on the list in no small part because we’re comparing his later years to the primes of LeBron James, Chris Paul, and Dwyane Wade.

Bearing this in mind, the list is still highly instructive. Since it takes a total of 0.5 MVP to win a game, players from Jordan down to Garnett are generating at least a quarter of the value their teams need to win (0.125 / 0.5 = 0.25). Players with MVP over 0.1 are doing more than a fifth of the work needed to get a win, and since no player plays over a fifth of his team’s minutes, these are obviously some of the most valuable players–overrepresented in value relative to playing time. The rankings on this list are unsurprising, and read like a roll-call of the best players of the last 20 years. These are the guys around whom you’d want to build a team.

Greatest single-game performances

What’s the point of having a game-by-game data set if you don’t look at game-by-game value? Below is a list of the 100 most valuable performances of the 07-08 season, and the 500 most valuable performances from 1986-2008. These are the herculean efforts from which legends are made. Here we can see how MVP automatically adjusts for pace, and assigns value above and beyond MEV’s measure of productivity. Since MVP is a percent of total production, it makes no difference how fast the game is played, how long the game is, or how much is produced in total: contributions to winning are measured against the other players in the game. Also, since a player’s MVP increases as his opponents’ MEV decreases, the more a player contributes defensively (i.e. outside of his own box score stats, but visible in the other team’s reduced production), the higher his MVP will be. The margin column, incidentally, indicates the final point spread in favor of a given player’s team. If it is negative, that player’s team lost the game.

Both lists are topped by players you might expect to be there, but interspersed are some surprises: John Salmons? Willie Burton? It goes to show that on any night, any player can be a hero, and that a single sample can be very misleading. Nevertheless, there is a lot of data to be gleaned here. Note that the best games played see players generating over a third of the total production, which gets their team 2/3 of the way to a win. Not even the greatest can win completely on their own.

I’d like to digress here briefly, on the subject of Kobe’s 81-point game. Note that he produced about 1/3 of the total valuable contributions in that game, but look at his MEV: 68.96. That means that by missing 18 field goals, and doing very little other than shooting, he cost his team about 12 points in the final margin. The Lakers still won by 18 points, but to me the 81-point achievement is somewhat underwhelming, because of what it took to get there. Edit: Apparently, you put in one little paragraph about Kobe Bryant, and it makes your whole post about Kobe Bryant… All I’m trying to say here is that Kobe, by missing 18 shots (and turning the ball over, while not doing a lot of rebounding or box score defending) cost his team a few points. Most players couldn’t dream of generating 69 points’ worth of contributions, and this is an impressive feat, but also, most other players don’t even take 18 shots (doing so would put them in the 94th percentile of all games in the data set). All I’m saying is that it might be somewhat less impressive than some of the others on the list, like, for example, Jordan’s incredible performance against Cleveland.

The future

In the future, I plan on developing an approximation of MVP based on season-level statistics, for those seasons in which game-by-game data is unavailable. Next week, I am planning on applying some of the methods discussed here to the performance of the US Men’s Olympic basketball team. Today, I have three requests for you: First, please leave insights or any questions you might have in this post’s comments. Second, please take a moment to fill out the survey below with your thoughts, ideas, and criticisms. Third, if you found this post interesting, click the little “Buzz up!” button below, to express your approval.

¹ Game-total MEV margins correlate with game-level point margins at 0.947, and looking at MEV winners correctly classifies actual point winners 92% of the time.

The Arbitrarian: A generalized continuous typology of playing styles

The Arbitrarian column is written weekly by David Sparks. You can read more of his work at his own blog. This week’s head asploding column is on player positions versus styles, and what they’re all about. Note to feed subscribers: The graphics in today’s post won’t come across in the feed, so you might want to click through to the original at HP.

What does it mean to be a Point Guard? Typically, point guards are expected to carry the ball up the court, set up the offense, make passes, and take few shots, at least relative to other players on the court. But how much can the term “point guard” actually mean if it applies to both Jason Kidd and Baron Davis? Further, what does it mean to be a “small forward” if Dominique Wilkins, LeBron James and Shane Battier all fall into that category? What do Vlade Divac and Amare Stoudemire have in common, aside from both being called “centers”?

The obvious point is that traditional position classifications, while they mean something, still convey relatively little information about a player’s function on the court. As observers of the game, we attempt to compensate for this by adding any number of modifiers to these position descriptions: combo guard, pure point guard, defensive center, swingman, etc. Each of these is used to more accurately specify a player’s style or role on the team, yet each is still somewhat definitionally ambiguous and subjective by design. One Tom Ziller has done some work in attempting to statistically classify guards on a continuum between “small two-guards” and “pure points,” but this is only a small first step in the right direction. I present here a generalized methodology for structuring a playing style spectrum, and identifying each player’s position within the continuum. By looking at actual statistics produced, we may eschew fuzzy descriptors of position and style in favor of a very specific, yet still highly flexible system of style identification–which provides us with an improved vocabulary with which to describe, among many other things, player types and team styles.

Very rudimentary factor and cluster analysis I performed a long time ago indicated that there are distinctions in the data between players who tend to try to score a lot, those who play a “smaller” game, and those who play like “big men.” In terms of the NBA’s tracked counting statistics, this translates to a differentiation between those who specialize in points and field goal attempts, rebounds and blocks, and steals and assists. I have chosen to call each of these three tendencies Scorer, Perimeter, and Interior, and collectively they form the SPI Style Trichotomy.

Calculation

To identify each player’s style is conceptually simple, but computationally somewhat more complex. Essentially, one sums each player’s fga + tr + bk + as + st, and determines what percentage of the total each SPI factor constitutes:

  • Scorer percentage = fga / (fga + tr + bk + as + st)
  • Perimeter percentage = (as + st) / (fga + tr + bk + as + st)
  • Interior percentage = (tr + bk) / (fga + tr + bk + as + st)

These numbers are interesting on their own, but for the calculation of an index of style, they require further manipulation. In the league as a whole, the Scorer percentage is around 50%, the Perimeter percentage around 20%, and the Interior percentage around 30%. Thus, if we used these raw percentages, the vast majority of players would appear to be very scoring-centered. My concern here, in constructing a useful index, is to identify player propensities relative to other players, and for that, I calculate the percentile of each player’s percentages.

  • Scorer index = percentile(Scorer percentage)
  • Perimeter index = percentile(Perimeter percentage)
  • Interior index = percentile(Interior percentage)

Thus, even though the maximum Scorer percentage in a season might be close to 75% while the maximum Perimeter percentage is closer to 25%, the players with the highest percentages in the sample under consideration will be assigned an index value of 1. Players with median values on a percentage will have an index value of 0.5, and so on. The percentilization normalizes across style tendencies and player subpopulations, and has the added virtue of scaling from 0 to 1.
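As a concrete illustration, here is a minimal sketch of the percentage and index calculations in Python. The stat lines are hypothetical, and the percentile convention used (fraction of the sample at or below a value) is one reasonable choice among several:

def spi_percentages(fga, tr, bk, ast, st):
    # 'ast' rather than 'as', since 'as' is a reserved word in Python
    total = fga + tr + bk + ast + st
    return (fga / total,           # Scorer percentage
            (ast + st) / total,    # Perimeter percentage
            (tr + bk) / total)     # Interior percentage

def percentile(value, sample):
    # Fraction of the sample at or below this value, scaled 0 to 1
    return sum(v <= value for v in sample) / len(sample)

players = {"A": (1200, 300, 40, 450, 110),   # hypothetical guard
           "B": (600, 900, 160, 140, 60),    # hypothetical big man
           "C": (900, 500, 80, 300, 90)}     # hypothetical wing
pcts = {name: spi_percentages(*line) for name, line in players.items()}

for dim, label in enumerate(("S", "P", "I")):
    sample = [p[dim] for p in pcts.values()]
    for name in players:
        print(name, label, round(percentile(pcts[name][dim], sample), 2))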

Interpretation

Thus we have a set of three numbers for each player which can be used to characterize his playing style. The numbers easily translate to more qualitative descriptions. A player with an SPI triple of (0.8, 0.2, 0.7) is an interior scorer, without much perimeter production. A player with the triple (0.1, 0.7, 0.75) is anything but a scorer–sometimes called a “glue” guy. Someone at (0.5, 0.5, 0.5) produces the league median of each type, which is different from a player whose percentages are 33%, 33% and 33%. Since the league-wide Scorer percentage is around 50%, such a player would have a relatively lower Scorer index, for example.

Since each individual is characterized by three variables, their SPI type can be plotted in three dimensions. Unfortunately, three dimensions are difficult to convey on a computer screen, so here is a plot which depicts Perimeter indices along the X-axis, Interior indices on the vertical axis, and Scoring indices as the size of the point.


Historical application note: Since steals and blocks have not been kept for the entirety of the history of professional basketball, players from earlier eras may have slightly skewed SPI values. While percentages and indices can still be calculated based only on fga, tr, and as, it is not difficult to see that leaving out blocks and steals, in comparison to eras in which those defensive statistics are included, will tend to skew players from an earlier era more toward the Scoring type. Unfortunately, without substantial era-specific correction, this effect is unavoidable. However, the sorting still manages to work well, especially if this detail is kept in mind when making certain cross-temporal comparisons.

Presentation

One of the advantages of using three sub-indices to construct the overall SPI Trichotomy is the convenient translation of index values to color. The three primary colors of light are Red, Green and Blue, and by combining them in varying proportions, it is possible to generate infinite gradations of color (see Wikipedia). This means that each SPI triplet for each player can be represented as a single color. This aids understanding and comparison, as it is much easier to keep in mind that a certain player is a deep red than that his SPI triplet is (0.9, 0.1, 0.2), or that a player is a medium grey than that his triplet is (0.45, 0.53, 0.55). Further, a greenish-blue player is easily identified with another greenish-blue player, without having to specifically compare each of the players’ three index values. The human eye is capable of extremely high-resolution discernment, and using a single color to represent three numerical values takes advantage of this.

Here is the above plot, with color added according to RGB values derived from each player’s SPI indices. As you can see, “blueness” increases from bottom to top, “greenness” from left to right, and “redness” varies with the size of the point. The top-right corner is aqua or cyan, while the bottom left is mostly reddish, due to an absence of green and blue.
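The mapping itself takes only a few lines of Python, with the Scorer, Perimeter, and Interior indices feeding the red, green, and blue channels respectively; the two example triples are the ones discussed above:

def spi_to_hex(s_idx, p_idx, i_idx):
    # Scale each 0-1 index to a 0-255 channel: S -> red, P -> green, I -> blue
    r, g, b = (int(round(255 * x)) for x in (s_idx, p_idx, i_idx))
    return "#{:02x}{:02x}{:02x}".format(r, g, b)

print(spi_to_hex(0.9, 0.1, 0.2))     # a deep red: the pure scorer
print(spi_to_hex(0.45, 0.53, 0.55))  # a medium grey: the balanced player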


Unfortunately, this presentational format leaves a lot to be desired. Since each player can be represented by just one color, can we do better than a pseudo-3-dimensional plot? The answer is yes and no: No, because to ensure that the hue, saturation, and value of each color are captured, we still require three variables (see Wikipedia); yes, because most of what we are interested in here is hue–the underlying color for each player, red, yellow, green, aquamarine, vivid tangerine, indigo, etc. The other two components of HSV color space, saturation and value, allow us to see how “pure” the hue is, which in our basketball application, translates to how “pure” an individual’s playing style is.

Playing style as a continuous spectrum

Using polar coordinates, we can plot each player’s position in a continuous spectrum of playing styles. Each individual may be represented as a vector, with Hue translating to direction/angle and Saturation+Value translating to magnitude/distance. The angle of the vector indicates the player’s style, and the magnitude of the vector indicates the “fit” of that player to that style–that is, since it is unlikely that any given player’s statistical profile will assign him perfectly to a given category, there is a level of fitness that captures the extent to which it does. Very rarely will a player record some assists and steals but no blocks, rebounds, or field goal attempts–the profile that would give him a P index of 1, but S and I indices of 0. Because of this, rarely will any player be a pure green, or a pure blue or red. The degree to which a player is a mixture of styles/colors is captured by his fit.

We can describe a player’s style by their SPI indices, or by their color, but we can also describe them according to their angle, which is most easily communicated by referring to positions on a clock. In the graphic below, the top of the circle can be thought of as 12 o’clock, the far right translates to 3:00, the bottom is 6 o’clock, etc. This is yet another way to describe style more easily than by referring to the player’s SPI triple, but more accurately and consistently than by describing color. Finally, I have assigned arbitrary descriptive names to each of six major “spokes” on the diagram, which should help the uninitiated translate commonly-used adjectives into positions on the clock. Here is a listing of SPI indices, fit, clock positions, and shorthand labels for each player in the 07-08 season, as well as 500 all-time greats.
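Here is a rough sketch of the hue-and-clock conversion in Python, using the standard library’s colorsys module. The rotation that decides which style sits at 12 o’clock, and the exact way Saturation and Value combine into “fit,” are not specified in this post, so both are placeholder assumptions:

import colorsys

HUE_OFFSET = 0.0  # hypothetical rotation of the spectrum around the clock

def clock_position(s_idx, p_idx, i_idx):
    # Treat the SPI triple as an RGB color, then read style from its hue
    h, s, v = colorsys.rgb_to_hsv(s_idx, p_idx, i_idx)
    hours = ((h + HUE_OFFSET) % 1.0) * 12   # angle expressed as an "o'clock"
    fit = (s + v) / 2                       # assumed blend of S and V
    return hours, fit

print(clock_position(0.9, 0.1, 0.2))  # a deep-red scorer, with a strong fit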


Graphical Display

Below is a graphical depiction of the SPI Playing Style Spectrum, with the positions of 250 of the NBA’s all-time best.


Fullscreen Version

As you can see, the SPI typology encompasses Mr. Ziller’s point guard continuum, and much more. “Small two-guards” (exemplified by Barbosa, Ellis, Terry and Iverson) line up at about 1 o’clock; “Combo guards” mostly fall between 11:30 and 12:30; “Pass-first points” even more to the left; “Pure point guards” are seen at about 11 o’clock. The spectrum continues, however, to more defensive/bigger guards, more well-rounded perimeter players, point-forwards, glue guys, defensive stoppers, big men, widebodies, power forwards, pure scorers, and back to shooting guards.

One interesting use of the spectrum graphic is to make comparisons. Unsurprisingly, Kevin Johnson and Steve Nash have similar styles; Kobe Bryant and Michael Jordan are in close proximity; and Tim Duncan and David Robinson filled almost exactly the same role for the same team. It’s also interesting to make comparisons across eras: Dennis Rodman/Bill Russell, Vince Carter/Rick Barry, Michael Jordan/Jerry West, Magic Johnson/Jason Kidd, etc. It’s also possible to identify stylistic opposites: Chris Paul-David West, Shaquille O’Neal-Kobe Bryant, Allen Iverson-Marcus Camby, etc.

Here is a SPI plot for just the 2007-08 Season (note that player names are represented in abbreviated form):


Fullscreen Version

Thus far, the SPI typology is useful mostly as a classification system, but if you’re interested, I’ve spent some time looking into the relative value of certain types, as well as their interactions. There’s much more to be done in this vein, but some of the initial findings have been interesting. (APBRmetrics discussion)

Conclusions

Evidently, it’s possible to develop a comprehensive classification system of playing styles using statistics alone. Now that the SPI color scheme has been introduced, you might find it interesting to refer back to the graphics I presented last week, in which I’ve applied the scheme. It adds a dimension of information to the season and team history graphics. I’d be very interested in hearing your thoughts in the comments, as well as in the obligatory survey below.

NEW! I’ve just created desktop wallpaper-sized All-Time Great SPI Graphics. Download them and enjoy! [1024 x 768] [1280 x 1024]

The Arbitrarian: Individual Contributions To Team Success

Note for those viewing this post in a feedreader: today’s Arbitrarian is very graphics-heavy, and the images won’t show up in the feed version of the post. If you want the full experience, I suggest clicking through to see the original at HP.

Last week, we looked at player productivity, as based on box score output. Today, we’re going to look at value, which, as you will see, is somewhat different than productivity.

The MVP is not necessarily the best player in the league, nor the most efficient. Many times, the MVP award goes to the player widely considered to be the best player on one of the better teams, but when it comes to choosing among the best players on several of the best teams, there appears to be no hard-and-fast rule, and subjectivity comes into play. Today, I will propose that value ought to be quantified in terms of individual contributions to team success, where success is measured in wins.

If we can estimate the number of wins for which each player is responsible, we can do away with the arbitrary focus on only the best few teams. It is theoretically possible, for example, for the most valuable player to be an absolutely dominant but lonely contributor on a middling team, while the better teams each have enough decent players that no single one can be credited with a large portion of their success. We are still, however, left with the problem of objectively measuring each player’s contribution to team wins. To do this, I’d first like to explore…

A non-basketball thought experiment

Imagine a lemonade stand owned and staffed by Xavier, Yvette, and Zach. They make money by selling home-brewed lemonade at the end of their cul-de-sac, and only one of them staffs the stand at any given time. After their first month in business, they look at their lemonade sales revenue, and try to figure out which salesperson deserves what part of the income. One option would be to split the revenue into thirds–three employees, three parts. Zach claims that such a distribution is unfair because he worked over half of the total number of hours, while Yvette and Xavier worked about a quarter of the hours each. He claims that the distribution should thus be more like (1/4, 1/4, 1/2).

Xavier points out, however, that if they are trying to assess each employee’s value, they should try to find a more specific measure of actual revenue generated by each seller. He suggests that, since revenue is generated by lemonade sales, revenue generation should be measured in terms of the number of lemonades sold by each employee. Since they kept detailed records of such numbers, this is easy to calculate: Xavier sold 2/5 of all glasses, Yvette 1/2, and Zach just 1/10. Zach is disappointed that his pay-per-hour gambit was foiled, but must concede that this arrangement is more just–Yvette and Xavier are much better salespersons, and did more to help the company make money, while Zach mostly daydreamed during his hours on the job.

Back to basketball

What I have in mind is the application of a similar methodology to basketball. We have an excellent estimator of aggregate value–team wins; and credit for these wins can be apportioned to the players who work for those wins. There are a plethora of ways this could be done–we could arbitrarily estimate credit for each player on each team: The superstar might get 50% of the credit, the rest of the starters get 10% of the credit each, while the remainder is split amongst the bench players. Perhaps we could look at minutes played–after all, ceteris paribus, removing a mediocre player, and replacing him with a better player for the same number of minutes, should result in a greater number of team wins. Similarly, increasing the number of minutes played by a good player (to a point) should increase wins, while increasing minutes played by a bad player should lead to fewer wins.

This method isn’t foolproof, however: certain high-minute players might be daydreamer-types like Zach in the example above, while others might be feverishly productive. Consider, for example, Matt Carroll versus Yao Ming in 2007-08. Both played roughly the same number of minutes (2016 and 2044), but Yao was substantially more productive (by almost any measure) than was Carroll in that amount of time. Estimates of their value should reflect this difference.

Instead of minutes, I have chosen to use Model-Estimated Value (or MEV, discussed here) as an estimate of player productivity. There are several advantages to this choice, but two stand out. First, as discussed previously, MEV is a good estimator of per-game productivity, and so is more helpful to us than looking at, say, games played, minutes played, or points scored alone.

The second advantage comes from the fact that MEV does not perfectly capture player value. If it did, then team-level MEV would correlate perfectly with team wins, and we would not need separate measures for productivity and value. Rather, since some aspects of player value are omitted from the box score–things like defense, effort, intensity, etc–we may scale our MEV productivity estimates by team success, which does implicitly measure all of each player’s contributions.

Bruce Bowen, for example, had a 07-08 per-game MEV of 5.60, which put him below, among others, Wally Szczerbiak. Many would cite this as an example of the failings of MEV–its inability to fully measure defense (not to mention a lack of adjustment for playing time and pace) leads to an undervaluation of players like Bowen. However, Bowen’s defense does show up in the Spurs’ success–no small part of their winning can be attributed to his contributions. Similarly for Szczerbiak–his contributions are reflected in the success had by Cleveland and Seattle–that is, relatively little success. Thus, by crediting players for team success, using MEV as our measure of productivity, we may get closer to measuring each player’s actual value.

This method is still not perfect. We might still be undervaluing Bowen’s relative contribution to the Spurs, and overvaluing Szczerbiak’s contribution to his teams. However, given two players with identical MEV numbers on teams with otherwise identical rosters, the player whose MEV is “worth more” will help his team win more games.

Calculation and results

The measurement of each player’s value to their team is straightforward. Merely take each player’s season total MEV for a given team, and divide it by that team’s season total MEV. This gives us a metric I call Percent Valuable Contributions, or PVC. For Kevin Garnett in 07-08, this calculation takes his season total MEV (1,534.19) and divides it by that of the Celtics as a whole (8,282.62), resulting in a percentage (expressed as a decimal): 0.185. This means that Garnett is responsible for 18.5% of the Celtics’ success, which is a rather large portion, indeed. From here, estimation of value is very easy. Simply take this PVC number, and multiply it by team wins. This gives you each player’s BoxScores (BXS), their individual contribution to team success. The first tab on the table below depicts the numbers that go into the BoxScores calculation for each player in the 2007-08 season.



I’ve sorted each player by their PVC for the sake of comparison. In terms of value to their team, the top three players are James, Paul, and Jefferson. All three are good players, but certainly Jefferson is a step below the other two. Since BoxScores accounts for team success, we can clearly see Jefferson’s actual value is much less than that of James and Paul–he might be the most valuable player on the Timberwolves roster, but such is not a high distinction.

For more insight, note the series of players whose PVC comes in at around 0.185: Steve Nash, Joe Johnson, Kevin Garnett, Richard Jefferson, and Carlos Boozer. Even the casual fan knows that these players are not all equally valuable, though they may be equally valuable to their respective teams. Note that Nash and Boozer generated a much higher MEV total than the other three–this is largely because, as the table also shows, Phoenix and Utah had much greater team MEV totals, thanks to a faster-paced playing style. Team wins complete the picture–Richard Jefferson was responsible for 18.5% of his team’s 34 wins. His value is thus estimated at 6.29 BXS. Kevin Garnett was responsible for 18.5% of his team’s 66 wins. His value is thus 12.23 BXS. As you can see, by accurately measuring productivity (MEV), and accounting for team success (wins), we are able to objectively assess each player’s value (BXS).
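Here is a minimal sketch of these calculations in Python; the inputs are the Garnett figures quoted above, and the outputs reproduce his PVC and BXS:

def pvc(player_mev, team_mev):
    # Percent Valuable Contributions: share of team production
    return player_mev / team_mev

def boxscores(player_mev, team_mev, team_wins):
    # BXS: individual contribution to team success, measured in wins
    return pvc(player_mev, team_mev) * team_wins

print(round(pvc(1534.19, 8282.62), 3))            # 0.185
print(round(boxscores(1534.19, 8282.62, 66), 2))  # 12.23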

An aside into rated productivity

Another useful measure, especially for comparing players on poor teams, or those who played limited minutes, is what I call the Valuable Contributions Ratio (VCR). This is a pace- and playing time- adjusted metric of productivity assessed at the per-minute level. As above, this calculation is straightforward and intuitive. Merely take each player’s PVC (MEV/team MEV) and divide it by each player’s percent of team minutes played (min/team min). Thus, we are dividing a percentage by another percentage (which is why I call it a ratio–units are somewhat meaningless). This statistic controls for team pace and playing time, and is independent of team quality–it captures productivity relative to the time allowed for production.
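A minimal sketch of VCR, with illustrative (not actual) MEV and minute totals; values above 1.0 indicate a player producing more than his share of playing time would suggest:

def vcr(player_mev, team_mev, player_min, team_min):
    # Share of team production divided by share of team minutes
    return (player_mev / team_mev) / (player_min / team_min)

# A player producing 12% of team MEV in 10% of team minutes:
print(round(vcr(1000.0, 8333.0, 1980, 19800), 2))  # 1.2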

This is useful for comparing bench players, players who miss a substantial number of games, and rookies. Bench players get a “fair shake” by this statistic, because they often have less time on the floor in which to accumulate MEV toward a larger cumulative share of team success. Same for injured players–Andrew Bynum did not play very many games for the Lakers in 07-08, and as such was less valuable in terms of team wins. However, when he did play, he produced very efficiently, with a VCR of 1.36. (This means that he was responsible for 1.36% of his team’s production for every 1% of team minutes played–which is very efficient.) VCR is useful for comparing rookies, as well, since they often play relatively few minutes, and since their teams often win very few games. Rookies with high BXS are the most impressive, but more often than not, rookies don’t produce many wins. Rather, they may produce MEV efficiently, and we can see this in VCR. Among rookies with substantial playing time in 07-08, Carl Landry produced the most efficiently, with a very respectable VCR of 1.39. (The Arbitrary Rookie of the Year, Kevin Durant, was 7th among rookies by VCR, and 5th on the rookie BXS list after Scola, Horford, Moon, and Thaddeus Young. He did, however, lead all rookies in points per game. WoW Club!)

BXS MVPs and all-time greats

Look again at the table above, but this time, select the second sheet, titled “07-08 BXS.” This lists each player from the 07-08 season, on each team for which he played, and includes measures of productivity (PPG, and MEV/gp), efficiency (VCR), and value in terms of BXS. As you can see, the obvious most valuable player in 2007-08 was Chris Paul, who was responsible for over a quarter of his team’s 56 wins, for a BXS of 15.41. Kobe Bryant, this year’s Arbitrary MVP, had a good showing as well, but was responsible for almost three fewer wins than was Paul. Kevin Garnett, who was thought to be a more subjective favorite for MVP, acquitted himself nicely in objective terms, by generating 12.23 wins even while missing eleven games.

For a more historical perspective, see the third tab, “BXS Seasons.” This lists the same information, but for the 500 most valuable seasons from the population of every professional basketball season since the beginning of the NBA, even including the ABA. Unsurprisingly, Chamberlain tops the list, although his most valuable season was not his most productive in MEV terms. Shaquille O’Neal’s dominating performance for the incredible 67-win 99-00 Lakers is the most valuable season in recent memory, followed closely by Jordan’s post-first-retirement 72-win season in Chicago. Perhaps surprising, but perhaps not to those who have always appreciated the Big Ticket, is Kevin Garnett’s extremely high value in 03-04. Always a valuable player to his team, Garnett and the Timberwolves finally put it together for one great year, and Garnett’s relative value (PVC) translated to absolute value (BXS).

The final tab, “BXS Careers,” accumulates the performance of 500 NBA greats. The table is sorted by BXS82, which is the number of wins each player would be expected to produce in 82 games played, given his career performance. There may be a few surprises, but they are instructive: The first is Alex Groza, who was a great player in the early years of professional basketball, but whose career was cut short. The second surprise might be the ordering of Michael Jordan, relative to Magic Johnson and Tim Duncan. Many fans and observers would identify Jordan as one of the most valuable players ever, if not the most valuable, and here he ranks sixth at a per-game level. The first thing to note is that Duncan has always been more valuable than his box score statistics might indicate, and this is reflected in his BXS measure. Secondly, Duncan has not yet seen his productivity or value decline substantially due to aging. For the most part, Duncan’s 13.73 BXS/82 average comes from the peak of his career. Johnson retired after a relatively short career, and his comeback in 1996 was brief. Jordan, by comparison, had a second comeback for Washington during which he played 142 games of much less valuable basketball. If you exclude Jordan’s Washington years, his career BXS82 becomes 14.62, which puts him solidly above the other two.

Visualizing value

Now for the first graphical visualizations in the life of this young column. Since BXS is derived by multiplying player contributions (PVC) by team success (team wins), we can envision BXS itself as the area of a rectangle with sides of PVC and Wins. This lends itself to graphical expression, with the league as a rectangle, 41 * 30 = 1230 wins wide, and 100% high. Partitions may be made on the horizontal axis for each team, scaling each section by that team’s number of victories. Within each team’s segment, further divisions can be made for each player, according to their contribution to team success. The best explanation is an example, displayed below:


Fullscreen Version

You may use the controls at the top left to zoom in and pan across the graphic to see more detail, or an expanded overview, as you wish. The “Fullscreen Version” link directs you to a much larger version of the same display, which may be easier to grok. The graphic above displays BXS for the 2007-08 season, with team success increasing from left to right, and player contributions increasing from bottom to top. Colors reflect each player’s statistically derived playing style, where red indicates a propensity for scoring, green denotes perimeter play, and blue highlights interior-play tendencies. Much more will be said about this classification scheme next week.
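For the curious, here is a rough sketch of how such an area diagram can be drawn with matplotlib. Only two teams are shown, rosters are truncated to three names plus a remainder, and the PVC figures are approximations pulled from the discussion in this post, so treat the data as purely illustrative:

import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# (team, wins, players stacked bottom-to-top with approximate PVC shares)
teams = [
    ("NOH", 56, [("others", 0.42), ("Chandler", 0.14), ("West", 0.18), ("Paul", 0.26)]),
    ("BOS", 66, [("others", 0.525), ("Allen", 0.13), ("Pierce", 0.16), ("Garnett", 0.185)]),
]

fig, ax = plt.subplots()
x = 0.0
for name, wins, roster in teams:  # listed in order of increasing success
    y = 0.0
    for player, share in roster:
        # Width is the team's win total; height is the player's PVC
        ax.add_patch(Rectangle((x, y), wins, share, edgecolor="white"))
        y += share
    ax.text(x + wins / 2, 1.01, name, ha="center")
    x += wins

ax.set_xlim(0, x)
ax.set_ylim(0, 1)
ax.set_xlabel("Cumulative team wins (column width)")
ax.set_ylabel("Share of team production (PVC)")
plt.show()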

Several things I’d like to point out in the graphic above to get you started: First, note how tall the rectangles of Chris Paul, LeBron James and Al Jefferson are–this is because height is scaled according to PVC, and these three players were the most responsible for their teams’ success.

Looking across the top row of the graphic, we can identify each team’s Most Valuable Player. For the lowly Heat, Dwyane Wade was most valuable, despite injury. Calderon was most valuable in Toronto (7.1 BXS), though Bosh was a very close second (7.0 BXS). The most valuable player on the best team was Kevin Garnett, but since he had a very supportive team behind him, his individual value was somewhat less than that of Chris Paul, whose supporting cast drops off substantially in terms of contributions after Stojakovic.

One useful perspective granted by displaying contributions in this manner is that it is easy to compare units across teams. For example, Boston was famed for the Big Three of Garnett, Pierce and Allen. Using the scale on the right of the graphic, we can see that together, these three accounted for almost half of Boston’s success. Looking across the graphic from the lowest part of Allen’s rectangle, however, we can see that the big three most valuable to its team can actually be found in New Orleans, where Paul, West and Chandler can be credited with almost 60% of the Hornets’ success. On the other side of the coin, Detroit, Houston, and Chicago all got a fairly balanced set of contributions, as their subjective reputations might have suggested. I would be very interested to hear about your own observations, as well as your opinions as to how well this graphic meshes with your subjective impressions, in the comments.

I’ve also developed an interactive presentation of the graphic above, with even more detailed statistics. Just follow the link below to the Interactive BoxScores Explorer page. The league-wide graphic has been scaled to fit in your browser window, and players’ statistical details pop up on mouseover. Try it–it’s somewhat addictive.

Make sure to click around a little bit–I’ve created “player cards” for each individual, which display even more detailed statistical information, including their playing style, most and least similar players, the mean and standard deviation of their “counting” statistics, and a season-long sparkline of their productivity. Feel free to use them in any application you wish.

The player cards are a quick and easy way to assess any player. For fantasy purposes, for example, if you’re comparing two players with similar averages in assists, you might want to pick the player with the smaller standard deviation about that mean, as indicated by the error bars in the middle section of the card. Alternatively, if you are interested in whether a certain player tends to produce more as the season goes on, the seasonal trends should give some insight into this, as well as how long it takes the player to recover fully from injury, or how much they produce when their minutes go up. Also, I’ve included each player’s most- and least- similar match, based on the 07-08 season, which can help give you an idea of the niche they fill on their team.

Historical BXS franchise timelines

Another excellent use of this BXS area diagramming visualization is to display franchise histories. Better years are represented by wider segments, and the best players rise to the top with tall rectangles. Eras can be identified by patterns in color. Here are two examples, the first depicting the LA Lakers franchise, and the second, Boston Celtics history:


Fullscreen Version

Several eras stand out in the graphic above. The Mikan era was eventually replaced by the Baylor/West dynasty which became the Chamberlain/West years. Notice, incidentally, how West becomes “greener” over the course of his career, indicating a shift away from focusing on scoring, and toward a concentration on other perimeter contributions. A Kareem era follows, though his best years are at this point behind him, and massive team success comes only with the addition of Magic Johnson. Johnson leads the Lakers for ten consecutive years and his retirement marks the end of an era of dominance. LA returns to form in the late 1990s in the hands of O’Neal and Bryant, who turn in some incredible performances–interestingly, there is an obvious breakpoint between 2001 and 2002 on the graphic indicating the switch from the Lakers being “Shaq’s team” to being “Kobe’s team.” The 2006 version of Bryant was forced to carry the scoring load to a massive degree, but the 2008 version (as is evidenced by a much less red color), has been freed up to focus less on point production, and more on doing other things to help his team win. Perhaps the 08-09 season will see Gasol and Bynum float to the top of the column, turning in full, healthy seasons for a very successful LA team.


Fullscreen Version

Boston history is marked even more clearly by the careers of its greatest players. The Celtics of the 1960s are consistently topped by Bill Russell’s defensive-interior blue, and bolstered by some great scorers, like Havlicek and Jones. The 1980s saw a parallel to the Lakers above, in which a perimeter player (here Bird) led a team supported by strong interior scorers. The 1986 and 1987 Celtics offer an interesting starting lineup of all greens and blues–no single player was responsible for most of the scoring, while other types of contributions were made by all. After years of little success and narrow columns, the Celtics finally turned it around last season with the addition of Garnett and Allen, almost tripling 06-07’s win total.

The final graphic below offers an alternative take on presenting BoxScores, by tracing the careers of each of 50 NBA greats. Following the peaks and valleys of each player’s tenure, we can also see the years in which many stars were shining brightly. 1972 was a great year for the sport, as was 1990–many of the NBA’s greatest players had good seasons in these years, and the league may be seen to peak at these points. The graphic makes it possible to see the beginning of new eras–witness the start of Bird’s and Johnson’s careers, followed shortly by the rookie seasons of Thomas, Drexler, Jordan, Olajuwon, Stockton, and Barkley. We can see a big dip in strike-shortened 1999, and then another in 2004. This second dip may be troubling–has the quality of the league declined so sharply? Worry not–many modern greats retired in the years just before 2004, meaning that their layers drop out of the picture, setting the stage for the new era of NBA stars we are witnessing today.


Fullscreen Version

Conclusion

As always, I’d very much like to hear your opinions of BoxScores as a measure of value, as well as whether or not you think it gets things right. Was Magic really at his peak in 1987? Were Garnett and Pierce in 2008 in the same league as Bird and McHale in their prime? Should Chris Paul have been the MVP this past season? Were the Rockets really the most balanced of the good teams last year, and will the addition of Ron Artest make them that much more indomitable? Please feel free to leave a comment and take part in the now-customary brief survey below. Next week, I’ll go into much more detail on a new way to describe players, without having to use all those pesky words.

PostScript: Discussion of the “50-Win Standard”
I hope you took the time to read Josh Tucker’s excellent discussion on the established precedent of giving the MVP award only to players on teams with fifty or more wins. I have a few thoughts on how the 50-win minimum precedent fits in with the BoxScores methodology I’ve established here.

The first is that I essentially agree with the implied criteria of such a cutoff (or the implementation of the “Bryant-Nash Rule”). That is, I think that value should determine the MVP, and value is measured in wins, not strictly in individual statistics.

However, as an Arbitrarian, I would tend to shy away from establishing an arbitrary (though precedented) line of demarcation between those who should and should not be under consideration. If you put Wilt Chamberlain on a team with 11 kindergartners, and that team won 41 games, I’d want to consider Chamberlain for MVP. That is, there should be a sliding scale, in the sense that each of the Detroit Pistons individually is less valuable than a single LeBron James, though the Pistons collectively tend to do better than do the Cavaliers collectively.

This is built right into the estimation of BoxScores: A player contributes X% of the production for a team with Y wins, and so he is credited with X•Y of those wins. Fortunately, from the standpoint of the 50-win precedent, as team wins decrease, it gets harder and harder for any player to outproduce a player on a 50+ win team.

For example, imagine a season in which player A contributes 20% of the production for a 50-win team. (20% is on the low end for MVP-candidate PVC, and 50 wins is at the low end for a contender as well, so this is a conservative estimate for an MVP frontrunner.) Such a player accumulated 0.2*50 = 10 BXS. Player B, let’s say, is on a 41-win team. In order to be more valuable than A, B would have to be responsible for 10/41 = 24.4% of his team’s production, which is very high, indeed. In 07-08, only two players had more than a quarter of their team’s valuable contributions, and one of those was the BoxScores MVP, Chris Paul (whose team had 56 wins). LeBron James was the other, and despite contributing more than 2/7 of his team’s production, Cleveland’s win total of 45 reflected James’ second-most-valuable status.
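The comparison is easy to sketch, and reproduces the 24.4% figure above:

def required_pvc(target_bxs, team_wins):
    # PVC needed to match a rival's BXS, given one's own team's win total
    return target_bxs / team_wins

benchmark_bxs = 0.20 * 50                         # 20% of a 50-win team: 10 BXS
print(round(required_pvc(benchmark_bxs, 41), 3))  # 0.244 on a 41-win team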

In sum, even if we use BoxScores as our measure of value, it is highly unlikely (although not impossible) that the MVP will come from a sub-50-win team. The precedent will likely remain intact.



The Arbitrarian: Marginal productivity of box score statistics

David Sparks is the contributing statistics writer for Hardwood Paroxysm. His Arbitrarian column runs every Thursday here at HP. For more of his work, you can read his blog. This week’s entry is indeed a true stats column, and is probably the first post on here in a while that doesn’t have the words “snake eggs” in it. David’s our classy guy. This week’s discussion is on his own work with Box Scores. Enjoy.

Thus far, you’ve gotten to read me wax philosophical and discourse on the ideas of others. Today, I’m going to enter the arena, so to speak, and present some of my own work.

Imagine for a moment that you’re interested not only in estimating player value (as in “Most Valuable Player”–not the best player, nor the most talented, nor the most clutch; value is a direct function of productivity, not ability), but in estimating it well, and doing so for essentially all of professional basketball history. Perhaps you could use (adjusted) plus/minus? Well, no, unfortunately, the play-by-play data necessary to construct plus/minus goes back only a few seasons–no one was keeping track of Bill Russell’s on-court versus off-court team scoring totals.

It would be nice if we had a good way of measuring defense, other than just blocks and steals–maybe it would be possible to pore over video of every game ever played and count the number of “shots changed” and “ball-handlers pressured” for each player… except I’m not sure if video would be available for all 50,000+ games played. Last week, when I asked if there was still room for development in NBA analytics, the overwhelming response was “yes” and the second most overwhelming response was “Defense!” Apparently, it is well-known that box score stats fail to capture some of what makes a player a good defender. Two commonly-cited examples of good defensive players undervalued by traditional statistics are Shane Battier and Bruce Bowen; both are often assigned to guard the opponent’s best perimeter player, but judging from box score statistics alone, it might be hard to see why.

If one is interested in historical comparison, the data options are somewhat limited. Even certain box score stats, like steals, blocks, three-pointers (which are a relatively modern addition to the rulebook), and offensive/defensive rebounds have not been tracked for all of basketball history. However, I contend that for any season prior to roughly 05-06, box score-based metrics are the best option, given that they are essentially the only option. Further, what I propose here goes a long way toward indirectly capturing some “unmeasured” defensive ability, and though it may still be systematically biased against certain lockdown-type defenders, such players are (subjectively) relatively rare.

Defining value through productivity

I will go into much more depth next week on the topic of value, but for now, I will suggest that value is a function of productivity. In “counting stat” terms, basketball productivity can be seen as the accumulation of points, rebounds, steals, personal fouls, and so on, by a player or group of players. However, each of these possible production items is worth something different: a player who contributes 5 fouls in a game is certainly affecting the final score in a different way than a player who contributes 5 points in a game, ceteris paribus. Offensive and defensive rebounds might be differentially productive, as might be missed free throws and missed field goals. It should be fairly obvious to most observers of the game that merely “adding the good and subtracting the bad” is not an appropriate way to estimate productivity (see “Efficiency”), though it may be better than focusing heavily on scoring numbers alone.

That different box score contributions have different values is generally widely accepted; a problem arises in identifying the appropriate/actual set of weightings to use. Is an assist worth one-half of a point? How much more (or less) is an offensive board worth than a defensive rebound? The problem, as I’ve noted before, is that a statistic can be developed to support any conclusion you wish to find. Do you think that the Allen Iverson/Carmelo Anthony duo is the greatest of all time? Weigh scoring heavily relative to other contributions, and assign small (if any) negative values to missed shots. Think Mark Eaton’s and Dikembe Mutombo’s defensive prowess makes them the best ever? Well, when you consider that a blocked shot prevents two points and may also give the blocking team possession, it’s really worth three times the value of a point–it all adds up. My point is that, intentionally or not, biases may easily slip into our analysis. This is why it is important to make public any metric-determining methodology, and subject it to review and criticism.

At any rate, I plan to construct a productivity metric based on a linear-weighting system not too dissimilar from that of Berri and Hollinger, although it differs in the exact weights, and makes fewer “adjustments.” Such linear systems are often criticized, but as I have outlined above, they are one of only a few options open to those with an interest in assessing the players of the past. Further, my value metric (as opposed to my productivity metric, if you’re still with me… there is a difference) incorporates more than just the linear-weighting system, as you will see next week. The key contribution I’m making today is to put forward what I believe to be highly significant, verisimilar linear regression results that help us find “true” weightings.

A data problem

I will not bore you with the details, but this is an endeavor I have attempted many times. Regression analysis allows us, in one interpretation, to estimate the marginal value (in terms of a dependent variable) of an additional unit of an independent variable, on average. For example, a model estimating baseball production might find that for every additional home run hit by a team, their runs scored total increases by 1.44. In baseball, regressing runs scored on things like singles, doubles, triples, home runs, steals, ground-into-double plays, walks, etc. works like a charm (maybe I’ll post this analysis if it’s a very slow news day, but I imagine the baseball metricians have already covered it).

In basketball, at the season level, such is not the case. Regressing wins on box score stats doesn’t really seem to work (by which I mean coefficients which “should be” positive come out negative, for example), nor does using average point differential, points scored, points against, and so on as the dependent variable. One option is to do as Dr. Berri has done, and develop a somewhat indirect, albeit reasonably convincing, system by which to connect individual player productivity to team success. (See his 1999 paper here.) Another option is to increase the resolution, and use game-level data:

Box score contributions to team scoring margin

Using a sample of tens of thousands of modern NBA game box scores, I set up a regression using the following formula¹:

MARGIN = B1 + ISHOME*B2 + MIN*B3 + UBX*B4 + FTX*B5 + AS*B6 + OR*B7 + DR*B8 + ST*B9 + BK*B10 + UST*B11 + PF*B12 + OUBX*B13 + OFTX*B14 + OAS*B15 + OOR*B16 + ODR*B17 + OST*B18 + OBK*B19 + OUST*B20 + OPF*B21


Where:

  • MARGIN = Team total points scored less opponent total points scored
  • ISHOME = A dummy variable indicating whether or not the team of interest is playing at home
  • MIN = Duration of the game in minutes
  • UBX = Un-blocked missed field goals = team missed field goals less opponent blocks
  • FTX = Missed free throws
  • AS = Assists
  • OR = Offensive rebounds
  • DR = Defensive rebounds
  • ST = steals
  • BK = blocks
  • UST = Un-stolen turnovers = team turnovers less opponent steals
  • PF = Personal fouls
  • The “O” prefix indicates the same variable measured for the team’s opponent

This regression returns the following output:


Residuals:
Min 1Q Median 3Q Max
-21.90690 -3.57014 -0.04017 3.56452 22.23100

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.124810 1.113502 -1.010 0.3124
mp 0.009060 0.004918 1.842 0.0654 .
ishome 0.070037 0.073806 0.949 0.3427
tubx -1.059533 0.013189 -80.334 <2e-16 ***
tftx -0.606574 0.013822 -43.886 <2e-16 ***
tas 0.346423 0.007267 47.669 <2e-16 ***
tor 1.052038 0.015221 69.117 <2e-16 ***
tdr 0.531251 0.013246 40.107 <2e-16 ***
tst 1.580819 0.012076 130.903 <2e-16 ***
tbk 0.952582 0.016660 57.177 <2e-16 ***
tust -1.462616 0.014012 -104.381 <2e-16 ***
tpf -0.209380 0.009467 -22.116 <2e-16 ***
oubx 1.004537 0.013118 76.578 <2e-16 ***
oftx 0.567970 0.013906 40.843 <2e-16 ***
oas -0.352181 0.007303 -48.223 <2e-16 ***
oor -1.007247 0.015194 -66.294 <2e-16 ***
odr -0.491897 0.013352 -36.840 <2e-16 ***
ost -1.625631 0.011970 -135.807 <2e-16 ***
obk -1.009805 0.016927 -59.657 <2e-16 ***
oust 1.433541 0.013909 103.065 <2e-16 ***
opf 0.240950 0.009476 25.427 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.292 on 24689 degrees of freedom
Multiple R-Squared: 0.8492, Adjusted R-squared: 0.849
F-statistic: 6949 on 20 and 24689 DF, p-value: < 2.2e-16
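For those who want to replicate this at home, here is a sketch of how the same regression might be fit in Python with statsmodels (the output above appears to come from R). The file name is hypothetical, and the sketch assumes one row per team-game with the column names used in the output:

import pandas as pd
import statsmodels.api as sm

cols = ["mp", "ishome",
        "tubx", "tftx", "tas", "tor", "tdr", "tst", "tbk", "tust", "tpf",
        "oubx", "oftx", "oas", "oor", "odr", "ost", "obk", "oust", "opf"]

games = pd.read_csv("game_boxscores.csv")  # hypothetical data file
X = sm.add_constant(games[cols])           # adds the intercept term
fit = sm.OLS(games["margin"], X).fit()     # ordinary least squares
print(fit.summary())                       # comparable to the table above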

Here are the coefficients, along with standard errors, expressed in graphical form:

Note that the standard errors are all pretty small (essentially invisible), and all of the coefficients are significantly different from zero.

To arrive at the weightings I use for my linear productivity estimator, I averaged the magnitude of the Team and Opponent coefficients for each statistic, resulting in the following weights:


tubx -1.0320351
tftx -0.5872716
tas 0.3493022
tor 1.0296423
tdr 0.5115741
tst 1.6032249
tbk 0.9811934
tust -1.4480786
tpf -0.2251647
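The averaging step itself is trivial; here is a sketch over a subset of the statistics, which reproduces the corresponding weights above to rounding:

# Team and opponent coefficients from the regression output above
team = {"ubx": -1.059533, "ftx": -0.606574, "as": 0.346423, "st": 1.580819}
opp  = {"ubx":  1.004537, "ftx":  0.567970, "as": -0.352181, "st": -1.625631}

for stat in team:
    # Average the magnitudes, then restore the team-side sign
    weight = (abs(team[stat]) + abs(opp[stat])) / 2
    print(stat, round(weight if team[stat] > 0 else -weight, 6))
# ubx -1.032035, ftx -0.587272, as 0.349302, st 1.603225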

The great thing here is that (almost) all of these weights seem to make perfect theoretical/subjective sense: A missed field goal is worse than a missed free throw, since many missed free throws are the first of two attempts and cannot be rebounded, and the shooting team is often in a better position to defend the counterattack that follows a defensive rebound of a missed free throw. Offensive rebounds are worth more than defensive, because though both indicate the capturing of a possession, an offensive rebound puts the team in a better position to score than a defensive rebound, after which the ball must be moved up the court, with a turnover or a more difficult shot attempt the likelier result. A steal is worth (slightly) more than the typical turnover, because a possession change resulting from a steal probably results in an easier shot attempt than a possession change coming from, for example, an inbound after a three-second violation. Personal fouls, though they sometimes result in free throws for the other team (and are thus detrimental), are also often used to prevent an easy two-point scoring opportunity or to disrupt the flow of an offense, and are often employed for strategic purposes with the intent of increasing the fouling team’s score relative to that of their opponent.

The only theoretically problematic coefficient is…

The troublesome assist

I believe the regression results. Given the apparent verisimilitude of each other coefficient, I think that these estimates are reasonably accurate reflections of reality, and that each additional assist adds only 0.348 to the final margin, on average. However (subjectivity alert!), I do not think that an assist fully captures the contribution of the player doing the assisting. Not only are many good passes made on missed field goals, but some credit might be given to players for moving the ball up the court, running the offense, etcetera, above and beyond attribution for the single penultimate act of passing to the player who scores. Thus, since without such an adjustment, point guards are almost entirely absent from the upper echelons of the productivity list, I re-estimate the assists coefficient:

To do so, I regress team and opponent assists alone on final margin. Using the resulting coefficients (1.196574 and -1.188791, respectively), I take an average as done above, to find my operating coefficient: 1.192683.
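
In R, that re-estimation might look something like the following sketch (again assuming the hypothetical games data frame):

fit_as <- lm(margin ~ tas + oas, data = games)
b <- coef(fit_as)                    # reported above: tas ~ 1.196574, oas ~ -1.188791
(abs(b["tas"]) + abs(b["oas"])) / 2  # ~ 1.192683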

Thus far, our coefficients allow us to approximate the number of points a player helped to create for his team, the number of points a player prevented his own team from scoring, the number of points a player allowed the other team to score, and the number of points he prevented them from scoring. To this, we add the most direct contribution to winning margin: points scored. Each player is credited with “all” of his points–there is a direct, one-to-one relationship between each additional point scored and final scoring margin. Thus, I give you an elegant linear-weighted box score-based productivity metric, Model-Estimated Value:

MEV = pts - 1.032*fgx - 0.587*ftx + 1.193*as + 1.030*or + 0.512*dr + 1.603*st + 0.981*bk - 1.448*to - 0.225*pf

Note: In past seasons, offensive and defensive rebounds were not recorded separately. Thus, for such years, I replace the OR and DR factors with (total rebounds) * 0.669, which is the weighted average of the value of all rebounds since offensive and defensive boards have been counted as distinct. Also in years past, turnovers, blocks, and steals were not tracked. I feel that it would be inappropriate to impute estimates of such statistics for historical players, and so I am more or less content to allow no penalty for all unrecorded turnovers past, nor give credit for uncounted blocks and steals. The value metric I’ll detail next week should make this a more comfortable accommodation.
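
To make the arithmetic concrete, here is a minimal R sketch of MEV as a function of a single box score line. The argument names are illustrative assumptions, and the historical fallback follows the note above (for seasons without OR/DR splits, call with trb = total rebounds; unrecorded steals, blocks, and turnovers would simply be entered as zero):

mev <- function(pts, fgx, ftx, as, st, bk, to, pf, or = NA, dr = NA, trb = or + dr) {
  # use the OR/DR split when available; otherwise fall back to 0.669 * total rebounds
  reb <- ifelse(is.na(or) | is.na(dr), 0.669 * trb, 1.030 * or + 0.512 * dr)
  pts - 1.032 * fgx - 0.587 * ftx + 1.193 * as + reb +
    1.603 * st + 0.981 * bk - 1.448 * to - 0.225 * pf
}

mev(pts = 30, fgx = 10, ftx = 2, as = 5, st = 2, bk = 1, to = 3, pf = 2,
    or = 1, dr = 4)  # about 26.9 MEV for this hypothetical line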

Pre-emptive rebuttals to likely criticism

Certainly this metric is not perfect, and there are many criticisms which could be leveled against it. Here, I will try to address some likely concerns, while avoiding straw men.

C: MEV is box score-based, and so fails to adequately capture, among other things, defense, hustle, heart, desire, clutch, etc.

R: I tried to address this to some extent in my preamble above. I would be happier if MEV did a better job of capturing all aspects of the game (especially defense, though the enhancement I detail next week helps somewhat), but given data restrictions, I have decided that box score statistics are a necessary evil for a universally applicable estimator.

C: A steal (rebound, three-pointer, turnover, etc.) in the last seconds of a close contest is worth much more than at another point in time, and is certainly worth more than in a blowout contest.

R: The first clause is highly debatable: a steal made in the middle of the second quarter might obviate the need for any late-game heroics, and all points scored are given equal credit in their accumulation toward the final score. The second clause is similarly misguided: any additional box score stat will contribute just as much to the final scoring margin, on average, in any game.

C: You keep saying “on average,” but there is no “average” blocked shot. Some are rebounded by the shooting team, some are swatted out-of-bounds, others prevent the game-tying shot, etc.

R: I say “on average” because that is what my methods permit me to say. Part of this is a data availability problem. Until the day we have exhaustive categorizations of every single event and its result, for all NBA games past and future, I am content to make do with the average. Further, over the course of many observations, the averages should not systematically bias the estimates in favor of, or against, any single player. Michael Jordan had many “significant” field goals, but he also had many less “significant” ones.

Incidentally, this argument is often proffered by those opposed to statistical approaches in general. It may indeed be true that some nuance is lost when dealing with recorded numerical observations of the game as compared to narrative, subjective observations. However, it is my contention that the gains in objectivity, accuracy, and consistency afforded by a statistical approach vastly outweigh the losses associated with the possibility that Big Shot Rob doesn’t get more credit for his Biggest Shots (in fact, he will get some credit next week). Further, as I have mentioned before, I do not see qualitative and quantitative approaches as a strict dichotomy.

C: MEV overweights/underweights statistic X, Y, and Z.

R: I have attempted here to be as transparent as possible in detailing exactly how I arrived at my estimates. I think there may exist some room for disagreement on some of the scalars, but I have detailed the reasons that these coefficients are both theoretically satisfying and empirically derived. I would be willing to consider an argument with a sound theoretical basis and empirical verification (by which I mean: run your own regression), but for now, I am very comfortable with the weights as they stand.

The one exception is the value credited to an assist, which I may have under-justified. I do feel like (subjectivity alert again!) 1.192 is not an unreasonable amount of credit, falling as it does between the value of an offensive rebound, block, or missed field goal, and the value of a made two-pointer, turnover, or steal. Also, one would have to feel bad for all those point guards who spend all their time trying to pass instead of shooting, and hardly get any credit for it. Please, think of the point guards.

C: MEV should, but does not, account for pace, playing time, strength of opponent, and the quality of one’s teammates.

R: You are right that it does not, but next week I will deliver a pace-agnostic value metric. Further, I am interested in measuring productivity and value, not quality, ability, or technique (all of which are much harder to measure). Productivity per unit time will be addressed next week, but corrections for teammates, opponents, or positions played have nothing to do with production. If a player scores a point, it matters not where he is, how big he is, or who else is on the court; it still adds +1 to the final margin. In the playoffs, when the stakes are high, and There Can Be Only One, a missed shot is still going to set your team back about 1.032 points. I may, at a future date, look into estimating quality or talent, but for now, I’ll leave that to my more subjective brethren.

C: Team-level MEV does not correlate well with team wins, and even if it does, that’s only because points are included.

R: Though MEV does correlate positively and significantly with team wins, this is not a relevant concern: MEV is derived directly from game-level scoring margin, and teams win games exactly when that margin is positive. Further, next week I will introduce a value measure which incorporates MEV and, at the team level, correlates perfectly with team wins.

The most productive

For those of you who have stayed with me, here’s the payoff. Using MEV, as derived above, we can estimate the productivity of every player who has ever played professional basketball. Here is a table of every player (one row per team played for) for the 2007-08 season, sorted by a commonly seen value measure, points per game:

Now, click on the “MEV/G” tab at the bottom to see the second sheet, which ranks each player by MEV per game. The list changes fairly substantially. King James, who has a pretty well-rounded game, is still near the top. But Bryant and Iverson drop a spot or two, as do Wade and Anthony. Where does Kevin Martin go? Michael Redd? Richard Jefferson? Corey Maggette? Kevin Durant??? On the other side of the coin, here come Chris Paul, Dwight Howard, Kevin Garnett, and Deron Williams, rising to the top of the productivity rankings. Click on the third tab, “Value Added,” to see each player’s MEV less points scored, per game. This is an estimate of the non-scoring ways in which each individual helps his team and hurts the other team. Pass-first point guards, defensive-minded bangers, and well-rounded contributors rise to the top. Chuckers (see: Ben Gordon), often characterized by flashy scoring numbers, sink to the bottom. These players still contribute positively through their ability to score, but their positive value is diminished by the shots they miss, the turnovers they give up, and the other things they fail to do to help their team improve that final margin.

What if we expand our analysis to the careers of the NBA’s all-time greats? Below is a set of three tables, mirroring those above, except that it covers the duration of 500 of the NBA’s most productive playing careers, according to MEV.

Jordan’s and Chamberlain’s greatness is still validated by MEV; both players contributed through much more than just scoring. Other NBA legends, such as Bill Russell, Magic Johnson, and Oscar Robertson, however, are inadequately captured by their PPG numbers. Again, at the bottom of the Value Added barrel, we see some famous score-first players.

Conclusion

I hope you have found this loquacious discourse both interesting and convincing. I have attempted to develop a theoretical grounding for the appraisal of player value, and used empirical data to estimate a set of scalars with a high degree of face validity. I believe that much of the justification for the accuracy of this metric can be found in its application to actual players. Many individuals commonly known to contribute above and beyond their scoring ability are identified as such by MEV, while those whose points come at a cost are likewise singled out. It is my impression that this productivity estimator finds a happy medium, at which theory meets regression output; scorers are punished for missing, not for just shooting; and credit and blame are meted out fairly.

Please come back next week, when I will go into similarly lengthy detail about value estimates!

¹ This analysis is somewhat similar to that performed by Dan Rosenbaum in estimating statistical plus/minus. I encountered his work after estimating my own regression, and tend to prefer my variable choices and results, but in the interest of openness, I wanted to reference this prior work.

The Arbitrarian: A Statistical Primer

The Arbitrarian is the resident smart kid of HP. He sits in front, answers all the questions, always shows his work on the math homework, and volunteers to clean the chalkboard. We all make fun of him, and then in thirty years he’ll be making four times what we do while we’re slinging shakes at the local Chick-fil-A. His Arbitrarian column runs every Thursday here at Hardwood Paroxysm. You can read more of his work at his own blog. This week he begins his column with a discussion of relevant basketball stats. -MM

In many academic articles, the author often begins by citing previous work that set the stage for what he or she is writing about. This is often called a “literature review,” and it discusses some of the strengths and weaknesses of past theoretical and empirical work, often with an eye toward explaining the need for their particular contribution. This post will serve as an introduction to some of the so-called “advanced” statistics, and I’m counting it as my literature review. Please forgive me if much of this is exceedingly basic or familiar to you–I hope that this single post can help most readers get “on the same page,” with respect to some more recent developments in statistical analysis.

Most basketball fans are familiar with what I call the “counting” statistics, which are simple sums of the number of times each player or team records something tracked in the box score. Minutes, points, field goal attempts, personal fouls, etc., all fall under this category. Another set of statistics almost everyone uses are what I’ll call “simple ratio” statistics, wherein one counting statistic is divided by another. In baseball, people often cite batting averages; in basketball, we often see points per game, free throw percentages, even assist-to-turnover ratios. By and large, this level of sophistication is sufficient for most fans: scoring average is probably the single most highly regarded estimator of player quality among the vast majority of fans, and to be sure, PPG correlates positively with productivity.

Somewhat less commonly seen is the use of per-minute statistics. Recognizing that some players, by virtue of playing longer minutes, have more opportunity to score, collect rebounds, etc., it is sometimes useful to compare statistical production at the minute-level, which allows “fairer” (in some sense) comparisons among, for example, bench players and starters, or point guards and centers (who typically play fewer minutes). Adjustments are also often made by position, with the idea that, for instance, shooting guards as a group aren’t in as good a position to rebound as are power forwards, and so a shooting guard’s rebounding prowess should be measured against others playing that position.
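
As a quick sketch of the per-minute idea in R (scaling to 36 minutes is one common convention; the numbers here are hypothetical):

per36 <- function(stat, minutes) 36 * stat / minutes  # per-36-minute rate from season totals
per36(stat = 1100, minutes = 2400)                    # 16.5 points per 36 minutes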

Volumes could be written detailing each permutation and variety of statistic, but I’ll include only one more specific example. At the team level, team success is closely related to the scoring differential–the average difference between a team’s and its opponents’ points scored. Also, teams that score many points per game are not necessarily the best offenses, nor are teams that give up many points per game necessarily the worst defenses. Due to the differences in the pace at which various teams play, own and opponent scoring totals are not good indicators of the quality of an offense or defense. Rather, by estimating the number of offensive and defensive possessions each team sees in a game, analysts often look at “efficiency.” Offensive efficiency divides points scored by possessions (higher is better), while defensive efficiency divides opponent points scored by opponent possessions (lower is better). It turns out that this is much more useful than simple points for or points against averages on their own.
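
A commonly used possession approximation (an assumption on my part, one of several in circulation) yields a sketch like the following, scaled to points per 100 possessions. Feeding in a team’s own totals gives offensive efficiency (higher is better); feeding in opponent totals gives defensive efficiency (lower is better):

# estimate possessions from box score totals; the 0.44 free throw factor
# is a standard approximation used in many possession formulas
eff <- function(pts, fga, orb, tov, fta) 100 * pts / (fga - orb + tov + 0.44 * fta)
eff(pts = 8000, fga = 6600, orb = 900, tov = 1200, fta = 2100)  # ~102.2 per 100 possessions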

Standing on their shoulders

I originally planned on briefly listing some of the better-known basketball analysts, along with a little bit of background and a critique of their methodology. Fortunately for all of us, most of that task has been very competently accomplished already, at NBAStuffer.com’s “Analytics 101.” I would highly recommend perusing that extensive collection of links, to familiarize yourself with some of the work that is being done, as well as some of the very capable individuals involved. Also, I would like to direct you to a post at the APBRmetrics Forum for the link to and discussion of a comparison of many of the more widely-used “advanced” metrics. In fact, while at the APBRmetrics forum, take a look around… much of the debate going on there is on the very cutting edge, and it won’t take long to get a sense of the disputes that still rage: how can we measure defense? Does efficiency decrease with use? Are there diminishing marginal returns to player productivity? Etc, etc. The individuals posting in that forum constitute a large portion of the most capable and intelligent basketball analysts working today.

Since the basics have been so thoroughly covered by others, I will briefly consider several of the most widely used statistics, and offer my opinion on their strengths and weaknesses. It is important to note that I feel that the use of more than one approach (keeping in mind the strengths and weaknesses of each) permits a much more well-rounded and robust analysis of any problem. Further, my opinion is not authoritative on any of the material covered here, and I welcome a discussion on the relative merits of each methodology.

Plus/Minus (Raw, Adjusted, or Statistical)
Background: Rosenbaum (see also), Lewin (also), Witus, Ilardi

Pros: Arguably the most computationally intensive of the metrics I will discuss here, the plus/minus statistic, which is now being officially tracked by the NBA, has a lot to recommend it. One of the most useful aspects of this statistic is that it accounts for defense better than any metric based solely on box score statistics. The other nice thing is that well-reasoned, well-applied statistical methods have been employed in converting between “raw” plus/minus and “adjusted” plus/minus, in order to control for the quality of a player’s teammates and opposition. Further, plus/minus figures are often counter-intuitive. This by itself is not always a good thing, but it may indicate that this particular measure tells us things about the game that other, more conventional methods keep hidden. As this listing indicates, many of the players we might expect to top the list are at or near the top, but the orderings are sometimes somewhat surprising (see “cons” below, however). There is also the added value of having separate offensive and defensive ratings, to identify those players who are especially undervalued by offensively oriented box score stats. As it stands currently, plus/minus is possibly the best single-number estimator of a player’s influence on the game, at least for contemporary players.

Cons: Personally, I can offer very little to recommend against plus/minus, especially in its adjusted form. There are only two criticisms I can muster. First, you can see from the list linked above that each estimate is accompanied by an error term. Unless I am mistaken, this means that a player with an Adjusted +/- of 14 and an error term of 11 is 95% likely to have an actual +/- value between 3 and 25 (incidentally, it also means there is a 1 in 20 chance that the actual value falls outside those bounds, but this is conventionally disregarded). This example (Dwight Howard in 07-08) is a particularly egregious one, but it exemplifies the problem of relying exclusively on +/-: within the range of their error terms, it is difficult to identify the correct ordering of any set of players, much less their exact value in terms of points. It is, nevertheless, very instructive to review players’ +/- ratings; certain unheralded players, possibly undervalued by box score methodologies, often show up at the top of these lists, notably the Pistons’ Amir Johnson and Houston’s Chuck Hayes, who led the league in defensive +/- rating last season.

My other quibble is more pragmatic than theoretical: at this point, the public only has access to a few seasons’ worth of plus/minus data. While this has very little to do with the validity of the statistic as an estimator of value, it makes historical comparison essentially impossible. Sadly, to the extent that one is interested in comparing players across eras, plus/minus becomes a less functional tool.

Player Efficiency Rating (PER)
Background: Hollinger (at ESPN), Wikipedia

Pros: My understanding of PER is that Hollinger developed the weightings he employs on a theoretical (rather than statistically derived) basis (please correct me if I am wrong). This doesn’t necessarily make it better than any other metric, but it is at least a somewhat unique and thoughtful approach to the problem of value assessment. Also, PER adjusts for pace, unlike many of the more conventional statistics with which we are familiar, and this helps control for the advantage held by players on “run and gun” teams, due to the greater number of opportunities they have to accumulate counting statistics. Another (arguable) virtue is that PER is assessed on a per-minute basis, which accounts for the disparity in minutes played across individual players and position types.

Cons: Hollinger himself admits that PER, as a box score-based statistic, fails to account for the type of defense that, while it may not produce a block or a steal, still prevents scoring. Notably, players like Bruce Bowen and Shane Battier, of whom it is often said, “his contribution didn’t necessarily show up in the box score,” may be undervalued by PER and similar stats.

Another minor complaint I might lodge is that PER, with its pace adjustment, per-minute rating, adjustments for team assists, and league rebounding correction, becomes more of a “rating” than a metric. By this, I mean only that while it is straightforward to compare one player’s PER to another’s, the statistic is somewhat decontextualized. What unit is PER measured in? How does it relate to scoring, or scoring prevention, or winning? Is there a direct relationship? Etc.

Finally, the decision to make PER per-minute, rather than per-game or per-season, while it does enable comparisons across players who play different minutes, comes with certain assumptions. Namely, when comparing “low-usage” players against those who play a substantial number of minutes per game, we must assume that efficiency does not vary with usage. In other words, if low-minutes player A appears to be more efficient (according to PER) than high-minutes player B, we must qualify such a comparison by saying explicitly “Player A, in the time he plays, is more efficient than is player B, in the time he plays.” We cannot say, without making additional assumptions about usage versus efficiency, that player A would be as efficient as B if he were playing the same number of minutes as B.

Wins Produced
Background: Berri, calculation

Pros: I think that Berri is essentially correct in his finding that scoring may be overvalued by “laypersons” — it is my subjective observation that scoring numbers are the most often cited by casual fans and media outlets alike in their discussion of the contributions of individual players and their relative value/quality. I have written up a very simplistic game-theoretic model in which, assuming that players want higher salaries (which I think is fairly easy to stipulate), and assuming that high-scoring players achieve higher salaries (which would need some argument and evidence), players have an incentive to eschew “team play” in favor of pursuing a high number of shot attempts. This theoretical argument would support some of the claims Berri makes.

Additionally, I agree with Berri’s use of a regression model to estimate coefficients for the weighting of box score statistics–I have made the same choice, as I will discuss next week. This seems, at least a priori, more methodologically sound than guesstimating values based on a theoretical argument, as Hollinger does (again, please correct me if I am wrong about this last statement).

Cons: There are almost too many counterarguments to the WP methodology to list here. In fact, since it has already been done so well, I will refer you to this topic at the APBRmetrics forum, where a lot of smart, analytically capable people tear into Berri’s work.

To this, I would only add a few specific criticisms: First, Berri’s model weights rebounding extremely heavily–many would say he overweights the value of a rebound–and this leads to findings such as Dennis Rodman being more valuable (per-minute) than Michael Jordan. (Edit: Apparently, Berri modified his methods for the publication of WoW, at which point he identified Jordan as the better of the two. See this post.) Findings such as these have been roundly criticized by essentially everyone, but I am willing to concede at least the theoretical possibility that they are true. My main problem is that the author’s typical response to such criticism has been to refer to the econometric work performed in his book and various articles, claiming objectivity–in other words, Berri is just the messenger, the numbers themselves reveal the actual truth, and the actual truth indicates Rodman > Jordan.

This, to me, appears to be a cop-out. (Be advised that I have not read Berri’s book or articles, seen his regression output, or attempted to replicate his results; as such, my critique should be taken with a large grain of salt.) Others have suggested that Berri’s work fails the “smell test”: that is, its results seem so implausible as to arouse suspicion. The term I would use is that Berri’s model lacks “face validity”; it does not appear to measure what it purports to measure.

Further, Berri’s deflection of responsibility to the regression seems somewhat evasive. It is well known to almost anyone who has performed such analysis that regression models can be fit to support almost any conclusion. I could show you, for example, a case in which the mere inclusion or exclusion of an intercept term in a model changes the coefficients of the other predictors from insignificant to significant. It is not my intention to “pick a fight” with Berri’s analysis, because his work appears very thorough and reasonably well thought-out, and I have not read it. However, passing the buck of responsibility for his results to the regression itself seems somewhat disingenuous.

Conclusion

I am sure there are substantial swaths of existing analytic literature that I have not covered here, as well as numerous names which I have not mentioned. Their exclusion was not intentional, except as I have only limited time to assess and discuss an essentially infinite body of work. I have elected instead to examine some of the more widely-known and used methodologies, with an eye toward familiarizing the “uninitiated” with some of the basics. I would also add the caveat that a wise person takes into account multiple sources of information and various perspectives in making any assessment or decision, and the truly wise see folly in trying to encapsulate the entirety of one player’s value in a single number.

Next week, I plan on introducing a novel value metric, which has some of the strengths and some of the weaknesses embodied in each of the above-discussed measures, and blithely commits the folly of distilling value into a single number. I hope I have made some progress here toward justifying the creation of yet another statistic, and if not, I hope you will indulge me, as that’s exactly what I’ve done.