Tuesday, January 15, 2013

NBA Minutes Breakdown for the last Decade+

You may be familiar with a post I made a year ago looking at where NBA playoff round 1 minutes "came from". I decided to expand the idea here, mostly to accomplish a couple of things: one, work on Python and web scraping to get more used to grabbing a whole bunch of data from the internet, and two, see if I could make an interactive SVG. I'd say that it worked out for the most part. If you would like the raw data or code, just ask in the comments and I can send it to you. I used Python's native IDLE text editor though, so my files are sort of all over the place and may not make coherent sense.

Unlike my previous post, here I deal exclusively with total minutes rather than the sum-average. With this much data, I figured the effects of injuries would end up leveling themselves out anyway. The data collection portion worked out pretty well, and I was a little like that picture of a guy holding onto a bunch of limes, except with numbers. It took a little work to decide which data to use and how. What I have here is more than a decade's worth of data.

By School

So here is a similar chart to what I had last time, looking at total minutes across different schools. Note that the image below is INTERACTIVE, at least if you are using a "modern" browser. I can't really know what it will look like in older versions of IE or in different mobile browsers.

[Interactive chart: NBA playoff Minutes by College. Axis ticks from 50K to 250K minutes. Schools shown: North Carolina, Duke, Connecticut, Arizona, Kentucky, UCLA, Kansas, Georgia Tech, Florida, Michigan State, Michigan, Georgetown, Alabama, Wake Forest, Maryland, Texas, California, UNLV, Cincinnati, Xavier, Stanford, Utah, Villanova, LSU, Syracuse. Controls: Regular, Playoffs, Total, and each season from 2000 to 2012.]

NOTE: Seasons are coded by the year in which they ENDED, so the 2005-2006 season is described above as 2006.

NOTE: The 2011-2012 season was a shortened season (66 games vs. the usual 82), so everyone should be down on average.

That Arkansas-LR you see in the playoffs is basically Derek Fisher single-handedly putting that school in the top 25. Click around and have fun.

I should also note that I relied on this guide as well as a bunch of Google searches. I am actually not too familiar with JavaScript, so that was the most painful part. Also, if you want to replicate something like this with a lot of data, I would recommend using a program to write the SVG for you rather than doing it by hand (or even with Inkscape). I again used Python to read the data I had gathered and transform it into an SVG.
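To give a rough idea of what "using a program to write the SVG" can look like, here is a minimal sketch. It is not the script behind the chart above: the function name `bar_chart_svg`, the layout numbers, and the example totals are all made up for illustration, and it only draws static bars with none of the interactivity.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(totals, width=600, bar_height=18, gap=6, label_width=150):
    """Return an SVG document (as a string) with one horizontal bar per school.

    `totals` maps a school name to its total minutes. All layout values
    are arbitrary choices for this sketch.
    """
    max_value = max(totals.values(), default=0) or 1  # avoid dividing by zero
    plot_width = width - label_width - 10
    height = len(totals) * (bar_height + gap) + gap
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    # Largest totals first, so the chart reads top to bottom.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    for i, (school, minutes) in enumerate(ranked):
        y = gap + i * (bar_height + gap)
        bar = plot_width * minutes / max_value
        # escape() keeps names such as "Texas A&M" from breaking the XML.
        parts.append(
            f'<text x="{label_width - 5}" y="{y + bar_height - 4}" '
            f'text-anchor="end" font-size="12">{escape(school)}</text>'
        )
        parts.append(
            f'<rect x="{label_width}" y="{y}" width="{bar:.1f}" '
            f'height="{bar_height}" fill="steelblue"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up numbers, for illustration only.
example_totals = {"School A": 250000, "School B": 125000, "Texas A&M": 50000}
svg_text = bar_chart_svg(example_totals)
```

Writing `svg_text` to a file ending in `.svg` gives something a browser can open directly. The point of generating it is that redrawing after a data change is just a matter of rerunning the script.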

By Conference

Again, I followed up on the same idea as my previous post and also did the numbers by conference. However, I used the CURRENT alignments as of this season of college basketball, e.g., Maryland would still be ACC. Use this ESPN page as a reference if you are confused by all the realignment talk. All non-Division I conferences are grouped together.

[Interactive chart: NBA playoff Minutes by Conference. Black: Total Minutes. Red: Adjusted to a 10-member conference. Axis ticks from 200K to 1000K minutes. Conferences shown: ACC, Big East, Pac-12, SEC, Big Ten, Big 12, Atlantic 10, Mountain West, C-USA, Non-D1. Controls: Regular, Playoffs, Total, and each season from 2000 to 2012.]
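The "adjusted to a 10-member conference" figure can be read as a simple per-member scaling. The sketch below assumes that reading (total minutes divided by the number of member schools, times 10); the exact formula is not spelled out in this post, and the function name and numbers are hypothetical.

```python
def adjust_to_ten_members(total_minutes, member_count, target_size=10):
    """Scale a conference total to what a `target_size`-member conference
    would have at the same per-school average.

    Assumption: a plain linear scaling. The chart may have used something else.
    """
    if member_count <= 0:
        raise ValueError("member_count must be positive")
    return total_minutes * target_size / member_count


# Made-up example: a 15-member conference with 900,000 total minutes
# is credited with 600,000 minutes on a 10-member basis.
adjusted = adjust_to_ten_members(900000, 15)
```

This is why a large conference can lead in total minutes (black) but fall back once the size difference is taken out (red).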

By Age

I also had the birth dates, so I tried to put them to use. Below we have the age distributions of the minutes played. I thought it would be interesting to ask whether the age distribution differs between regular season and playoff games, so I made a double histogram to compare. Notice that the axes are different.
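For anyone who wants to build a similar histogram, here is a minimal sketch of the binning step. It assumes each row is a (birth date, minutes) pair and that age is measured in whole years at a single reference date per season. The reference date, the function names, and the sample rows are all made up, not taken from my data.

```python
from collections import defaultdict
from datetime import date


def age_on(birth_date, reference_date):
    """Whole-year age on `reference_date`."""
    years = reference_date.year - birth_date.year
    # Subtract one if the birthday has not happened yet that year.
    if (reference_date.month, reference_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def minutes_by_age(rows, reference_date):
    """Sum minutes per whole-year age. `rows` is an iterable of (birth_date, minutes)."""
    totals = defaultdict(int)
    for birth_date, minutes in rows:
        totals[age_on(birth_date, reference_date)] += minutes
    return dict(totals)


def as_shares(totals):
    """Convert minute totals to fractions of the grand total."""
    grand_total = sum(totals.values())
    return {age: minutes / grand_total for age, minutes in totals.items()}


# Made-up rows, for illustration only.
rows = [
    (date(1980, 6, 1), 2000),
    (date(1980, 1, 15), 1000),
    (date(1985, 3, 10), 1000),
]
totals = minutes_by_age(rows, reference_date=date(2006, 2, 1))
shares = as_shares(totals)
```

Converting to shares is one way around the different axes: regular season and playoff totals differ hugely in scale, but as fractions of their own totals the two distributions can be compared directly.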

You can see that there is a clear difference between the two distributions. I could have run a statistical test here to compare them, but it would involve a little reading to see what sort of model is appropriate, so I left it as is.

Heat Map

I also wanted to see if I could make a heat map over the US to see where these players came from. I used birth place (maybe not the most accurate thing), mainly because it was easy to get from the players' ESPN profile pages. I did not, however, make this map myself. That would have meant a whole lot more time spent learning how to do something I have no idea how to do. Instead I found that Bing (yes, Microsoft does some cool stuff sometimes) has a Heat Map generator. I had to find a way to convert location names to latitude and longitude, but there again I found a good resource and was able to do it in batches. These maps don't look all that great since they pretty much look like population density maps, but there are little nuances in there, in case anyone is interested. I should also note that I hate the idea of By-State Heat Maps, which seem totally pointless, especially if not normalized to population size. Anyway, these look ok.

By Time

I also made some line graphs for some of the top schools (and for international and high school players). I was originally going to make another interactive SVG with this data, but I got burned out with the process. It involves a lot of trial and error, and I just wanted to get done with this project already. So here they are. Note Kentucky's funky jump in the last season. When you consider that 2011-2012 (the last year included) was a shortened season, that jump is especially noteworthy. There are some other interesting pieces of information.

1 comment:

  1. Nice work!

    If you're interested in making more interactive svgs, I'd suggest checking out d3.js
