My Ideas
Mostly stuff with numbers and maps.
Tuesday, September 17, 2013
Monday, August 26, 2013
European Club Football Map
FULL SIZE VERSION: Click Here
The map was made with Inkscape using information and crests/logos from Wikipedia. Some of the logos had to be retrieved from Google image searches in order to find images with transparent backgrounds. Team colors were matched to kit colors with the ColorPick Eyedropper Chrome extension. I chose between stripes and gradients on an inconsistent basis. If you would like the SVG file to make your own edits, you can leave a comment below.
Following the effort of the College Football Map that I have posted on this blog, I decided to try my hand with the other football (soccer). I am fairly unfamiliar with the geography and fan spread of these leagues, but I tried my best with what information I could gather. I have been getting more into club soccer in Europe in the last couple of years and this kind of helps position the teams in my mind geographically, so it was a fun project. I am sure there will be plenty of complaints about the map (provided there are people who actually see it), but I am open to making revisions of the map for greater accuracy. I wouldn't even mind adding other leagues, but I would need to do a little more research on that.
Before I continue I should mention that aesthetics were a big decision-making criterion for this map, even at the cost of some accuracy. It is a given that an accurate fan map would become pretty ugly and confusing. I instituted a couple of rules to keep the map prettier (just like on the cfb version):
1) Continuous and regular shapes for each region. No disconnected regions
2) The regions should partition the whole map. No empty spaces or overlaps
3) Exception: smaller teams with no region of their own (only a logo/crest stamp) may create small overlaps
4) Avoid adjacent areas sharing the same background color. Switch to alternate colors if possible.
So yes, I know that it won't be absolutely accurate. An exactly accurate map would either be impossible or look hideous.
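Rule 4 is essentially a map-coloring problem, and a minimal sketch of how it could be automated is below. This is not how the map was made (it was drawn in Inkscape); the function name `pick_colors`, the region names, and the color lists are all hypothetical. Each region gets the first of its kit colors that no already-colored neighbor is using, and falls back to its primary color if every alternate clashes.

```python
def pick_colors(neighbors, palettes):
    """Greedy pass: assign each region its first kit color not used by a colored neighbor."""
    chosen = {}
    # Handle the most crowded regions first, since they have the fewest free choices.
    order = sorted(neighbors, key=lambda r: len(neighbors[r]), reverse=True)
    for region in order:
        taken = {chosen[n] for n in neighbors[region] if n in chosen}
        for color in palettes[region]:
            if color not in taken:
                chosen[region] = color
                break
        else:
            # Every alternate clashes: keep the primary color and accept the clash.
            chosen[region] = palettes[region][0]
    return chosen


# Hypothetical regions and kit colors, for illustration only.
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B"],
}
palettes = {
    "A": ["red", "white"],
    "B": ["red", "black"],
    "C": ["white", "blue"],
}
colors = pick_colors(neighbors, palettes)
```

A greedy pass like this does not guarantee a clash-free result, which matches the "if possible" in the rule.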
As far as inclusion of teams, I followed a two-tiered system.
1) For (physically) larger countries [England/Wales, France, Germany, Spain, Italy], include all teams which have played at the top level at any time within the last five years
2) For (physically) smaller countries [Portugal, Netherlands], include only teams which have played at the top level in all of the last five years.
I added Portugal and the Netherlands last and found it very difficult and muddled to try to fit a lot of teams into the smaller space. Even England and Italy were harder to work with.
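The two-tiered rule above can be written down as a small filter. This is only a sketch: the function name `include_team` is made up, and the five seasons used in the example (labeled by ending year, 2009 through 2013) are an assumption, since the post does not say exactly which seasons were counted.

```python
LARGE = {"England/Wales", "France", "Germany", "Spain", "Italy"}
SMALL = {"Portugal", "Netherlands"}


def include_team(country, top_flight_seasons, last_five):
    """Apply the two-tier rule to one team.

    top_flight_seasons: set of seasons the team spent in the top division.
    last_five: the five seasons being considered (an assumption, see above).
    """
    played = [season in top_flight_seasons for season in last_five]
    if country in LARGE:
        return any(played)  # at least one top-flight season is enough
    if country in SMALL:
        return all(played)  # must have been in the top flight every season
    return False  # league not covered by the map


# Assumed window of seasons, labeled by the year each season ended.
seasons = list(range(2009, 2014))

one_season_spain = include_team("Spain", {2010}, seasons)
four_seasons_portugal = include_team("Portugal", {2009, 2010, 2011, 2012}, seasons)
five_seasons_portugal = include_team("Portugal", set(seasons), seasons)
```

The same team history can pass in a larger country and fail in a smaller one, which is what thins out Portugal and the Netherlands.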
The island in the square is Madeira and belongs to the Portuguese league.
Tuesday, January 15, 2013
NBA Minutes Breakdown for the last Decade+
You may be familiar with an earlier post I made a year ago looking at where NBA playoff round 1 minutes "came from". I decided to expand the idea here, mostly in order to accomplish a couple of things. One, work on Python and web scraping to see if I could get more used to grabbing a whole bunch of data from the internet, and two, see if I could make an interactive SVG. I'd say that it worked out for the most part.
If you would like the raw data or code, just ask in the comments and I can send it to you. I used Python's native IDLE text editor though, so my files are sort of all over the place, and they might not make coherent sense.
Unlike my previous post, here I am just going to deal exclusively with total minutes, rather than the sum-average. With this much data, I figured the effects of injuries would just end up leveling themselves out anyway.
The data collection portion worked out pretty well, and I was a little like that picture of a guy holding onto a bunch of limes, except with numbers. It took a little work to decide which data to use and how. I have here more than a decade's worth of data that I gathered.
By School
So here is a similar chart to what I had last time, looking at total minutes across different schools. Make sure you note that the image below is INTERACTIVE, at least if you are using a "modern" browser. I can't really know what this will look like in older versions of IE or different mobile browsers, etc.
NOTE: Seasons are coded by the year in which they ENDED. So the 2005-2006 season is described above as 2006.
NOTE: The 2011-2012 season was a shortened season (66 games vs. the usual 82), so everyone should be down on average.
That Arkansas-LR you see in the playoffs is basically Derek Fisher single-handedly putting that school in the top 25. Click around and have fun. I should also note that I relied on this guide as well as a bunch of Google searches. I am actually not too familiar with JavaScript, so that was the most painful part of it. Also, if you want to replicate something like this with a lot of data, I would recommend using a program to write the SVG for you rather than doing it by hand (or even with Inkscape). I again used Python to read the data I had gathered and transform it into an SVG.
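To give a rough idea of what "using a program to write the SVG" can look like, here is a minimal sketch. It is not the script behind the chart above: the function name `bar_chart_svg`, the layout numbers, and the sample totals are all made up, and the only interactivity is a hover tooltip from the SVG `<title>` element rather than any JavaScript.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(minutes_by_school, bar_height=18, gap=4, label_width=150, max_bar=400):
    """Return an SVG document (as a string) with one horizontal bar per school."""
    top = max(minutes_by_school.values())
    rows = sorted(minutes_by_school.items(), key=lambda kv: kv[1], reverse=True)
    height = len(rows) * (bar_height + gap)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{label_width + max_bar}" height="{height}">'
    ]
    for i, (school, minutes) in enumerate(rows):
        y = i * (bar_height + gap)
        width = max_bar * minutes / top  # longest bar fills max_bar pixels
        name = escape(school)  # keep "&" and "<" from breaking the markup
        parts.append(
            f'<text x="{label_width - 5}" y="{y + bar_height - 4}" '
            f'text-anchor="end">{name}</text>'
        )
        parts.append(
            f'<rect x="{label_width}" y="{y}" width="{width:.1f}" '
            f'height="{bar_height}" fill="steelblue">'
            f'<title>{name}: {minutes} minutes</title></rect>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up totals, for illustration only.
sample = {"School A": 1200, "School B & Co": 600, "School C": 300}
svg = bar_chart_svg(sample)
```

Writing `svg` to a file with a `.svg` extension and opening it in a browser should be enough to check the result.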
By Conference
Again, I followed up on the same idea as my previous post and also did the numbers by conference. However, I used the CURRENT alignments as of this season of college basketball, e.g. Maryland would still be ACC, etc. Use this ESPN page as a reference if you are confused by all the realignment talk. All non-Division I conferences are grouped together.
By Age
I also had the birth dates, so I tried to see if I could put them to use. Below we have the age distributions of the minutes played by players. I thought it would be interesting to ask whether the age distribution differs between regular season and playoff games. I made a double histogram to compare. Notice that the axes are different.
You can see that there is a clear difference between the two distributions. I could have done a statistical test here to compare them, but it would involve a little reading to see what sort of model is appropriate, so I left it as is.
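For anyone who wants to rebuild a histogram like this, here is a minimal sketch of a minutes-weighted age distribution. The names `age_on` and `minutes_share_by_age` are made up, the sample records are invented, and measuring every player's age on one fixed reference date is an assumption (the post does not say how age was computed). Turning totals into shares puts regular season and playoffs on the same scale, since the raw minute totals are very different.

```python
from collections import Counter
from datetime import date


def age_on(birth_date, on_date):
    """Whole years of age on a given date."""
    years = on_date.year - birth_date.year
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1  # birthday has not happened yet this year
    return years


def minutes_share_by_age(records, reference_date):
    """records: iterable of (birth_date, minutes). Returns {age: share of all minutes}."""
    totals = Counter()
    for birth_date, minutes in records:
        totals[age_on(birth_date, reference_date)] += minutes
    grand_total = sum(totals.values())
    return {age: totals[age] / grand_total for age in sorted(totals)}


# Invented players, for illustration only.
records = [
    (date(1985, 3, 1), 2000),
    (date(1990, 11, 20), 1000),
    (date(1985, 7, 4), 1000),
]
shares = minutes_share_by_age(records, date(2012, 2, 1))
```

Running the same function on regular season and playoff records separately gives two distributions that can be plotted side by side.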
Heat Map
I also wanted to see if I could make a heat map over the US to see where these players came from. I used birthplace (maybe not the most accurate thing), mainly because it was easy to get from the ESPN profile pages of the players. I did not, however, make this map myself. That would have meant a whole lot more time spent learning how to do something I have no idea how to do. Instead I found that Bing (yes, Microsoft does some cool stuff sometimes) has a Heat Map generator. I had to find a way to convert location names to latitude and longitude, but there again I found a good resource and was able to do it in batches. These maps don't look all that great since they pretty much look like population density maps, but there are little nuances in there, in case anyone is interested. I should also note that I hate the idea of By-State Heat Maps, which seem totally pointless, especially if not normalized to population sizes. Anyway, these look ok.
By Time
I also made some line graphs for some of the top schools (and international and high school players). I was originally going to make another interactive SVG with this data, but I got burned out with the process. It involves a lot of trial and error, and I just wanted to get done with this project already. So here they are. Note Kentucky's funky jump in the last season here. It is especially noteworthy when you consider that 2011-2012 (the last year included) was a shortened season. There are some other interesting pieces of information.