Tuesday, November 22, 2011

College Football Map: Part III

Newest Version Here

Here is an alternate version, although at this point I am not sure which is more accurate. Big props to r/CFB, which weighed in with some good knowledge and advice on the map. I also used Common Census to guide my decisions where I was unsure, but did not rely on it too heavily. Most of the changes were in Upper Michigan, the Dakotas and Montana, Florida, Texas, Georgia, and North Carolina.

The major change is the inclusion of some FCS teams. I used attendance data to include the most popular FCS teams (including them all would be insane and would clutter up the map). The FCS teams included are: Montana, Montana State, North Dakota State, Georgia Southern, Delaware, Jackson State, Jacksonville State, and Old Dominion. I also added FCS schools set to join FBS in the coming years: UMass, Texas State, and UTSA.

The final change is that I swapped in better logos in some places. SDSU, UNLV, and Utah State get their athletic logos instead of their school logos, and Washington got its more traditional W logo, which I gather is preferred by the fan base. Also, Army didn't actually have its logo on the previous map (it was on a covered layer in the editable file). There might be some other logo swaps I forgot about.

I don't think I'll be making any more edits to the map.

Edit: Added Temple, it seems like I had totally forgotten about them. My bad.

Saturday, November 5, 2011

History of all NFL Franchises

I've had this idea for a while. I originally wanted to do it with baseball teams, because I was surprised that some of them went back so far. I wanted to present the lineages of a league's franchises in timeline form. The hardest part is arranging it in some way that also represents general geographic region.

The only condition for a franchise to be included was that it had played at least one official NFL game. A few franchises only played a handful of games.

The biggest challenge in the arrangement was the merger years between the Cardinals and the Steelers, and the Steelers and the Eagles. It would have been easier if I could have put these franchises right next to each other. I think the way I eventually settled on works okay: since there were few teams during the merger years, the stretched lines don't get in the way of other teams.

To make the image a little nicer, I decided to incorporate changes in team colors and logos. This was a bit harder; I had to rely on Wikipedia and other sites to collect colors and images. When logo changes were too similar, or I didn't have the space, I left them out, so the image isn't truly exhaustive.

I also had to consider what to do with the AFL merger. Do I show the AFL, or only the teams' history within the NFL? If I included the AFL, I really would have had to include other teams' histories outside the NFL too. Eventually I chose to use arrows to show how far the teams stretch outside the NFL. I hope it reads clearly.

It may require zooming in to see the complete details of all the teams. 


Wednesday, November 2, 2011

Who Shot Scott Olsen: An Independent Investigation

I had seen, leaked on various blogs, the name of an officer alleged to have shot Scott Olsen in Oakland. I kept looking for documentation of the claim and there was none. Most likely, it was based on things like height and location. Probably the much bigger factor was that this officer was caught on camera with his name very clearly showing. Among the many, many officers present in videos, this guy had somehow been implicated. I set out to see if I could do my own research and corroborate the findings. If I couldn't, I hoped I would at least find information that would lead to more answers.

The first thing I decided to do was to become familiar with the setting based on watching multiple videos (linked later). Below is a general overview of important elements within a satellite picture of the location.

I actually downloaded the videos I used and was able to blow them up and watch them frame by frame, backwards and forwards, repeatedly, looking for any clues that may help.
The first video is a previous edit, but I used it because it pieces together a couple different sequences and perspectives: DO NOT PAY ATTENTION TO THE COMMENTARY AND EDITORIAL, I only used it for the raw footage.

The video (I will call it Video 1) has gotten a lot of play because it points out a specific police officer (or some law enforcement) who appears to be holding a shotgun (or something similar), backs off, and throws a flash bang grenade into a crowd that comes to the aid of Scott Olsen after he has fallen.
The video fingers the highlighted officer as the offender

I will refer to this specific camera angle as "Camera A" in the future. It is absolutely clear that the grenade was thrown from the police side of the barricade, and it is almost certain that the highlighted officer was the one that threw it. What the video does not show, unfortunately, is why Scott Olsen had fallen. The impact that brought down Olsen is obscured by a moving crowd. Within a few of the frames however, it is possible to spot Olsen falling to the ground.

Let's look at the (almost certain) grenade tosser a little closer, a couple of frames earlier.

Here, if you look closely enough, you can see that he is pointing some sort of long-barreled gun into the crowd. I will call it a shotgun for brevity. The second thing I want to point out is his bare right hand. It may only appear as a faint blob, but after viewing it many times frame by frame, I am certain that the blob is his hand, and that it is bare (or in a beige-colored glove?).

Go back to Video 1 posted above and watch closely between 1:36 and 1:38. As he lowers his shotgun, he lowers his hand with it.

Also, it is important to note that the officers here are all wearing gas masks. Remember that.

There is a second video (Video 2) which was of good service. I will not be linking to it, however, since it clearly shows the name of the accused officer (I will alternately refer to him as Officer B___). I'm sure you can find it if you look hard enough; I just don't want to help in the effort. I don't want to contribute to staining someone's reputation without absolute proof of wrongdoing.

Either way, the video also has some good information in it, and I will use it to highlight some points. The accused officer is pictured below on the right.

The first thing you may notice is that he appears to be very tall. That stands out immediately; he looks at least a foot taller than the officer to his right. Some of the reasoning behind implicating him that I have heard is that his height made him easier to identify.

From the video, we can verify that at some point before (and very likely later during the incident), he was behind the northwest barricade.

It's just an image, don't click play
Here are the things to note from the video. Notice the "eens" of Walgreens on the building in the back, or the Walgreens logo on the window to the far left. This image (and the rest of the video it came from) puts him at the scene.

Second, notice his height, but let's look at it from a different angle.

The picture above is going to look very confusing with all of those lines, so let me explain. I used the lines made by the railing (almost certainly parallel in reality) to find their vanishing point in the image.

Some reading on vanishing points and perspective if you are unfamiliar: Wikipedia Link

Now, if the officers are lined up parallel to the railing (which they almost certainly are), then the line that connects the vanishing point to the top of Officer B___'s head will be parallel to the ground and give us a means by which to compare the heights of the officers, despite the angle. The vertical line is to mark the point where the railing no longer follows a straight line. The officers lined up next to this farther section of the railing cannot be compared, but they may be too distant to matter in this case.
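For anyone who wants to see how the construction works, here is a numerical sketch. All coordinates below are invented for illustration; none are measured from the actual frame.

```python
# Sketch of the vanishing-point height comparison, with invented pixel
# coordinates (image y grows downward). Railing edges that are parallel
# in reality converge at a vanishing point in the image; the line from
# that point through the top of one person's head is parallel to the
# ground, so it can be used to compare heights along the railing.

def line_through(p, q):
    """Line through points p and q, as (a, b, c) with a*x + b*y = c."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    return a, b, a * x1 + b * y1

def intersect(l1, l2):
    """Intersection point of two lines in (a, b, c) form."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

def ref_y(x, p, q):
    """y-coordinate at horizontal position x on the line through p and q."""
    (x1, y1), (x2, y2) = p, q
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

# Two points picked off each railing edge (hypothetical values)
rail_top = line_through((100, 300), (400, 330))
rail_bottom = line_through((100, 360), (400, 375))
vp = intersect(rail_top, rail_bottom)   # vanishing point of the railing

# Reference line from the vanishing point through one head
head_b = (500, 250)                     # hypothetical top of Officer B's head
other_head = (600, 260)                 # hypothetical head of another officer

# Above the reference line (smaller y) means taller, provided both
# officers stand on the same line as the railing.
taller = other_head[1] < ref_y(other_head[0], vp, head_b)
```

With these made-up numbers the railing edges meet at (1300, 420), and the second head sits above the reference line, so that person would read as taller.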

The point I am trying to prove in particular is that Officer B___ is clearly taller than the officers to his left. The right side does not matter as much. Just keep in mind that he does appear taller than the officers lined up semi-immediately to his left.

Also, note that none of the officers are wearing gas masks in this video. Video 2 takes place sometime before Video 1, but it is impossible to know by how much. There was enough time for them all to put on gas masks, so the makeup and order of the line of officers could have changed. I bring this up to emphasize that Video 2 did not take place immediately before Video 1.

The third thing to note is that Officer B___ is wearing black gloves.

Now, if he had enough time to put on a gas mask, he would have had enough time to get a shotgun. Or he may have one holstered somewhere, but it is not apparent in the video.

In fact, during all of Video 2 you notice that none of the officers on the front line have shotguns, or anything that looks like a shotgun.

Now, go back to Video 1, and you will notice that aside from the circled officer, none of the other officers in the front line appear to be holding a shotgun either. You see nightsticks and shields.

So nowhere else on the front line can we find an officer with a shotgun, except for the circled officer. Let's take another look at Video 2.

1 "shotgun" spotted.

2 more "shotguns" spotted.

In fact, as you look through Video 2 (which I didn't link, sorry) and other videos, you see that the "guns" are in the back, not on the front line. The officers on the front line seem to be there more or less as a barrier.

The officers in the back seem to be a different breed altogether. Their uniforms are markedly different.

Let's take one more look at Video 1 to make my final point about Officer B___.

You may not have noticed the first time I put up this image, but look all the way on the right of this picture. There is a markedly tall man.

Now, you can't really do the vanishing point trick here, because the view is nearly head-on to the line; the vanishing point would be somewhere far off your monitor to the left. I could have done it, but at that distance the sensitivity to error becomes very high, and it isn't worth it. Since the angle is closer to perpendicular, you can trust your eye. To me, the officer on the far right looks taller than the circled officer. Both appear to be standing upright, IN THIS FRAME.

Since I argued that Officer B___ had no one taller to his left, it would lead us to believe that he can't be the circled man. There is a chance that the lineup changed in a way that rearranged this, but I would counter that people taller than Officer B___ are not common. The possibility that the circled man is Officer B___ is unlikely.

So who shot Scott Olsen?

Let me take this minute to introduce Video 3, overhead news helicopter footage (in the same orientation as the top image).

Back to the above question:

To investigate that I want to tackle two parameters:
1) Where was the impact on Olsen's body?
2) Which way was he facing at the moment of impact?

With those two, we can at least determine where the shot came from.

Number 1 is easier to answer than number 2.

It appears from images that the impact may have occurred near his left temple; blood is pooled particularly there.

A second clue that corroborates this is the reports of neurological problems that have since arisen, particularly his difficulty speaking. Guardian article

The speech center they are almost certainly referring to is Broca's Area

See the highlighted blue
See the tongue (coronal section)

So I will infer that the impact was made to his left side (inferior frontal lobe, not actually temporal, even if it appears near the temple).

Now to determine where the object that hit him came from, we have to try to figure out which direction he was facing.

Through the different videos that show him before things got out of hand, he is seen in different places, facing different directions, so that does not particularly help.

Looking through the videos, there is no clear picture of him during or right before impact.

There are some clues however, that we can build from.

First is the apparent orientation of his body after he had already fallen.

Olsen appears to be oriented with his head towards Camera A. His backpack, to our right, suggests that he is lying on his side. But how was he standing?

Our second clue comes from a short piece of Video 1. It is visible for only a few frames around 1:18, but I was able to spot Olsen's falling body. It is really hidden in there, in the gaps between people running. It would be really difficult to catch it in the YouTube video, but you can try. If you really want to give it a shot, I would say download the video (using a Chrome/Firefox extension) and view it through video software that allows you to go frame by frame. I use Avidemux, which is free software. You can press the right and left arrow keys to go forward and back one frame at a time.

EDIT: Just decided to include the fall with a slideshow. Look for the falling body between the man in black and the lady in orange.

If the slideshow opens a new window when you click, you can close it and come back here.

With that said, the important thing to note from those few frames is that Olsen's body falls towards the camera, and it appears to fall limp, like a collapsing object under only the force of gravity; he doesn't appear to stumble or stagger. Secondly, my best guess is that he falls face first, mostly because his torso appears black as he falls (the color of his shirt, exposed only on the front).

The next point I want to make is that just because he is on his side on the floor does not mean he was standing that way and fell to the side. It is very possible that his backpack (which at least looks heavy) twisted to one side after he had fallen, and twisted his body to rest on its side.

From here I think there are three possible directions he could have been facing on impact. I will use the direction of Camera A as reference. See the image all the way at the top if you need to set your orientation straight.

1) Looking generally left (as seen by Camera A)
Supported by the fact that that is the direction he is facing on the ground.

1A) If a projectile had caused the impact to his head, it would have come from the direction of Camera A. Rather oddly, he would then have fallen in the direction the impact came from, which seems counter-intuitive. It is not entirely impossible, though: if he fell limp, he fell because of lost consciousness, not because of momentum from the impact. Even then, it would be strange to fall directly to one side when knocked unconscious. It would also mean that the projectile came from an area where, for the most part, protesters were fleeing at that time. It is not impossible that a protester picked up a rock or something and threw it at him, but there is no reason to believe that was the case; no other piece of evidence supports it. I would rate this a non-reliable alternative.

1B) His skull fracture was not due to an impact from a projectile, but rather to impact with the ground. The doctors will have a better handle on this one; it would be impossible for us to tell from the footage. However, 1B is supported by some video evidence, particularly that Olsen began to fall almost immediately after two successive explosions on the side opposite Camera A. If he was blown back by the explosions, he would have been carried towards the camera. The two explosions can be found at 1:18 in Video 1 (very easy to spot) and 0:44 in Video 3; they occur quickly one after the other (look for the big ones). These two explosions can be used to sync up the two videos as well. 1B is disputed by the fact that he appears to fall limply, not with momentum from an impact. This is a possible alternative.

2) Looking generally right (as seen by Camera A)
This is supported by the observation that the impact would then have come from the side opposite his fall. This theory, too, suggests that he was blown over by the impact, which does not seem to be the case. I may be wrong in my interpretation of the video, however; it is only a brief window, but I was convinced of what I saw.

Still, it would mean that the projectile came from an area occupied mostly by protesters (see postscript), or from a tight-angle shot from the far left (south) end of the police line. I find it hard to imagine an officer taking that shot, considering the risk to other officers. This is a somewhat-reliable alternative.

3) Looking generally towards Camera A
Supported by my interpretation of the video, that he falls limply face first towards the camera. It would imply that a projectile struck him coming from the area of the police line.

A closer range impact of a projectile would also fit better with the idea that something hit him hard enough and directly enough to fracture his skull and knock him unconscious.

My non-expert opinion is that he was most likely hit by something with high energy; higher energy than something thrown by hand, more likely something gunpowder-propelled. I am not sure a fall from standing height would provide enough energy to fracture a skull, though that depends on which part of the skull as well. I would also find it more plausible that a closer-range projectile (less time for the initial energy to dissipate) caused that sort of damage. I am not really an expert, though.

The last thing I want to present is the footage of Olsen's fall in Video 3. It starts at around 44 seconds in the left lower area. Again, it happened right after the two consecutive explosions we talked about before. I was able to use the elements from Video 1 to sync up time and space with Video 3. I cut out the frames of interest below. Unfortunately, it is very hard to interpret, and even harder to try to find what hit him. There is a black smoke cloud that moves over the area at the worst possible time, making it even harder to get something out of these images.

If the slideshow opens a new window when you click, you can close it and come back here.

I did take my best guess as to where the projectile may have come from. There is a small flash that may be muzzle fire just at the right moment. I have circled the person who I suspect.

Please note that the footage I used (for Video 3) ran at 3 frames/second, so there is about a 333-millisecond gap between frames. Also note that the slideshow jumps from slide 5 to 15; the black smoke makes it impossible to see anything in frames 6-14. But looking closely at frames 1-5, you can see that his body is already moving in the direction of his fall. My guess is that his fall was initiated before frame 5.

Remember again that Video 3 is in the same orientation as the map posted all the way at the beginning. The diagonal line you see cutting from the lower left corner to the upper middle edge is the barrier that separates the police line. Ignore the horizontal line near the bottom. It was too much trouble to get rid of it and doesn't interfere. It is not from the footage. It is from the software I used to play the video.

No projectile is seen. In other parts of the same video, tear gas canisters are clearly identifiable. I would assume it would be visible in these frames as well. There is also no flash bang after the impact, suggesting that it was not a flash bang grenade either. My understanding is that neither of these (tear gas canister or flash bang grenade) are fired with a long barrel gun. A more likely scenario is that Olsen was hit with a rubber bullet or with a lead shot bean bag.

There is also the earlier scenario, that he was knocked back and unconscious by one of the two consecutive explosions (or an unseen projectile). It is highly unlikely that the explosions fractured his skull from that distance. However, he may have fractured his skull when he hit the ground (as discussed in the scenario where Olsen faces left as seen by Camera A).

The circled individual in my frame-by-frame above seems to be in the same area as the circled individual in Video 1. Based on the location and other circumstances, it is likely that the two circled individuals are the same person.

This is the best interpretation that the evidence allows, although it is not absolutely conclusive.


Most Likely Scenario:
The best interpretation is that the officer circled and implicated in Video 1 (the one who tossed the grenade into the crowd helping Scott Olsen) is the same one who shot Olsen. However, it is very unlikely that the circled officer is the tall Officer B___ implicated in some sources on the internet, or either of the two officers to his left and right, whose names have also been published as possible suspects.

It is unlikely that the circled officer in Video 1 is present in Video 2, unless he is one of the officers in the back. Throughout Video 2, there are officers visible farther in the back with long-barreled guns. My interpretation is that one of these officers (likely from a different department or unit) moved to the front line at some point after projectiles were in the air. For whatever reason, he then fired some sort of projectile at Olsen's head. The projectile is not apparent at all in any of the footage, suggesting that it was either a rubber bullet, a lead-shot bean bag, or some other projectile that A) would not be visible in the video and B) is shot out of a long-barreled gun.

Second Most Likely:
Olsen was knocked over by the second of 2 explosions that occurred near him. The impact of his head hitting the ground caused his skull fracture.

I debated with myself on whether to include this or not. I finally decided that for the sake of the integrity of everything else I talked about, I have to at least mention this as well. I want to appear to have no agenda other than finding the truth. I imagine if anyone looks at the footage extensively and tries to analyze what they see, they will come to similar conclusions as I have, and they will wonder why I didn't mention the following.

From going through all the videos, I am almost certain that fireworks were being fired into the police line from the protesters' side. Worse, they seem to have been deliberately aimed at the police officers. And worse than that, it is very likely (although not absolutely certain) that the fireworks preceded the tear gas canisters and flash bang grenades. You need to think critically about the things you see in the air, and not automatically assume they were fired by the police. With closer inspection, that is the conclusion I came to.

Of course, I do not discount at all the possibility of agents provocateurs. It is a very real possibility. In fact there are already some videos of OWS in Oakland specifically that identify police officers posing as protesters. Youtube Link

Saturday, October 15, 2011

2011 Power Rankings: Week 5

Last week didn't go particularly well, especially against the spread. The predictor was 8-5 picking straight winners, and a coin flip (6-6-1) against the spread. Not very impressive. Here is what happened with the picks:

Away Team Home Team Chance of Away Win (%) Win Outcome Spread against Away Team Chance of Away Cover (%) Cover Outcome
KAN IND 48.2 Correct 2 52 Missed
ARI MIN 48.7 Missed 2.5 54 Correct
PHI BUF 38.7 Missed -3 33 Missed
OAK HOU 35.8 Missed 5.5 46 Missed
NOR CAR 57.5 Correct -6.5 45 Correct
CIN JAC 54.8 Missed 2 59 Missed
TEN PIT 49.9 Missed 3 56 Correct
SEA NYG 27.4 Correct 10 46 Correct
TAM SFO 39.1 Correct 3 45 Correct
NYJ NWE 26.6 Correct 9 43 PUSH
SDG DEN 50.3 Correct -4 42 Missed
GNB ATL 57.3 Correct -6 45 Missed
CHI DET 28.1 Correct 5 37 Correct

Here are the picks for this week, based on Saturday afternoon spreads. Let's try again, though I might have to mess around more with the regression model for predicting against the spread.

Away Team Home Team Chance of Away Win (%) Spread against Away Team Chance of Away Cover (%)
STL GNB 21.2 14 46.0
JAC PIT 37.7 12.5 62.3
PHI WAS 32.1 -3 26.9
SFO DET 38.4 4 46.2
CAR ATL 41.0 4 48.9
IND CIN 36.2 7 49.9
BUF NYG 49.3 3 55.2
HOU BAL 44.9 8 60.7
CLE OAK 40.8 6.5 53.7
DAL NWE 32.9 6.5 45.3
NOR TAM 61.0 -5.5 50.2
MIN CHI 53.5 2.5 58.4
MIA NYJ 44.7 7 58.6

The picks are heavily favoring the home teams this week. Something to keep an eye on perhaps.

Now finally the rankings:

Rank Team Rating Win - Loss Last Week
32 STL 0.021 0-4 32
31 MIA 0.072 0-4 31
30 IND 0.095 0-5 30
29 KAN 0.114 2-3 28
28 DEN 0.134 1-4 26
27 CAR 0.161 1-4 25
26 SEA 0.202 2-3 29
25 JAC 0.207 1-4 27
24 NYJ 0.216 2-3 22
23 ARI 0.252 1-4 18
22 PHI 0.270 1-4 17
21 CLE 0.288 2-2 24
20 TAM 0.296 3-2 16
19 CHI 0.334 2-3 15
18 ATL 0.378 2-3 21
17 CIN 0.461 3-2 19
16 TEN 0.557 3-2 7
15 MIN 0.563 1-4 23
14 NYG 0.570 3-2 8
13 DAL 0.606 2-2 14
12 PIT 0.636 3-2 20
11 OAK 0.723 3-2 12
10 WAS 0.779 3-1 6
9 SDG 0.790 4-1 13
8 NOR 0.818 4-1 11
7 BUF 0.848 4-1 10
6 SFO 0.857 4-1 9
5 BAL 0.858 3-1 4
4 HOU 0.908 3-2 3
3 DET 0.910 5-0 5
2 GNB 0.959 5-0 1
1 NWE 0.978 4-1 2

Just hurried to get this up, will get more into it next week possibly.

Thursday, October 6, 2011

2011 Power Rankings: Week 4

Quickly the rankings:

Rank Team Rating Win - Loss
32 STL 0.011 0-4
31 MIA 0.051 0-4
30 IND 0.068 0-4
29 SEA 0.075 1-3
28 KAN 0.092 1-3
27 JAC 0.204 1-3
26 DEN 0.228 1-3
25 CAR 0.260 1-3
24 CLE 0.272 2-2
23 MIN 0.340 0-4
22 NYJ 0.357 2-2
21 ATL 0.369 2-2
20 PIT 0.412 2-2
19 CIN 0.422 2-2
18 ARI 0.432 1-3
17 PHI 0.442 1-3
16 TAM 0.490 3-1
15 CHI 0.503 2-2
14 DAL 0.630 2-2
13 SDG 0.663 3-1
12 OAK 0.677 2-2
11 NOR 0.712 3-1
10 BUF 0.728 3-1
9 SFO 0.730 3-1
8 NYG 0.745 3-1
7 TEN 0.768 3-1
6 WAS 0.815 3-1
5 DET 0.843 4-0
4 BAL 0.893 3-1
3 HOU 0.952 3-1
2 NWE 0.960 3-1
1 GNB 0.973 4-0
And for the first time, projections (based on Thursday night spreads):
Away Team Home Team Chance of Away Win (%) Spread against Away Team Chance of Away Cover (%)
TEN PIT 49.9 3 56
SEA NYG 27.4 10 46
CIN JAC 54.8 2 59
NOR CAR 57.5 -6.5 45
OAK HOU 35.8 5.5 46
PHI BUF 38.7 -3 33
KAN IND 48.2 2 52
ARI MIN 48.7 2.5 54
TAM SFO 39.1 3 45
NYJ NWE 26.6 9 43
SDG DEN 50.3 -4 42
GNB ATL 57.3 -6 45
CHI DET 28.1 5 37

Wednesday, September 28, 2011

2011 Power Rankings: Week 3

First the rankings

Rank Team Rating Win - Loss
32 STL 0.014 0-3
31 KAN 0.014 0-3
30 IND 0.036 0-3
29 MIA 0.072 0-3
28 SEA 0.128 1-2
27 ATL 0.143 1-2
26 CHI 0.273 1-2
25 JAC 0.363 1-2
24 CIN 0.373 1-2
23 PHI 0.400 1-2
22 DEN 0.411 1-2
21 CAR 0.424 1-2
20 ARI 0.430 1-2
19 TAM 0.467 2-1
18 CLE 0.479 2-1
17 SDG 0.491 2-1
16 MIN 0.497 0-3
15 NOR 0.509 2-1
14 TEN 0.582 2-1
13 NYJ 0.601 2-1
12 DAL 0.625 2-1
11 PIT 0.647 2-1
10 WAS 0.691 2-1
9 NYG 0.780 2-1
8 BUF 0.794 3-0
7 SFO 0.794 2-1
6 BAL 0.798 2-1
5 OAK 0.898 2-1
4 DET 0.908 3-0
3 HOU 0.922 2-1
2 NWE 0.929 2-1
1 GNB 0.958 3-0

I finally convinced myself to use win/loss data in the ratings, mostly because Minnesota ended up as one of my highest-ranked teams this week (they had another game where they led for most of it). I added another step between the raw numbers (integral difference in log-score) and the final rating. I used an arctan function (which fits the shape I wanted: sigmoidal, with horizontal asymptotes) to transform raw scores into a new number that is sensitive in the middle (most common) range and flattens out toward the asymptotes at the ends. Wins get a separate arctan function from losses. Here they are:

Wins get the blue line treatment: the best winning performances approach 100 and the worst approach 40. For losses, the worst performances approach 0 and the best approach 60. Since for a given game the two opponents' raw scores have the same magnitude (one being negative), the formulas work out so that the two teams' (arctan'd) ratings always sum to 100. For that reason, I like to think of them as "win shares". Anyway, I like the way it works, and I like the way the rankings look. From there, as usual, each game score was weighted 90-100% by strength of opponent (rolling back the self-inclusive strength-of-opponent score as I explained before). Finally, that score is normalized between 0 and 1, assuming a normal distribution.
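The pair of arctan curves described above can be sketched as follows. The slope constant K is my own placeholder, since the post doesn't give the exact coefficients; only the asymptotes (40/100 for wins, 0/60 for losses) come from the text.

```python
import math

K = 0.15  # slope constant: a placeholder, the post doesn't give the real value

def win_share_winner(s):
    """Winner's arctan-transformed score for raw game score s.
    Best performances approach 100, worst approach 40."""
    return 70 + (60 / math.pi) * math.atan(K * s)

def win_share_loser(s):
    """Loser's arctan-transformed score for raw game score s.
    Worst performances approach 0, best approach 60."""
    return 30 + (60 / math.pi) * math.atan(K * s)

# In any game the winner's raw score is +s and the loser's is -s, and
# since atan is an odd function, the two shares sum to 100 (hence
# "win shares"), up to floating-point rounding.
s = 4.2
total = win_share_winner(s) + win_share_loser(-s)  # ~100.0
```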

Secondly, as I have done before, I ran the system on last year's games (though I only kept data for Weeks 1-14). I broke the rating down into a few variables, added some others (like east/west and north/south travel for the road team), and ran a regression with score differential as the dependent variable. The ultimate goal is a formula to pick against the spread, which is why I am focusing on point differential rather than binomial win/loss.

With this new twist, I got the best results yet. The std. error is about as high as in previous tries (~14.6, which is pretty huge once you realize we are talking about point differential), but the variables and the model seem to be more statistically significant than before.

I eventually took out the travel numbers and just used the rating numbers (the travel numbers had not even a hint of statistical significance, and I think one of them went in the wrong direction). I split the remaining rating into 4 variables: 1) the away team's previous performance on the road, 2) the home team's previous performance at home, 3) the away team's previous performance at home, and 4) the home team's previous performance on the road. As has been the case for a while now with these regressions, variable 4, the home team's previous performance on the road, is the best predictor (highest magnitude and highest significance) of point differential. Second is variable 1. In general, something in this data is telling me that the best teams are the ones that can play well on the road. I guess the assumption is that most teams can play well at home, but far fewer play as well on the road.

Linear regression output
Margin is always Home final score subtracted by Away final score
Dependent variables are split home/away versions of my final rating
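As a rough illustration of this regression setup (not the actual spreadsheet), here is a minimal ordinary-least-squares fit over the four split home/away rating variables. The game data and coefficients are synthetic, with margins generated from a known formula, so the solver's output can be checked.

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X)b = X'y,
    solved with Gaussian elimination and partial pivoting.
    Each row of X starts with a 1 for the intercept."""
    n = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) for j in range(n)] for i in range(n)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(n)]
    for col in range(n):                      # forward elimination
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * n
    for r in range(n - 1, -1, -1):            # back substitution
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, n))) / A[r][r]
    return coef

# Synthetic games: (away road, home home, away home, home road) ratings.
games = [
    (0.2, 0.5, 0.3, 0.6),
    (0.7, 0.4, 0.6, 0.2),
    (0.5, 0.9, 0.8, 0.7),
    (0.1, 0.3, 0.2, 0.9),
    (0.8, 0.6, 0.5, 0.4),
    (0.4, 0.7, 0.9, 0.5),
]
# Margins (home minus away) generated from invented "true" coefficients,
# so we can verify that the solver recovers them.
true_coef = [3.0, -8.0, 2.0, -1.0, 9.0]
X = [[1.0, *g] for g in games]
y = [sum(c * x for c, x in zip(true_coef, row)) for row in X]

coef = ols(X, y)  # recovers [3.0, -8.0, 2.0, -1.0, 9.0] up to rounding
```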
The next step was to check the formula's predictions on last year's data (I know it's kind of circular). I had Excel calculate the "expected margin", or equivalently the "predicted spread against the away team". I had long ago collected the closing spreads on each of these games too, so I had Excel compare the two: my predicted spread and Vegas' closing spread.

The next step was to have Excel make a pick: either the road team covers or it doesn't. If you assume a normal distribution around the mean "predicted spread", with the std. dev. from the regression formula, Excel can easily give you the probability of a cover.
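Outside of Excel, the same cover-probability step can be sketched in a few lines. This assumes the ~14.6 std. error quoted above and the margin convention from the regression output (home final score minus away final score):

```python
from statistics import NormalDist

STD_ERR = 14.6  # the regression's std. error quoted in the post

def cover_probability(predicted_spread, vegas_spread):
    """Probability that the away team covers, assuming the actual margin
    (home score minus away score) is normally distributed around the
    regression's predicted spread. Both arguments are expressed as the
    spread against the away team."""
    # The away team covers when the actual margin lands below the line.
    return NormalDist(mu=predicted_spread, sigma=STD_ERR).cdf(vegas_spread)

# e.g. a game the model sees as a pick'em, with the away team getting
# 3 points, is only slightly better than a coin flip to cover:
p = cover_probability(0.0, 3.0)
```

The pick is then just whichever side of 50% the probability falls on, which matches how the week's tables read (values above 50 favor the away cover).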

From there I counted the correct and incorrect picks. There were 3 pushes on the year, which I discounted. I also did not run it for Weeks 1-4, since I assumed there wasn't enough information yet, or Weeks 15-17, because I didn't have that data.

With that said, the system went a fairly unimpressive 57.3% against the spread. Still not a complete failure, though. Looking through the data for patterns, I discovered a couple of things. Weeks 11 and 12 were nightmares: the system batted .375 for those weeks, while every other week was above .500. Did something strange happen in those weeks? Secondly, as the confidence of the system's picks went up (i.e., as the probability of a cover moved away from 50%), its correctness did NOT increase. My data set might be too small, but I suspect the relationships here may not actually be linear, which is what linear regression assumes.

I tweaked the coefficients a little, and the winning percentage seems to max out at 59.4%, which feels a lot better than 57. But of course, when tweaking the coefficients to maximize the number of wins, it's the picks closest to 50% that I'm moving from losses to wins (nudging a probability from 49% to 51%, for example). That further undermines the idea that the higher-confidence picks should work out better.

One last note: when picking straight winners, the system got 65.8% of its picks correct.

That's all for now. I'll probably use the same formula next week, and will start publishing predictions (probability of victory / probability of a cover) before Week 5.

Thursday, September 22, 2011

2011 Power Rankings: Week 2

Here for methodology

Some oddities, but it will straighten out as the sample size increases.

Rank Team Rating Win - Loss
32 KAN 0.005 0-2
31 SEA 0.023 0-2
30 STL 0.072 0-2
29 MIA 0.100 0-2
28 IND 0.125 0-2
27 TAM 0.142 1-1
26 SDG 0.160 1-1
25 JAC 0.160 1-1
24 ATL 0.280 1-1
23 CIN 0.318 1-1
22 NOR 0.464 1-1
21 TEN 0.490 1-1
20 PIT 0.508 1-1
19 WAS 0.519 2-0
18 DAL 0.556 1-1
17 CLE 0.577 1-1
16 CHI 0.584 1-1
15 DEN 0.595 1-1
14 CAR 0.614 0-2
13 ARI 0.620 1-1
12 BUF 0.714 2-0
11 NYG 0.725 1-1
10 BAL 0.730 1-1
9 NYJ 0.734 2-0
8 PHI 0.773 1-1
7 GNB 0.796 2-0
6 MIN 0.815 0-2
5 OAK 0.823 1-1
4 SFO 0.835 1-1
3 NWE 0.871 2-0
2 DET 0.923 2-0
1 HOU 0.972 2-0

Wednesday, September 14, 2011

2011 Power Rankings: Week 1

Last year's rankings turned out pretty well I guess. I posted 4 times, for results following weeks 11-14.

Week 11
Week 12
Week 13
Week 14

Green Bay's rankings were 1, 1, 1, 2. The rating system emphatically loved this team.
Pittsburgh's rankings were 3, 3, 7, 6.
One reason for the difference may be how much the system liked 2 other NFC teams.
New Orleans ranked 4, 2, 2, 1 (but they were upset by the Seahawks, which the system regularly ranked low: 21, 25, 26, 26).
Philadelphia ranked 2, 5, 5, 5.
New England (6, 6, 4, 4) and Baltimore (5, 4, 3, 3) were the other AFC front runners.
The Jets (11, 10, 10, 13) and Chicago (14, 13, 12, 17) never really did as well in the system as they did in the playoffs.

It's really only because Green Bay won the Super Bowl that I thought this experiment might be worth continuing this year. Otherwise, I may have given up on the idea.

The results for 2011 Week 1 are:

Rank Team Rating
32 IND 0.029
31 KAN 0.046
30 PIT 0.052
29 ATL 0.105
28 NOR 0.133
27 STL 0.190
26 TEN 0.191
25 SEA 0.219
24 NYJ 0.220
23 MIA 0.241
22 TAM 0.275
21 DEN 0.287
20 SDG 0.350
19 NYG 0.403
18 CLE 0.443
17 ARI 0.465
16 CAR 0.535
15 CIN 0.557
14 WAS 0.597
13 MIN 0.650
12 OAK 0.713
11 DET 0.725
10 NWE 0.759
9 DAL 0.780
8 SFO 0.781
7 JAC 0.809
6 PHI 0.810
5 GNB 0.867
4 CHI 0.895
3 BAL 0.948
2 BUF 0.954
1 HOU 0.971

This reflects nothing but a team's performance in Week 1. Again, it is blind to any human interpretation or bias.

I'll touch on the methodology a little bit:

The most fundamental premise of the rating system came from an idea that occurred to me one day: integral difference. Those who have taken calculus may be familiar with the term, but the idea is not as complex as anything in calculus. One way the "integral" is described in calculus is as the area under a curve.

Area under the curve example. y = x^2, between 0 and 2
For a sporting event, my idea was to use time as the x-axis and score as the y-axis.

Each team would have its own "curve" on the graph, and I wanted to evaluate the performance of one of the teams as the difference in the areas. That is where the "difference" half of "integral difference" comes from. The team with the larger area performed better (regardless of the final score), and the measure of that difference (how much bigger its area was) would be a measure of its overall performance in the game.

Thursday night game Week 1 diagram
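The area difference described above can be computed from each team's scoring timeline, since score-versus-time is a step function. Here is a minimal Python sketch with an invented game (the scoring minutes and points are made up, not from any real box score):

```python
# Integral difference: area under each team's score-vs-time step
# function, then the difference between the two areas.
def area_under_score(events, game_length=60.0):
    """Area under a team's score-vs-time step function.
    events: list of (minute, points_added) pairs, sorted by minute."""
    area, score, last_t = 0.0, 0, 0.0
    for t, pts in events:
        area += score * (t - last_t)  # flat segment before this score
        score += pts
        last_t = t
    area += score * (game_length - last_t)  # hold final score to the end
    return area

# Hypothetical game: home scores early and often, away answers late.
home = [(10, 7), (25, 3), (50, 7)]   # leads 17-14 at the final gun
away = [(40, 7), (55, 7)]

diff = area_under_score(home) - area_under_score(away)
print(diff)  # 350.0: home spent far more of the game ahead
```

Note how the measure rewards the team that led for most of the game, not just the team ahead at the end, which is exactly the point of using the area rather than the final margin.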
One of the tweaks I added to that system was attaching a logarithm to the difference. It's hard to show on the above graph, since it is not the "difference in log(score)" but the "log of the difference in score". I liked the range and sloping of ln^2(Diff + 1). The reason for this, as I also explained in last year's posts, is to make the importance of each extra score diminish as the lead becomes larger. For example, the log-difference between 7 and 14 is greater than the log-difference between 28 and 35.
Relationship between the actual difference (x) and my "skewed" difference (y). As the lead becomes larger, its rate of importance decreases. y = ln^2(x + 1)
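The skew is a one-liner, and the 7-to-14 versus 28-to-35 claim is easy to check directly:

```python
# The skew y = ln^2(x + 1): compresses large leads so each
# additional score matters less than the one before it.
import math

def skewed_diff(raw_diff):
    return math.log(raw_diff + 1) ** 2

# Going from a 7-point lead to a 14-point lead...
early = skewed_diff(14) - skewed_diff(7)
# ...moves the measure more than going from 28 up to 35:
late = skewed_diff(35) - skewed_diff(28)
print(early > late)  # True
```

The +1 inside the log keeps a scoreless tie at exactly zero, since ln(1) = 0.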
The results you see for Week 1 above are exactly this measure of each team's performance this week. For all subsequent weeks, the scoring will differ slightly due to the added factor of strength of schedule.

Last year, I struggled a lot with how exactly to do SOS. I finally resigned myself to going only one team back, meaning I looked only at the previous performance of the current opponent. To be more precise, though, those previous performances should themselves be skewed by the strength of the opponent's previous opponents. I did not do this in last year's system; the circularity becomes difficult to deal with.

After struggling even more than I did last year, I found a system (doable in Excel) that allowed me to do a true SOS system. I implemented what I will call a moving-backwards-SOS chain. I was able to avoid circularity by considering only past results and never looking at future performance; for this reason, obviously, Week 1 has no SOS factor. The Excel layout I had to configure to do this drove me nuts and skated around the brink of circularity, but I am fairly confident it does what I want: look at the previous opponents of the previous opponents of the previous opponents ... of the current opponent, always starting at Week 1 and ending at the current week. I still need to decide on a "skewing weight" for the strength of schedule. Last year, I almost arbitrarily decided on 90-100%, meaning the raw score would be multiplied by some number between 0.9 and 1 depending on the strength of the opponent scored against.
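The backwards chain can be written as a recursion, which may be clearer than the Excel layout. This is a hedged sketch, not the author's actual spreadsheet: the data structures are invented, the raw scores are assumed to sit in [0, 1] like the ratings above, and the 0.9-to-1.0 weight is the one mentioned in the text. Circularity is avoided because an opponent's strength is judged only on weeks strictly before the game in question, bottoming out at Week 1 with no SOS factor.

```python
# Moving-backwards-SOS chain: a team's rating through week `week` is
# the average of its raw game scores, each weighted by the opponent's
# rating through the week BEFORE that game (recursively).
def sos_rating(team, week, games, raw):
    """games[(team, w)] -> that week's opponent;
    raw[(team, w)] -> raw performance score in [0, 1]."""
    total = 0.0
    for w in range(1, week + 1):
        opp = games[(team, w)]
        score = raw[(team, w)]
        if w == 1:
            weight = 1.0  # Week 1: no schedule information exists yet
        else:
            # Opponent judged only on its weeks before this game.
            opp_strength = sos_rating(opp, w - 1, games, raw)
            weight = 0.9 + 0.1 * opp_strength  # maps [0,1] -> [0.9, 1.0]
        total += score * weight
    return total / week

# Toy 4-team league, two weeks of invented results.
games = {('A', 1): 'B', ('B', 1): 'A', ('C', 1): 'D', ('D', 1): 'C',
         ('A', 2): 'C', ('C', 2): 'A', ('B', 2): 'D', ('D', 2): 'B'}
raw = {('A', 1): 0.8, ('B', 1): 0.2, ('C', 1): 0.6, ('D', 1): 0.4,
       ('A', 2): 0.7, ('C', 2): 0.3, ('B', 2): 0.5, ('D', 2): 0.5}

print(sos_rating('A', 2, games, raw))
```

One caveat: written this naively, the recursion re-evaluates the same (team, week) pairs many times and blows up as the season goes on; a real implementation would memoize week-by-week, which is effectively what a column-per-week spreadsheet layout does for free.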

Wednesday, September 7, 2011

College Football Map: Part II

EDIT: If you happen upon this page, please redirect to http://mhermher.blogspot.com/2013/09/college-football-map-iv.html to see the newer version.

Here is a second version of the College Football map, after taking some input from reddit/r/sports

Cosmetic changes: I got rid of the gradients and replaced them with a solid color plus logo. Unfortunately, I couldn't find transparent vector images for some of the team logos and had to use university logos instead, usually for the smaller schools. Either way, I admit the cosmetic changes make for a much better-looking map.

Area changes: A lot of people disagreed with my inclusion of the Dakotas within Minnesota, so I switched them to Nebraska. Again, I chose not to include any FCS teams, so the schools in the Dakotas and Montana did not qualify for the map. I also stretched Oklahoma a bit farther south, expanded GA Tech, Colorado State, and Michigan, and changed up Indiana and NC a lot. I am still not 100% sure about much of the map.

Monday, September 5, 2011

United States of College Football

I ran into this map (Baseball) a while ago on Reddit, and I really liked the idea. I think later, somewhere else I saw an NFL one too. I have wanted to make one for college football for a while now, but I didn't really know how.

Recently, I had started messing around with Inkscape, and I am really liking it so far, as far as a program with which to make maps. This is my first Inkscape map.

I decided to make a map similar to the baseball one above, except with college football teams. It was hard to set a cut-off for which teams to include. At first I thought I could use just the 6 BCS conferences. I soon realized that I had to at least include the Mountain West too; otherwise that area would either be empty, or some Pac-10 and Big 12 teams would have giant areas (in order to cover the whole map). So I included the MWC.

Then I ran into the problem of Nevada and Idaho. Each state has 2 FBS schools (UNLV & Nevada; Boise St & Idaho): 1 in the MWC and 1 in the WAC. Including the MWC but not the WAC would probably paint an inaccurate picture of those states, so I decided to include the WAC too. At that point, there was really no reason not to throw in the MAC, Sun Belt, and C-USA, so that's what I did. I went on to include every conference in FBS. Although I didn't count, all 120-some schools should be on this map.

I made the map realizing that 1) there is no perfect way to make this map, since there will be a ridiculous amount of overlap, irregular shapes, enclaves, exclaves, etc., and a truly accurate map might look very ugly, and 2) I don't know much about how the different parts of the country watch college football. A lot of the splitting up is done by guesswork based on geography and the historic success of each football program. From my travels, I know that most major cities are more or less split if there are 2 programs nearby, and you will notice that most major cities on the map are split. LA is split between USC and UCLA (obviously). Chicago is split 3 ways between Northwestern, Illinois, and Notre Dame. SF is split between Stanford and Cal (that one was a little hard to do, shape-wise). NYC is split between UConn and Rutgers (those are the 2 closest schools; Syracuse dominates most of upstate NY). There are many other cities that are split.

I also did a gradient coloring for each of the areas. I didn't want to put names on the map, because with 120-some names it would have been too cluttered; I hope the colors and locations will be enough to identify each of the shapes. I grabbed the colors from each football program's Wikipedia page (where available), using a color-dropper Chrome extension. It's a pretty neat tool.

So here is version 1, quite possibly the last version if I don't pursue this any further. I hope to go on and fix it up a little with better information about these distributions across the country.