Wednesday, September 28, 2011

2011 Power Rankings: Week 3

First the rankings

Rank Team Rating Win - Loss
32 STL 0.014 0-3
31 KAN 0.014 0-3
30 IND 0.036 0-3
29 MIA 0.072 0-3
28 SEA 0.128 1-2
27 ATL 0.143 1-2
26 CHI 0.273 1-2
25 JAC 0.363 1-2
24 CIN 0.373 1-2
23 PHI 0.400 1-2
22 DEN 0.411 1-2
21 CAR 0.424 1-2
20 ARI 0.430 1-2
19 TAM 0.467 2-1
18 CLE 0.479 2-1
17 SDG 0.491 2-1
16 MIN 0.497 0-3
15 NOR 0.509 2-1
14 TEN 0.582 2-1
13 NYJ 0.601 2-1
12 DAL 0.625 2-1
11 PIT 0.647 2-1
10 WAS 0.691 2-1
9 NYG 0.780 2-1
8 BUF 0.794 3-0
7 SFO 0.794 2-1
6 BAL 0.798 2-1
5 OAK 0.898 2-1
4 DET 0.908 3-0
3 HOU 0.922 2-1
2 NWE 0.929 2-1
1 GNB 0.958 3-0

I finally convinced myself to use win/loss data in the ratings, mostly because Minnesota ended up being one of my highest-ranked teams this week (they had another game where they led for most of the game). I added another step between the raw numbers (integral difference in log-score) and the final rating. I used an arctan function (which fits the shape I wanted: sigmoidal with horizontal asymptotes) to transform raw scores into a new number that is sensitive in the middle (most common) range and flattens out toward the asymptotes at either end. Wins get a separate arctan function from losses. Here they are:

Wins get the blue line treatment. The best performances approach 100 and the worst performances approach 40. For losses, the worst performances approach 0 and the best performances approach 60. Since the two opponents in a given game have raw scores of the same magnitude (one being negative), the formulas work out so that the two teams' arctan'd ratings always sum to 100. For that reason, I like to think of them as "win shares". Anyway, I like the way it works, and I like the way the rankings look. From there, as usual, each game score was weighted 90 - 100% by strength of opponent (rolling back self-inclusive strength of opponent score as I explained before). Then finally, that score is normalized between 0 and 1, assuming a normal distribution.
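Here is a minimal Python sketch of the win-share idea and the final 0-1 normalization. The exact curves are not given in the post, so the centers (70 for wins, 30 for losses) are simply the values that produce the stated asymptotes, and the `steepness` constant is a made-up placeholder. The normalization assumes "normalized assuming a normal distribution" means taking the normal CDF of each team's score relative to the league mean and standard deviation.

```python
import math
from statistics import NormalDist, mean, pstdev

def win_share(raw_score, won, steepness=0.05):
    """Map a raw game score to a 0-100 "win share" with an arctan curve.

    Winners land between 40 and 100, losers between 0 and 60.
    `steepness` is a hypothetical tuning constant, not a value from the post.
    """
    center = 70.0 if won else 30.0
    # atan ranges over (-pi/2, pi/2), so this term ranges over (-30, 30).
    return center + (60.0 / math.pi) * math.atan(steepness * raw_score)

def normalize(scores):
    """Convert {team: score} to 0-1 ratings via the normal CDF (assumed reading)."""
    values = list(scores.values())
    dist = NormalDist(mean(values), pstdev(values))
    return {team: dist.cdf(score) for team, score in scores.items()}

# The winner's raw score is +25, so the loser's is -25 (hypothetical numbers).
winner = win_share(25.0, won=True)
loser = win_share(-25.0, won=False)
print(round(winner + loser, 6))  # the two shares of one game add up to 100
```

Because arctan is an odd function, the two arctan terms cancel for any game, which is why the shares always sum to 100 no matter what steepness is used.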

Secondly, like I have done before, I ran the system on last year's games (I only kept data for Weeks 1 - 14 though). I broke down the rating into a few variables, added some others (like East/West travel and North/South travel for the road team), and ran a regression with score differential as the dependent variable. The ultimate goal would be to find a formula to pick against the spread, which is why I am focusing here on point differential and not the binary win/loss.

With this new twist, I got the best results yet. The std. error is as high as in previous tries (~14.6, which is pretty huge once you realize we are talking about point differential), but the variables and the model seem to be more statistically significant than before.

I eventually took out the travel numbers and just used the rating numbers (the travel numbers had not even a hint of statistical significance, and I think one of them went in the wrong direction). I split the remaining rating up into 4 variables: 1) away team's previous performance on the road, 2) home team's previous performance at home, 3) away team's previous performance at home, and 4) home team's previous performance on the road. As has been the case for a while now with these regressions, variable number 4, the home team's previous performance on the road, is the best predictor (highest magnitude and highest significance) of the point differential. Second best is variable 1. In general, something in this data is telling me that the best teams are the ones that can play well on the road. I guess the assumption is that most teams can play well at home, but far fewer play as well on the road.
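The regression itself was run in Excel, and the fitted coefficients are not reproduced here. For anyone who wants to try the same setup outside a spreadsheet, here is a rough Python sketch of an ordinary least squares fit on the four split-rating variables. The function names and the column order are my own assumptions, and the solver (normal equations with Gauss-Jordan elimination) is just one simple way to get the same kind of coefficients.

```python
def fit_ols(rows, margins):
    """Ordinary least squares via the normal equations.

    rows: one [away_on_road, home_at_home, away_at_home, home_on_road]
          list per game (hypothetical column order).
    margins: home final score minus away final score for each game.
    Returns [intercept, b1, b2, b3, b4].
    """
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(d[i] * d[j] for d in design) for j in range(k)]
        + [sum(d[i] * y for d, y in zip(design, margins))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] for i in range(k)]

def predict_margin(coefs, row):
    """Expected margin (home minus away) for one game's four rating inputs."""
    return coefs[0] + sum(c * x for c, x in zip(coefs[1:], row))
```

This only returns coefficients. The standard error and significance figures mentioned above would need extra work (or a statistics package) on top of it.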

Linear regression output
Margin is always the home team's final score minus the away team's final score
Independent variables are split home/away versions of my final rating
The next step was to see the predictions of the formula on last year's data (I know it's kind of circular). I had Excel calculate the "expected margin" or equivalently "predicted spread against the away team". I had long ago collected the closing spreads on each of these games too. So I had Excel compare the two: my predicted spread and Vegas' closing spread. 

The next step was to craft a way to have Excel make a pick: either the road team would cover or not. If you assume a normal distribution around the mean "predicted spread" with std dev. from the regression formula, you can easily get Excel to give you the probability of a cover. 
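The cover probability can be sketched in a few lines of Python instead of Excel. This assumes both the predicted margin and the Vegas line are expressed as home score minus away score (so a home favorite has a positive number), and it uses the ~14.6 standard error mentioned earlier as the default spread of outcomes. The function names are hypothetical.

```python
from statistics import NormalDist

def away_cover_probability(predicted_margin, vegas_margin, std_error=14.6):
    """P(away team covers), with margins expressed as home minus away.

    The away team covers when the actual margin comes in below the Vegas
    margin, so this is the normal CDF evaluated at the Vegas margin.
    """
    return NormalDist(predicted_margin, std_error).cdf(vegas_margin)

def pick(predicted_margin, vegas_margin, std_error=14.6):
    """Return (side, confidence) for the side more likely to cover."""
    p = away_cover_probability(predicted_margin, vegas_margin, std_error)
    return ("away", p) if p > 0.5 else ("home", 1.0 - p)

def record(picks_and_results):
    """Win percentage over (picked_side, covering_side) pairs, skipping pushes."""
    graded = [(p, r) for p, r in picks_and_results if r != "push"]
    wins = sum(1 for p, r in graded if p == r)
    return wins / len(graded)
```

For example, `pick(1.0, 6.5)` says the model expects the home team to win by 1 while Vegas has the home team favored by 6.5, so the sketch leans toward the away team covering.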

From there I calculated the number of correct and incorrect picks. There were 3 pushes on the year, so I discounted those. I also did not run it for weeks 1 - 4, since I assumed that there wasn't enough information for those weeks, nor for weeks 15 - 17, because I didn't have that data.

With that said, the system went a fairly unimpressive 57.3% against the spread. Still not a complete failure though. Looking through the data for patterns, I discovered a couple of things. First, Weeks 11 and 12 were nightmares. The system batted .375 for those weeks. Every other week is above .500. Did something strange happen in those weeks? Secondly, as the confidence of the system's picks went up (the probability of a cover, i.e. how distant it is from 50%), its correctness did NOT increase. My data set might be too small, but I suspect that the relationships here may not actually be linear, which is assumed when using linear regression.

I tweaked the coefficients a little bit, and the winning percentage seems to max out at 59.4%, which feels a lot better than 57. But of course, when tweaking the coefficients in order to get the maximum number of wins, it is the ones closest to 50% that I am moving from losses to wins (transforming the probability from 49% to 51% for example). This further screwed up the idea that the higher confidence picks should work out better. 

Last note is that when picking straight winners, the system went 65.8% correct on its picks.

That's all for now. I will probably use the same formula for next week, and will start publishing predictions (probability of victory/probability of cover) before Week 5.

Thursday, September 22, 2011

2011 Power Rankings: Week 2

Here for methodology

Some oddities, but it will straighten out as the sample size increases.

Rank Team Rating Win - Loss
32 KAN 0.005 0-2
31 SEA 0.023 0-2
30 STL 0.072 0-2
29 MIA 0.100 0-2
28 IND 0.125 0-2
27 TAM 0.142 1-1
26 SDG 0.160 1-1
25 JAC 0.160 1-1
24 ATL 0.280 1-1
23 CIN 0.318 1-1
22 NOR 0.464 1-1
21 TEN 0.490 1-1
20 PIT 0.508 1-1
19 WAS 0.519 2-0
18 DAL 0.556 1-1
17 CLE 0.577 1-1
16 CHI 0.584 1-1
15 DEN 0.595 1-1
14 CAR 0.614 0-2
13 ARI 0.620 1-1
12 BUF 0.714 2-0
11 NYG 0.725 1-1
10 BAL 0.730 1-1
9 NYJ 0.734 2-0
8 PHI 0.773 1-1
7 GNB 0.796 2-0
6 MIN 0.815 0-2
5 OAK 0.823 1-1
4 SFO 0.835 1-1
3 NWE 0.871 2-0
2 DET 0.923 2-0
1 HOU 0.972 2-0

Wednesday, September 14, 2011

2011 Power Rankings: Week 1

Last year's rankings turned out pretty well I guess. I posted 4 times, for results following weeks 11-14.

Week 11
Week 12
Week 13
Week 14

Green Bay's rankings were 1, 1, 1, 2. The rating system emphatically loved this team.
Pittsburgh's rankings were 3, 3, 7, 6.
One reason for the difference may be how much the system liked 2 other NFC teams.
New Orleans ranked 4, 2, 2, 1 (but they were upset by the Seahawks, whom the system regularly ranked low: 21, 25, 26, 26).
Philadelphia ranked 2, 5, 5, 5.
New England (6, 6, 4, 4) and Baltimore (5, 4, 3, 3) were the other AFC front runners.
The Jets (11, 10, 10, 13) and Chicago (14, 13, 12, 17) never really did as well in the system as they did in the playoffs.

It's really only because Green Bay won the Super Bowl that I thought this experiment might be worth continuing this year. Otherwise, I may have given up on the idea.

The results for 2011 Week 1 are:

Rank Team Rating
32 IND 0.029
31 KAN 0.046
30 PIT 0.052
29 ATL 0.105
28 NOR 0.133
27 STL 0.190
26 TEN 0.191
25 SEA 0.219
24 NYJ 0.220
23 MIA 0.241
22 TAM 0.275
21 DEN 0.287
20 SDG 0.350
19 NYG 0.403
18 CLE 0.443
17 ARI 0.465
16 CAR 0.535
15 CIN 0.557
14 WAS 0.597
13 MIN 0.650
12 OAK 0.713
11 DET 0.725
10 NWE 0.759
9 DAL 0.780
8 SFO 0.781
7 JAC 0.809
6 PHI 0.810
5 GNB 0.867
4 CHI 0.895
3 BAL 0.948
2 BUF 0.954
1 HOU 0.971

This is a reflection of nothing but a team's performance in Week 1. Again, it is blind to any human interpretation or bias.

I'll touch up on the methodology a little bit:

The fundamental premise of the rating system is an idea that occurred to me one day: to look at integral difference. Those who have taken calculus may be familiar with the idea, but it is not as complex an idea as anything in calculus. One way that the "integral" is described in calculus is as the area under a curve.

Area under the curve example. y = x^2, between 0 and 2
For a sporting event, my idea was to use time as the x-axis and score as the y-axis.

Each team would have its own "curve" on the graph, and I wanted to evaluate the performance of one of the teams as the difference in the area. That is where the "difference" half of "integral difference" comes from. The team with the larger area performed better (regardless of the final score), and the measure of that difference (how much bigger its area was) would be a measure of its overall performance in the game.
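To make the area idea concrete, here is a small Python sketch. It treats each team's score as a step curve over a 60-minute game, so a scoring play worth p points at minute t adds p times the minutes remaining to that team's area. The game below is hypothetical, and the function names are mine.

```python
def area_under_score(scores, game_length=60.0):
    """Area under one team's cumulative-score step curve.

    `scores` is a list of (minute, points) scoring plays. A play worth p
    points at minute t stays on the board for the remaining
    (game_length - t) minutes.
    """
    return sum(points * (game_length - minute) for minute, points in scores)

def integral_difference(home_scores, away_scores, game_length=60.0):
    """Home area minus away area; positive means the home team had more area."""
    return (area_under_score(home_scores, game_length)
            - area_under_score(away_scores, game_length))

# Hypothetical game: the home team leads most of the way but loses 10-14.
home = [(10, 7), (40, 3)]
away = [(50, 7), (58, 7)]
print(integral_difference(home, away))  # 326.0
```

The home team loses on the scoreboard here, but its area is much larger (410 vs. 84), which is exactly the "regardless of the final score" situation described above.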

Thursday night game Week 1 diagram
One of the tweaks I added to that system was attaching a logarithm to the difference. It's hard to show on the above graph since it is not the "difference in log(score)" but the "log of the difference(score)". I liked the range and sloping of ln^2(Diff + 1). The reason for this, as I also explained in last year's posts, is to make the importance of each extra score diminish as the lead becomes larger. For example, the log-difference between 7 and 14 is greater than the log-difference between 28 and 35.
Relationship between actual difference (x) and my "skewed" difference (y). As the lead becomes larger, its rate of importance decreases. y = ln^2(x + 1)
The results you see for Week 1 above are exactly this measure for each team's performance this week. For all subsequent weeks, the scoring will be slightly different due to the added factor of strength of schedule.
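The skewed difference can be sketched in Python as follows. Two assumptions on my part: ln^2(x + 1) is read as the square of ln(x + 1), and a deficit is treated as the mirror image of a lead (same magnitude, negative sign). The second function integrates the skewed lead over the game, with scoring plays given as (minute, points) where away scores are entered as negative points. That encoding is only a convenience for this example.

```python
import math

def skew(diff):
    """Skewed score difference: ln(|diff| + 1) squared, keeping the sign of diff."""
    magnitude = math.log(abs(diff) + 1) ** 2
    return math.copysign(magnitude, diff)

def integrated_skewed_lead(events, game_length=60.0):
    """Integrate the home team's skewed lead over time.

    events: list of (minute, points); positive points are home scores,
    negative points are away scores (hypothetical encoding).
    """
    total = 0.0
    lead = 0
    previous = 0.0
    for minute, points in sorted(events):
        total += skew(lead) * (minute - previous)  # lead held since last score
        lead += points
        previous = minute
    total += skew(lead) * (game_length - previous)  # final stretch of the game
    return total

# Going from a 7-point lead to 14 adds more than going from 28 to 35.
print(skew(14) - skew(7) > skew(35) - skew(28))  # True
```

Note that with this reading the curve is only flattening once the lead is bigger than a couple of points, which is fine for football since leads of 1 or 2 are rare.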

Last year, I struggled a lot with how exactly to do SOS. I finally resigned myself to going only one team back. This means that I only looked at the previous performance of the current opponent. To be more precise though, those previous performances should also be skewed by the strength of the opponent's previous opponents. I did not do this with last year's system. The circularity becomes kind of difficult to deal with.

After struggling even more than I did last year, I found a system (doable with Excel) that allowed me to do a true SOS system. I implemented what I will call a moving-backwards-SOS-chain. I was able to avoid circularity by considering only past results and not looking into future performance. For this reason, obviously, Week 1 has no SOS factor. The Excel layout I had to configure to do this drove me nuts, and skated around the brink of circularity. However, I am fairly confident that it does what I want it to. Look at the previous opponents of the previous opponents of the previous opponents ...... of the current opponent. Always starting at Week 1 and ending at the current week. I still need to decide on a "skewing weight" for the strength of schedule. Last year, I almost arbitrarily decided on 90-100 %, meaning that the raw score would be multiplied by some number between 0.9 and 1 depending on the strength of opponent scored against.
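The Excel layout is not shown here, so the following Python sketch is only one possible reading of the moving-backwards chain, not the actual spreadsheet logic. It assumes raw game scores are on a 0-100 scale, that an opponent's strength is the average of its already-adjusted scores from earlier weeks divided by 100, and that the weight runs linearly from 0.9 (weakest opponent) to 1.0 (strongest). All names are hypothetical.

```python
def sos_chain(schedule, low=0.9):
    """Adjust raw game scores by opponent strength, using past weeks only.

    schedule: list of weeks; each week is a list of
              (team, opponent, raw_score) tuples with raw_score in 0..100.
    Returns {team: [adjusted score per game]}.
    """
    adjusted = {}
    for week in schedule:
        # Snapshot of strengths built only from games before this week, so
        # nothing in the current week can feed back into itself.
        strength = {
            team: sum(scores) / len(scores) / 100.0
            for team, scores in adjusted.items()
        }
        for team, opponent, raw in week:
            if opponent in strength:
                weight = low + (1.0 - low) * strength[opponent]
            else:
                weight = 1.0  # no history yet (Week 1): no SOS factor
            adjusted.setdefault(team, []).append(raw * weight)
    return adjusted
```

Because each week's strengths are computed from scores that were themselves adjusted in earlier weeks, the opponents of opponents are folded in automatically, all the way back to Week 1, without any circular reference.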

Wednesday, September 7, 2011

College Football Map: Part II

EDIT: If you happen upon this page, please redirect to to see the newer version.

Here is a second version of the College Football map, after taking some input from reddit/r/sports

Cosmetic changes: got rid of gradients, replaced with solid color plus logo. Unfortunately, I couldn't find transparent vector images for some of the team logos, and had to use university logos. Usually it is for the smaller schools. Either way, the cosmetic changes make for a much much better looking map, I admit.

Area changes: A lot of people disagreed with my inclusion of the Dakotas within Minnesota, so I switched that to Nebraska. Again, I chose not to include any FCS teams, so the schools in the Dakotas and Montana did not qualify for the map. I also stretched Oklahoma a bit farther south. Expanded GA Tech, Colorado State, and Michigan. I changed up Indiana and NC a lot. I am still not 100% sure about a lot of the map.

Monday, September 5, 2011

United States of College Football

I ran into this map (Baseball) a while ago on Reddit, and I really liked the idea. I think later, somewhere else I saw an NFL one too. I have wanted to make one for college football for a while now, but I didn't really know how.

Recently, I had started messing around with Inkscape, and I am really liking it so far, as far as a program with which to make maps. This is my first Inkscape map.

I decided to make a map similar to the baseball one above, except with college football teams. It was hard to decide on a cut-off for which teams to include. At first I thought I could just use the 6 BCS conferences. I realized soon that I had to at least include the Mountain West too, otherwise that area would either be empty or some Pac-10 and Big 12 teams would have giant areas (in order to cover the whole map). So I included the MWC.

Then I ran into the problem of Nevada and Idaho. Each state has 2 FBS schools (UNLV & Nevada; Boise St & Idaho), 1 in the MWC, and 1 in the WAC. If I were to include the MWC and not the WAC, it would probably create an inaccurate picture of those states. So I decided to include the WAC too. At this point, there was really no reason not to throw in the MAC, Sun Belt, and C-USA. That is what I did. I went on to include every conference in FBS. Although I didn't count, there should be all 120-some schools on this map. 

I made the map realizing that 1) there is no perfect way to make this map, since there will be a ridiculous amount of overlap, irregular shapes, enclaves, exclaves, etc., and a truly accurate map might look very ugly, and 2) I don't know much about how the different parts of the country watch college football. A lot of the splitting up is done by guesswork based on geography and historic success of the football program. From any travels I have done, I know that most major cities are more or less split if there are 2 programs nearby. You will notice that most major cities are split. LA is split between USC and UCLA (obviously). Chicago is split 3 ways between Northwestern, Illinois, and Notre Dame. SF is split between Stanford and Cal (that one was a little hard to do, shape-wise). NYC is split between UConn and Rutgers (those are the 2 closest schools; Syracuse dominates most of upstate NY). There are many, many other cities that are split.

I also did a gradient coloring for each of the areas. I didn't want to leave names on the map, because with 120-some names, it would have been too cluttered. I hope the colors and location will be enough to identify each of the shapes. I grabbed the colors from each football program's Wikipedia page (where available), using a color dropper Chrome extension. It's a pretty neat tool.

So here is version 1, quite possibly the last version if I don't want to pursue this any further. I hope I will go on to fix this up a little with better information about these distributions across the country.