Saturday, November 27, 2010

Power Rankings

I wanted to try my hand at creating a power rankings system. Here is what became of some ideas. The system only takes two things into account: who you played, and what the scoring patterns of that game were. Wins and losses are not factored in, and neither is anything else of the kind. Most importantly, my own biases are not factored in; I just ran the data through the machine. It took a pretty long time to enter all the data by hand, and I am not sure I will be continuing these rankings.

I was mostly motivated by the fact that wins and losses do not truly reflect how well a team has played. I wanted to build some sort of measure that would tell me how competitive a team has been, regardless of wins and losses. I feel that beating a shitty team by 1 point should not be so much more valuable than losing to a great team by 1 point. When you only look at wins and losses, all that nuance is lost, so I have intentionally left wins and losses out of the formulation. Instead, here is my list of the most competitive teams in the NFL, for all games up to week 11 (not including the Thanksgiving Thursday games).

The number you see next to the ranking and team name is a sort of "adjusted winning percentage", which is the final measure by which I have ordered these teams.



32. CAR 0.009
31. BUF 0.079
30. ARI 0.083
29. HOU 0.101
28. MIN 0.117
27. CIN 0.220
26. SF 0.233
25. JAC 0.238
24. DAL 0.244
23. DEN 0.249
22. DET 0.280
21. SEA 0.305
20. TB 0.330
19. MIA 0.393
18. WSH 0.426
17. OAK 0.444
16. SD 0.484
15. NYG 0.547
14. CHI 0.589
13. ATL 0.657
12. KC 0.757
11. NYJ 0.758
10. TEN 0.758
9. CLE 0.792
8. IND 0.800
7. STL 0.826
6. NE 0.856
5. BAL 0.863
4. NO 0.881
3. PIT 0.882
2. PHI 0.921
1. GB 0.977

I will try to explain how the calculations worked.

I wanted to know how competitive a game was without just looking at the final score, by instead looking at the whole duration of the game. Since the point difference only changes when there is a score, you only need to record each score and the time it happened. With that, you can see each lead throughout the game, and by looking at the time difference between scores, you can know how long that lead/deficit was maintained. So for each game, a team was given a score based on its "cumulative lead", which I define as the sum of its lead at each minute of the game (with seconds as fractions).
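The bookkeeping above can be sketched in code. This is a minimal sketch, not the actual machine I used: it assumes one team's scoring log has already been reduced to a sorted list of (minute, lead-after-score) pairs, with deficits as negative leads.

```python
def cumulative_lead(scoring_events, game_length=60.0):
    """Time-weighted sum of a team's lead over the whole game.

    scoring_events: sorted list of (minute, lead_after_score) pairs,
    where lead is the team's point margin (negative when trailing).
    Returns the cumulative lead in point-minutes.
    """
    total = 0.0
    prev_time, lead = 0.0, 0  # games start tied at minute 0
    for minute, new_lead in scoring_events:
        total += lead * (minute - prev_time)  # lead held since last score
        prev_time, lead = minute, new_lead
    total += lead * (game_length - prev_time)  # final lead to the whistle
    return total
```

For example, a team that leads 7-0 from minute 10, is tied from minute 30, and leads by 3 from minute 50 accumulates 7×20 + 3×10 = 170 point-minutes.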

However, I also scaled the leads logarithmically, which means that as a lead increases, its relative value decreases. Try to imagine it like this: the difference between an 8-point lead and a 9-point lead is much bigger than the difference between a 58-point lead and a 59-point lead. Although both pairs differ by only a point, the jump from 8 to 9 is valued much higher. I did this to reflect the actual value of a score in a football game, and consequently to reduce the power of "blowouts". So that additional touchdown late in the 4th to push the lead to 35 is not as valuable as the late touchdown which gives you the lead.
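One way to get that behavior is a signed logarithm. The post does not name the exact scaling function, so `log1p` here is my assumption; any concave function with the same shape would do.

```python
import math

def scaled_lead(lead):
    """Compress a lead logarithmically: each extra point matters less
    the bigger the lead already is. The sign is preserved, so deficits
    are compressed symmetrically."""
    return math.copysign(math.log1p(abs(lead)), lead)
```

With this scaling, going from 8 to 9 is worth about six times as much as going from 58 to 59, even though both are one-point changes.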

So in this way, for each game, each team was given a "raw" score.

Secondly, I wanted to introduce a strength of schedule factor, where the "raw" score was adjusted based on the performance of the opponent in all other games. Admittedly, this could have been much more rigorous, but I am not sure how to go about doing it. The trouble is that "the performance of the opponent in all other games" is itself skewed by the opponents' own strength of schedule, and you get into this weird web of recursion that I was unsure how to tackle. I chose to go only one level deep, but seemingly you can just keep going deeper and deeper into strength of schedule, where you have to look at the previous opponents of the previous opponents of the previous opponents of the opponents. And so on.

Additionally, I encountered the problem of how heavily to weight the strength of schedule. I tried different percentages and eventually settled on a 10 percent skew, such that the "raw" score was multiplied by the "SOS factor" (90% - 100%) to give the "adjusted" score (losses had to be multiplied in a similar but different way). Changing the percentages (for example having the SOS factor run from 50% - 100%) gave slightly different results.
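A sketch of that adjustment, with two assumptions of mine: opponent strength is taken as a number in [0, 1], and since the post only says losses were handled "in a similar but different way", dividing negative scores by the factor is a guess that at least flips the penalty the right direction (a loss to a weak team hurts more than the same loss to a strong one).

```python
def sos_factor(opp_strength, skew=0.10):
    """Map opponent strength in [0, 1] to a multiplier in [1 - skew, 1],
    e.g. 90% for the weakest opponent up to 100% for the strongest."""
    return (1.0 - skew) + skew * opp_strength

def adjusted_score(raw, opp_strength, skew=0.10):
    """Shrink positive raw scores earned against weak opponents; for
    negative raw scores (deficits), divide instead so the same loss
    counts worse against a weak opponent than a strong one."""
    factor = sos_factor(opp_strength, skew)
    return raw * factor if raw >= 0 else raw / factor
```

So a +10 raw score against the weakest possible opponent becomes +9, while a -10 against that same opponent becomes roughly -11.1.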

Finally, I wanted a final score that ran from 0 to 1, so I treated the "adjusted" scores as normally distributed and returned the percentile of each team's score.
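That last step can be sketched with the standard normal CDF (via `math.erf`), assuming each team's "adjusted" score is standardized against the league mean and standard deviation:

```python
import math
from statistics import mean, stdev

def percentile_scores(adjusted):
    """Treat the adjusted scores as draws from a normal distribution
    fit to the league, and return each team's CDF value in (0, 1)."""
    mu, sigma = mean(adjusted), stdev(adjusted)
    return [0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))
            for x in adjusted]
```

A team sitting exactly at the league mean comes out at 0.5; teams well above or below the mean get pushed toward 1 or 0, which matches the spread of the final numbers in the table above.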

I don't think there were many surprises in the Thursday games this week, but I will keep track of how things pan out with my rankings.

#6 NE beat #22 DET
#4 NO beat #24 DAL
#11 NYJ beat #27 CIN

Anybody would have called those, though. Let's see how the Sunday games turn out. If I had a better understanding of how to do these things, I would eventually want to create a system that somehow returns a probability distribution for a future match-up, i.e. a 5 percent chance of Team A winning by 10, a 4 percent chance of winning by 9, and so on. I have no idea how to do that though.


