Sunday, August 12, 2012

Ranking NFL Offensive Lines 2011

Introduction

The offensive line is generally the only position group in football with no reliable statistics to measure the value of its players. There are some superficial stats like sacks allowed or total rushing yardage, but I wanted to build something that depends more directly on the offensive line. This project is an attempt to create such a statistic, at least for run blocking. Pass blocking would need its own measure, and that one would be a lot more difficult to build.

The general idea for the statistic is average rush yardage, counting only the first 5 yards of each run. Any yardage gained beyond that is not counted. Sack yardage (which some scoring systems count as negative rush yardage) is not counted either, since the line is pass blocking on those plays, and it doesn't make sense to include it in a run-blocking statistic.
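A minimal sketch of that calculation, assuming longer runs are counted as 5 yards rather than dropped (which is how the "negative to 5" range in the Analysis section reads). The function name, the `(yards, is_sack)` input shape, and the sample numbers are made up for illustration.

```python
def capped_rush_average(rushes, cap=5):
    """Average yards per rush, counting at most `cap` yards on any one run.

    `rushes` is an iterable of (yards, is_sack) pairs. Sacks are skipped
    entirely because the line is pass blocking on those plays.
    """
    counted = [min(yards, cap) for yards, is_sack in rushes if not is_sack]
    if not counted:
        return None  # no rushing attempts to average
    return sum(counted) / len(counted)


# Hypothetical sample: a 12-yard run counts as 5, the sack is ignored.
sample = [(3, False), (12, False), (-2, False), (-7, True), (5, False)]
print(capped_rush_average(sample))  # (3 + 5 - 2 + 5) / 4 = 2.75
```

Negative runs are left uncapped on purpose, so a stuffed run still pulls the average down.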

In my thinking, this is the best way to measure the effectiveness of the offensive line at run blocking. Of course, linemen often release to the second level, so they sometimes contribute beyond 5 yards, but I thought 5 would be a good cutoff. Anything the running back or the downfield blockers do beyond that point is outside the scope of the offensive line. At least, that's how I imagine it.

Source

To create the statistic, though, you need play-by-play data, which is not freely available as a ready-made dataset anywhere. I'm sure the agencies that do stats for the league have a database of all plays.

When I wanted to learn about web scraping and try it out, this is the project I took on. Using ESPN's play-by-play pages for each game in 2011, I was able to gather information on every play in the NFL last year. Obviously, it would have taken a very long time to copy and paste it all manually, so instead I wrote a program in Python to do it for me. It automatically reads the HTML of each game page and records the information in a couple of tables.
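As a rough illustration of the scraping step, here is a minimal sketch using only Python's standard library. The HTML snippet, the class name `PlayTableParser`, the player name, and the two-column row layout are all assumptions made up for this example. ESPN's real markup is different, and the actual scraper may well work differently.

```python
from html.parser import HTMLParser


class PlayTableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


# Hypothetical page fragment: one row per play (situation, description).
SAMPLE_HTML = """
<table>
  <tr><td>1st and 10 at TEN 20</td><td>(14:56) A.Runner left tackle to TEN 23 for 3 yards.</td></tr>
  <tr><td>2nd and 7 at TEN 23</td><td>(14:20) A.Runner right end to TEN 22 for -1 yards.</td></tr>
</table>
"""

parser = PlayTableParser()
parser.feed(SAMPLE_HTML)
for situation, description in parser.rows:
    print(situation, "|", description)
```

In a real run the string passed to `feed` would come from downloading each game page rather than from a hard-coded sample.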

The program is hosted on Scraperwiki.com, and you can access it here: https://scraperwiki.com/scrapers/espnfootballscoresworking/

If you are familiar with Python, I invite you to play around with it or to build on it. I literally had 0 experience with Python before I started the project.

There are a lot of possibilities for different sorts of measures using a play-by-play database like this, and I will probably use the dataset for other projects. As you may have noticed, though, the play description is copied whole from ESPN, and no parsing is done in Python.

I parsed the play description in Excel to work out what type of play it was, whether it was successful, whether there were penalties, whether they were accepted or declined, and so on. This process required A LOT of trial and error and was a better fit for Excel, where you can see your errors in real time.
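For anyone who would rather stay in Python, the same kind of parsing could be sketched with regular expressions. This is not the Excel logic described above, only an illustration of the idea. The description strings, the gap phrases, the "for N yards" and "for no gain" wording, and the `parse_rush` helper are all assumptions for the example.

```python
import re

# Gap phrases assumed to appear verbatim in a rushing play description.
GAP_RE = re.compile(
    r"\b(left end|left tackle|left guard|up the middle|"
    r"right guard|right tackle|right end)\b"
)
YARDS_RE = re.compile(r"\bfor (-?\d+) yards?\b")


def parse_rush(description):
    """Return gap, yards and a penalty flag for a rush, or None otherwise."""
    gap = GAP_RE.search(description)
    if gap is None or "sacked" in description:
        return None  # not a rushing play under these assumptions
    if "for no gain" in description:
        yards = 0
    else:
        match = YARDS_RE.search(description)
        if match is None:
            return None  # yardage not found, leave the play for manual review
        yards = int(match.group(1))
    return {
        "gap": gap.group(1),
        "yards": yards,
        "penalty": "PENALTY" in description,
    }


# Made-up descriptions in the style of a play-by-play feed.
plays = [
    "(14:56) A.Runner left tackle to TEN 23 for 3 yards (C.Tackler).",
    "(9:12) A.Runner up the middle to TEN 30 for no gain.",
    "(3:40) B.Passer sacked at TEN 12 for -8 yards.",
    "(2:05) B.Passer pass short right to D.Catcher for 7 yards.",
]
for play in plays:
    print(parse_rush(play))
```

Returning `None` for anything that does not match keeps the odd cases visible instead of silently guessing, which matters given how much trial and error the real descriptions need.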

Analysis

The analysis was simple once the play descriptions were parsed. Each rushing attempt was broken down into yardage (from negative values up to a maximum of 5) and "gap", coded as 1) left end, 2) left tackle, 3) left guard, 4) middle, 5) right guard, 6) right tackle, and 7) right end. The gaps are assigned by whoever writes the play descriptions for ESPN.com. I then averaged these rushing attempts for each team, both as a whole and at each "gap".
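The averaging step might look something like the sketch below, assuming the parsed rushes are available as `(gap, yards)` pairs for one team. The function names and the sample rushes are made up, and the cap is applied here in case the yardage has not been capped already.

```python
from collections import defaultdict

# Gap labels in the order used in the text above.
GAPS = ["left end", "left tackle", "left guard", "middle",
        "right guard", "right tackle", "right end"]


def average_by_gap(rushes, cap=5):
    """Capped average yards per rush at each gap that has any attempts."""
    totals = defaultdict(lambda: [0, 0])  # gap -> [capped yards, attempts]
    for gap, yards in rushes:
        totals[gap][0] += min(yards, cap)
        totals[gap][1] += 1
    return {gap: totals[gap][0] / totals[gap][1]
            for gap in GAPS if gap in totals}


def overall_average(rushes, cap=5):
    """Capped average yards per rush over all attempts."""
    capped = [min(yards, cap) for _, yards in rushes]
    return sum(capped) / len(capped) if capped else None


# Hypothetical rushes for one team.
team_rushes = [("left end", 8), ("left end", 2), ("middle", -1),
               ("middle", 3), ("right tackle", 4)]
print(average_by_gap(team_rushes))   # {'left end': 3.5, 'middle': 1.0, 'right tackle': 4.0}
print(overall_average(team_rushes))  # 2.6
```

Note that the overall figure is an average over attempts, not an average of the per-gap averages, so gaps with more carries weigh more.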

Results

Overall run blocking


Please note that the graph is not zero-anchored; it starts at 2.25 to show better contrast.

The Saints OL appears to be the best by far at run blocking. The Panthers separate themselves from the bulk of the league as well. On the opposite end, the Falcons and especially the Titans OLs look pretty bad in this analysis. It is interesting to think that the quality of the offensive line in Tennessee may have contributed as much to Chris Johnson's off year as anything else.

Per Gap Results

In the table below, the top 5 in the league at each gap are highlighted in blue, and the bottom 5 are highlighted in red.


| Team | Overall | Left End | Left Tackle | Left Guard | Center | Right Guard | Right Tackle | Right End |
|---|---|---|---|---|---|---|---|---|
| League Max | 3.184 | 3.72 | 3.70 | 3.50 | 3.10 | 3.40 | 3.41 | 3.83 |
| League Ave. | 2.872 | 3.12 | 2.85 | 2.90 | 2.76 | 2.84 | 2.89 | 2.87 |
| Saints | 3.184 | 3.72 | 3.13 | 3.41 | 2.67 | 3.40 | 2.78 | 2.98 |
| Panthers | 3.102 | 3.69 | 3.18 | 2.71 | 2.73 | 2.86 | 2.95 | 3.83 |
| Patriots | 3.000 | 3.28 | 2.90 | 2.96 | 2.67 | 3.12 | 3.41 | 2.94 |
| Bills | 3.013 | 3.67 | 3.00 | 2.76 | 3.06 | 3.03 | 2.32 | 3.41 |
| Broncos | 3.004 | 3.21 | 2.60 | 3.05 | 2.92 | 2.83 | 3.20 | 3.44 |
| Eagles | 2.986 | 3.61 | 2.57 | 2.35 | 2.83 | 2.70 | 2.90 | 2.91 |
| Steelers | 2.978 | 2.68 | 3.70 | 3.25 | 2.63 | 2.98 | 3.13 | 3.33 |
| Vikings | 2.973 | 3.37 | 2.97 | 2.70 | 2.85 | 3.11 | 3.18 | 2.86 |
| Buccaneers | 2.949 | 2.76 | 2.23 | 3.20 | 2.74 | 3.27 | 2.97 | 3.47 |
| Jets | 2.918 | 2.95 | 2.08 | 3.00 | 3.10 | 3.13 | 3.02 | 2.54 |
| Cowboys | 2.916 | 3.22 | 3.19 | 2.91 | 2.98 | 2.71 | 2.62 | 2.71 |
| Dolphins | 2.904 | 3.23 | 3.11 | 3.05 | 2.81 | 3.28 | 2.17 | 2.56 |
| Jaguars | 2.902 | 2.81 | 3.07 | 3.02 | 2.97 | 2.70 | 2.52 | 2.85 |
| Browns | 2.899 | 2.96 | 3.04 | 2.80 | 2.89 | 2.69 | 3.31 | 2.95 |
| Cardinals | 2.882 | 3.15 | 2.80 | 2.95 | 2.63 | 3.00 | 3.24 | 2.86 |
| Ravens | 2.864 | 3.41 | 2.92 | 2.89 | 2.46 | 2.74 | 3.17 | 2.67 |
| Packers | 2.884 | 2.96 | 2.91 | 2.65 | 2.81 | 3.11 | 3.19 | 2.68 |
| Redskins | 2.870 | 2.81 | 2.52 | 3.26 | 2.71 | 2.96 | 3.04 | 2.97 |
| Texans | 2.863 | 2.67 | 2.79 | 2.76 | 2.82 | 3.32 | 2.87 | 2.89 |
| Bengals | 2.832 | 3.16 | 2.89 | 2.73 | 2.62 | 2.48 | 2.84 | 3.15 |
| Colts | 2.818 | 2.94 | 2.32 | 3.06 | 2.89 | 2.62 | 2.89 | 2.88 |
| Chiefs | 2.817 | 3.32 | 3.43 | 3.00 | 2.52 | 2.40 | 2.78 | 2.74 |
| Rams | 2.805 | 3.50 | 2.76 | 2.76 | 2.42 | 3.13 | 2.87 | 2.88 |
| Seahawks | 2.805 | 2.20 | 3.06 | 2.82 | 3.01 | 2.16 | 2.97 | 2.41 |
| Raiders | 2.800 | 3.08 | 2.94 | 2.35 | 2.71 | 2.48 | 3.21 | 2.61 |
| Bears | 2.773 | 2.73 | 2.95 | 3.09 | 2.86 | 2.41 | 2.56 | 2.82 |
| Chargers | 2.787 | 3.49 | 3.14 | 2.50 | 2.74 | 2.05 | 2.62 | 2.54 |
| Giants | 2.749 | 2.73 | 2.60 | 3.47 | 2.76 | 2.53 | 2.49 | 2.81 |
| 49ers | 2.746 | 2.98 | 3.01 | 2.06 | 2.45 | 2.63 | 2.87 | 3.07 |
| Lions | 2.708 | 3.09 | 2.77 | 3.50 | 2.49 | 3.06 | 2.59 | 2.40 |
| Falcons | 2.599 | 2.88 | 2.25 | 2.90 | 2.39 | 2.20 | 3.00 | 2.65 |
| Titans | 2.506 | 2.42 | 2.62 | 2.68 | 2.45 | 2.54 | 2.27 | 2.61 |
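
The top-5 and bottom-5 groups can be recomputed from the numbers themselves. Here is a small sketch for the Overall column only, with the values copied from the table. The variable names are made up, and the same approach would apply to any gap column.

```python
# Overall capped rushing average per team, copied from the table above.
OVERALL = {
    "Saints": 3.184, "Panthers": 3.102, "Patriots": 3.000, "Bills": 3.013,
    "Broncos": 3.004, "Eagles": 2.986, "Steelers": 2.978, "Vikings": 2.973,
    "Buccaneers": 2.949, "Jets": 2.918, "Cowboys": 2.916, "Dolphins": 2.904,
    "Jaguars": 2.902, "Browns": 2.899, "Cardinals": 2.882, "Ravens": 2.864,
    "Packers": 2.884, "Redskins": 2.870, "Texans": 2.863, "Bengals": 2.832,
    "Colts": 2.818, "Chiefs": 2.817, "Rams": 2.805, "Seahawks": 2.805,
    "Raiders": 2.800, "Bears": 2.773, "Chargers": 2.787, "Giants": 2.749,
    "49ers": 2.746, "Lions": 2.708, "Falcons": 2.599, "Titans": 2.506,
}

# Sort team names from best to worst by their overall value.
ranked = sorted(OVERALL, key=OVERALL.get, reverse=True)
top_five = ranked[:5]
bottom_five = ranked[-5:]

print("Top 5:   ", top_five)
print("Bottom 5:", bottom_five)
```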

2 comments:

  1. A few ideas to get you thinking:
    (1) Did you consider looking at which plays credited DL positions with the tackle, which plays credited an LB, and which plays credit a DB? I'd say an offensive line's ability to block and open up second- and third-level runs would show in how many times the opponents' DTs and DEs make the tackle. That is, forcing LBs to make the tackle means you blocked well, and forcing DBs to make the tackle means you blocked VERY well. This lets you incorporate TE blocking and longer runs.
    (2) Playing off that idea, and looking at the gaps you identified, could you somehow score the plays' "depth" (based on which level the tackle was made) and slot those depths according to gap, to show how much penetration a team gets at each gap? The median depth would probably be 2 every time, but the average depth is likely to be between 1 and 2.

    Replies
    1. I haven't tried that, but it's definitely an interesting idea. I believe that the play description just gives the name of the player. If I create a lookup table with player names and positions, I think I would be able to do it, but it's a huge task. Maybe I will look into it for a future expansion of the idea. Thanks!