OFFSEASON PROJECT! Hey, this is The Boy from Rock M Nation. Over at RMN, I've been playing with the 'Beyond the Box Score' idea (calling it that as an ode to the wonderful SBN sabermetrics blog of the same name) as it applies to college football stats.
I've been cranking out play-by-plays for 2007 games as fast as possible--as I post for a Mizzou blog, I've focused on Big 12 teams, Mizzou opponents, and teams with fellow Heisman finalists--but this is a massive undertaking, and I need some volunteers to enter play-by-play for the games I have yet to enter from the 2007 season. They take about 30-45 minutes each, and I'm figuring I'll need about 6-10 volunteers to finish this stuff in a timely enough fashion to do something with it before the 2008 season starts orbiting everybody's attention span, so...actually, how about I just show you why I'm doing this?
It's fair to say that the Bill James Revolution has reshaped the game of baseball over the last 25-30 years (for some franchises, anyway...my Pirates sure haven't paid any attention). I'd go into detail about all the different statistics that have come about due to James and his disciples, but I'm going to be honest here--if you don't know/care about Bill James, then you probably won't care at all about this post.
The typical statline for a hitter hasn't changed much over the years. Chances are, when a guy is at bat, the stats they'll show on TV are Batting Average / HRs / RBIs. However, as thousands of words are spent in deciphering baseball stats each year, it's becoming more and more obvious that those categories--the last one in particular--are pretty worthless in evaluating the quality of a hitter. Who cares about a .300 batting average if it's all singles and it's not complemented by any walks or anything? Who cares about 25 HRs if you strike out 200 times and don't ever get on base otherwise? And who cares about 100 RBIs if it's complemented by a .225 average and only comes about because the guys in front of you are fantastic hitters?
Having a high number of RBIs does say something about your propensity for timely hits, but it says even more about the ability of the folks ahead of you to get on base. If the dude in front of you in the batting order has a .220 on-base %, you're probably not going to get too many RBIs no matter how good a hitter you are. It's the same way with touchdowns, really. Jerome Bettis's final season was great for my fantasy team, but his "2 carries, 3 yards, 2 TDs" lines really didn't contribute much to the team overall. I loved The Bus as much as the next guy, but just about anybody could have come in and plunged in from the 1. Getting the ball to the 1 was the much bigger accomplishment.
(Since I'm a Mizzou fan, you can probably gather that I wasn't all that impressed by Tim Tebow's "20 passing TDs/20 rushing TDs" stat that everybody was in love with, since half his TDs came from a couple of yards out and Chase Daniel could have done the same if he never handed the ball off either. That's true, but in the end I'd have probably voted for Tebow to win the Heisman anyway. You know...if I had a vote...)
I've been entering play-by-play for Big 12 games for the last two seasons, and I've been able to do a lot with the data...enough that I want to be able to do it with the entire country's data. The biggest thing I've accomplished so far is coming up with measures called EqPts (Equivalent Points) and PPP (EqPts per Play), based on the chart below, which shows the average number of points that could be expected from any specific yard line.
As you see, gains in certain areas of the field are worth different amounts. I did the same thing for each down.
So a 5-yard gain on 3rd-and-5 from your opponent's 39 would only be worth about 0.19 points on the first graph, but it's worth a huge 2.24 points on the second one. Stuff like that is interesting to a nerd like me. And here's the most interesting part--averaging these two EqPts figures together, and adding in a measure of turnover costliness (which I cover in more detail in my Beyond the Box Score glossary), ends up doing a pretty good job of approximating how many points were actually scored in a given game.
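For the code-minded, here's a rough sketch of how a per-play value and PPP could be computed. It assumes a play's EqPts is simply the expected points at the spot where the play ends minus the expected points at the spot where it started, which is my reading of the graph and not a formal definition. The curve values in `EXPECTED_POINTS` are made-up placeholders, not the numbers from the chart, and the function names are hypothetical.

```python
from bisect import bisect_left

# HYPOTHETICAL placeholder curve: (yards from the opponent's goal line, expected points).
# These values are invented for illustration; real ones would come from the play-by-play data.
EXPECTED_POINTS = [(0, 7.0), (10, 5.0), (25, 3.5), (50, 2.0), (75, 1.0), (99, 0.3)]


def expected_points(yards_to_goal):
    """Linearly interpolate the placeholder curve at a given field position."""
    if not 0 <= yards_to_goal <= 99:
        raise ValueError("yards_to_goal must be between 0 and 99")
    xs = [x for x, _ in EXPECTED_POINTS]
    i = bisect_left(xs, yards_to_goal)
    if xs[i] == yards_to_goal:
        return EXPECTED_POINTS[i][1]
    (x0, y0), (x1, y1) = EXPECTED_POINTS[i - 1], EXPECTED_POINTS[i]
    return y0 + (y1 - y0) * (yards_to_goal - x0) / (x1 - x0)


def eq_pts(start_yards_to_goal, end_yards_to_goal):
    """Assumed definition: value at the end of the play minus value at the start."""
    return expected_points(end_yards_to_goal) - expected_points(start_yards_to_goal)


def ppp(plays):
    """EqPts per Play: total EqPts divided by the number of plays."""
    return sum(eq_pts(start, end) for start, end in plays) / len(plays)


# A 5-yard gain from the opponent's 39, followed by a play for no gain.
drive = [(39, 34), (34, 34)]
print(eq_pts(39, 34), ppp(drive))
```

A down-dependent version would work the same way, just with one curve per down instead of a single curve.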
For instance...on December 1, Missouri beat Kansas, 36-28. MU's offense accounted for 29.47 EqPts (from the first graph) and 42.98 Down-Dependent EqPts (from the second). That averages out to 36.23. KU had 30.69 EqPts and 33.95 DD-EqPts, averaging 32.32. KU also had 6.83 points worth of turnovers (again, click the link above if you want information on that), and applying half of that to KU's score and half to MU's, you end up with an EqPts final score of MU 38.98, KU 28.91. Pretty damn close. On the same day, OU beat OSU, 49-17. The EqPts score was OU 47.88, OSU 15.65.
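If you want to check the MU-KU arithmetic yourself, here's a small sketch using only the numbers quoted in this post. The function name and the sign convention (KU's turnover cost is split in half, subtracted from KU and added to MU) are my assumptions, inferred from the KU result. With these inputs KU lands at about 28.9, matching the figure above; MU lands at about 39.6, a bit above the 38.98 quoted, so the MU figure presumably reflects an input not listed here (MU's own turnover costs, for example).

```python
def eqpts_score(eq_pts, dd_eq_pts, turnover_adjustment=0.0):
    """Average the two EqPts measures, then apply a turnover adjustment.

    The adjustment's sign convention is an assumption: positive for the team
    that benefits from the opponent's turnovers, negative for the team that
    committed them.
    """
    return (eq_pts + dd_eq_pts) / 2 + turnover_adjustment


# Figures quoted in the post for Missouri vs. Kansas.
ku_turnover_points = 6.83
half = ku_turnover_points / 2

mu_score = eqpts_score(29.47, 42.98, +half)  # MU gains half of KU's turnover cost
ku_score = eqpts_score(30.69, 33.95, -half)  # KU loses the other half

print(f"MU {mu_score:.3f}, KU {ku_score:.3f}")
```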
Of course, it doesn't always calculate closely--the ball's oblong and simply doesn't always bounce the way it's supposed to--but looking at things this way gives you a much better indication of who accounted for a team's points than total yards or TDs or yards per carry.
For now, let's pretend that PPP is the football equivalent of Slugging %. What's the football equivalent of On-Base %? We'll get back to that in Part Two...if anybody's actually interested, anyway.
And if you want to help me enter play-by-plays, either shoot me a response on this thread with your e-mail address, or contact me at BillConnelly1@gmail.com. Thanks!