Untangling the NFL Pt. 1: A Brief History of “Value”

What does it mean to have value? For most people, value boils down to describing the importance or worth of something on its own when isolated from other factors. From Karl Marx to Andrei Shleifer, scholars have struggled to define and measure “value” as it pertains to the biggest economic, political, and cultural institutions in the world. In team sports, arguably the largest cultural institution in the world, the same challenge applies. 

Given the billions of dollars annually invested into American football, accurately quantifying a professional athlete’s “value” involves several stakeholders from franchises to players to media organizations to fans. And yet despite the unique prominence of the National Football League, as well as an industry-wide push towards analytics, statisticians have long struggled to effectively assess player value in one number. 

Several factors make this tricky. First is the sheer number of players on the field for each play: 22. Football is also unusually volatile, as within ten seconds a single play can swing from the offense scoring six points to the opposing defense scoring six points instead. But what makes football unique is the difficulty of assigning credit to individuals on inherently entangled plays. When tight end Hunter Henry scores a 31-yard touchdown, he and quarterback Drake Maye receive credit for it in the box score. What isn’t recorded is how wide receiver Demario Douglas made the play possible by blocking two defenders in the same area. Separating the efforts of Maye, Henry, Douglas, and everyone else on that same play has stumped football analysts for years.

Still, with so much at stake, that hasn’t stopped people from trying. Today’s post is the first part of a multi-week research project on distilling player value across multiple positions into a single number. In Part 1, I’ll review recent attempts to calculate player value in American football. We might as well start with the most obvious one.

Approximate Value

In 2008, Doug Drinen, the founder of Pro Football Reference, released the first official version of Approximate Value (AV). Inspired by baseball pioneer Bill James, Drinen wanted a universal football metric that could slot players into rough categories of contribution, from gigantic to negligible. AV starts at the team level (offense, defense, special teams) and then filters down to individuals.

On offense, AV starts with a team’s points per drive relative to league average. Roughly five-elevenths of those points are carved out for blockers (tackles, guards, centers, fullbacks, and tight ends), with shares weighted by games played, position, and accolades such as All-Pro or Pro Bowl nods. The other six-elevenths are distributed among skill players. Rushers split about 22 percent of that pool, adjusted for team run-pass balance, with credit given by rushing yards and efficiency bonuses. Receivers are evaluated by their share of receiving yards, and quarterbacks by passing yards with efficiency adjustments via Adjusted Yards per Attempt.
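The offensive carve-out described above can be sketched in a few lines. This is a simplified illustration of the splits named in the article (5/11 to blockers, 6/11 to skill players, ~22 percent of the skill pool to rushers), not Drinen’s actual formula; the function and parameter names are my own.

```python
def split_offense_av(team_pts_per_drive, league_pts_per_drive, drives):
    """Illustrative sketch of AV's offensive pools.

    Blockers (tackles, guards, centers, fullbacks, tight ends) receive
    roughly 5/11 of the team's offensive value; skill players share the
    remaining 6/11, with rushers splitting about 22% of that pool.
    """
    # Team offensive value relative to league average (simplified).
    team_value = (team_pts_per_drive - league_pts_per_drive) * drives
    blocker_pool = team_value * 5 / 11
    skill_pool = team_value * 6 / 11
    rusher_pool = skill_pool * 0.22  # adjusted for run-pass balance in the real metric
    return blocker_pool, skill_pool, rusher_pool
```

In the real metric, each pool is then divided among individuals by games played, position, yardage shares, and accolades; the sketch stops at the pool level.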

Drinen’s approach was less theoretically grounded on defense, where he admitted to “cooking the books.” When he worked on AV, there were fewer statistics tracked to cleanly split credit between coverage, pass rush, and run defense. As a result, AV instead gives each team a defensive score based on points allowed per drive relative to the league. Two-thirds of that score goes to the front seven, one-third to the secondary, with individual defenders earning points according to games played, starts, and box score stats (sacks, interceptions, fumble recoveries, defensive TDs), with small boosts for Pro Bowls or All-Pro teams.

Special teams are treated more narrowly, with only returners, kickers, and punters evaluated. Returners get one AV point per return touchdown. Kickers are graded by Points Above Average (PAA), which compares their field goal and extra point percentages at various distances to league averages, then scales the result against a fixed league-wide pool of kicker AV. Punters are judged by Adjusted Punt Yards, compared to league average, with arbitrary 13-yard penalties assessed for blocked punts, and their totals similarly scaled to a fixed pool.
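The Points Above Average idea for kickers is simple enough to sketch directly: compare a kicker’s makes in each distance band to what a league-average kicker would have scored on the same attempts. The data layout and league percentages below are hypothetical, and this omits AV’s final scaling against the fixed kicker pool.

```python
def points_above_average(attempts_by_band, league_pct_by_band):
    """Sketch of kicker PAA.

    attempts_by_band: list of (band, made, attempts, point_value) tuples,
    e.g. ("40-49", 8, 10, 3) for field goals or ("xp", 30, 30, 1).
    league_pct_by_band: league-average make rate for each band.
    """
    paa = 0.0
    for band, made, attempts, point_value in attempts_by_band:
        expected_makes = league_pct_by_band[band] * attempts
        paa += (made - expected_makes) * point_value
    return paa
```

A kicker who goes 8-of-10 from 40–49 yards where the league hits 75 percent banks half a make, worth 1.5 points, above average from that band alone.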

Taken together, AV isn’t really about universal player value as much as it is an attempt to separate player value by position. Drinen admitted as much, saying he was more interested in codifying existing domain knowledge than in inventing a grand new truth. For example, AV is built on assumptions that don’t always hold: that an offensive line’s quality matches the offense as a whole; that line play matters equally to run and pass; and that quarterback vs. receiver importance in passing is constant across teams. Box score data also varies wildly across positions: quarterbacks generate dozens of measurable stats, while linebackers may only have tackles and sacks. Furthermore, many of AV’s constants were chosen by trial and error rather than grounded empirical testing. 

PAVing the way (Venkatesh, 2024)

Although AV was a breakthrough for its time, football statistics have changed dramatically since. With richer positional data and the rise of machine learning, analysts can now capture player value at a far more granular level. At the same time though, some have tried to update AV directly, including Dartmouth undergraduate Atul Venkatesh, who proposed Present Approximate Value (PAV).

According to Venkatesh, AV suffers from static positional weights, arbitrary accolade bonuses, and bias toward veterans with long careers rather than younger breakouts. In his words, AV is useful for career résumés, but not for trade value or roster construction. As a result, Venkatesh proposes a newer version of AV that offers a bit more immediate use to general managers and front offices alike. 

His method combines three parts: a Five-Year Value Score (FYVS), a Draft Value Score (DVS), and an age-adjustment method. FYVS scales down AV from past seasons by 20 percent per year to emphasize recency. Meanwhile, DVS reorders draft classes by AV per season to establish expected value at each slot rather than merely the average AV of players selected per drafting position. Lastly comes an age-adjustment that applies exponential decay, cutting a player’s value by 10 percent per year after their sixth season. Together, these components yield PAV, a single figure that approximates how much draft capital a player is worth.
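Two of those components reduce to simple exponential discounting, sketched below. The decay rates come from the article’s description of PAV; the function names are my own, and the full DVS reordering step is omitted.

```python
def five_year_value_score(avs_newest_first):
    """FYVS sketch: discount each past season's AV by 20% per year back,
    so the most recent season counts in full."""
    return sum(av * (0.8 ** years_ago)
               for years_ago, av in enumerate(avs_newest_first))

def age_adjust(value, seasons_played):
    """Age-adjustment sketch: cut value 10% per year beyond season six."""
    seasons_past_six = max(0, seasons_played - 6)
    return value * (0.9 ** seasons_past_six)
```

For example, two straight 10-AV seasons yield an FYVS of 18, and an eight-year veteran keeps 81 percent of that after the age adjustment.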

Venkatesh’s vision for an upgraded AV is practical and smart. It enables trade re-evaluations, comparisons of young players versus veteran stars, and insights into roster construction. But it also has clear holes. The model assumes draft picks hit average outcomes despite wide variance, ignores contracts and salary impact, and ultimately inherits AV’s reliance on limited box score stats. The result is a more polished AV – undeniably an upgrade, yet one that remains in the same family.

Earlier in this piece, I mentioned Bill James, whose sabermetric tradition produced “WAR” (Wins Above Replacement), a mainstay of baseball analytics. Seven years ago, three sports analytics researchers took a swing at making an NFL version of WAR.

Is WAR the Solution?

To develop nflWAR, Ron Yurko, Samuel Ventura, and Maksim Horowitz built on two play-based metrics: Expected Points (EP) and Expected Points Added (EPA). Many readers may recognize EPA as one of the most frequently cited advanced football stats; the short version is that expected points are calculated using play-by-play data since 1999, adjusted for down, distance, and field position, while EPA measures how much a given play changes those expectations.
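The EP-to-EPA relationship can be written as a one-liner: EPA is simply the change in the offense’s expected points from before the snap to after the play. The numbers in the usage example are illustrative, not actual model outputs.

```python
def expected_points_added(ep_before, ep_after):
    """EPA for a single play: how much the play shifted the offense's
    expected points, given down, distance, and field position.
    On a scoring play, ep_after is the points actually scored."""
    return ep_after - ep_before
```

For instance, converting a third-and-short might move a drive from roughly 2.1 to 3.4 expected points, an EPA of +1.3, while a sack on the same down would produce a sharply negative EPA.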

What nflWAR adds is a system for assigning credit to specific players and then translating those credits into Wins Above Replacement. For example, quarterbacks are evaluated on air yards, interceptions, and sacks. Meanwhile, receivers earn value through expected points gained per target and their share of yards after the catch, and rushers are credited with rushing EPA, which compares actual yardage to the expected outcome given context. Furthermore, nflWAR adjusts for situation and context across all skill positions, treating a two-yard gain on third-and-two as more valuable than a five-yard gain on third-and-fifteen. The goal here is to convert plays from a given player into wins above a replacement-level player’s production in that same spot.

The nflWAR model also requires a replacement baseline, which the authors describe as the expected output of a back-of-roster player or common free-agent pickup. With that established, nflWAR can scale performance into wins above replacement level, giving a regularized scale for comparing QBs, receivers, and running backs. Because it’s anchored in plays rather than arbitrary seasonal constants, the framework is fully reproducible with situational data. 

The authors openly acknowledge nflWAR’s limitations. The most obvious ones are that offensive line play is largely unmodeled, and that defense and special teams aren’t included either (though, ironically, I believe defensive players could be modeled using opponent statistics in these areas, with a negative weight to account for lower numbers being better). The authors suggest allowed sack rate as a proxy for future offensive line modeling, and they deliberately omit opponent adjustments, arguing that these would add noise and risk inflating or deflating a player’s contributions.

In many ways, nflWAR represents the first serious attempt to move football beyond AV’s constraints. By borrowing WAR’s logic from baseball, it establishes a structured baseline, isolates contributions by position, and contextualizes them with play-level data. The result is a framework that feels like progress, even if still incomplete. But all of this raises a bigger question: what if the numbers themselves can only take us so far? What if, instead of squeezing plays into formulas, we just trust expert evaluations at face value?

PFF Grades

Here we get into one of the most fun and most frustrating tools for assessing players: Pro Football Focus grades. Some of you are probably already rolling your eyes either because you don’t view them as a real methodology or because you see them as more of a marketing product than an authoritative measurement of player performance. I’m not going to pretend PFF grades are flawless. But there are also a lot of misconceptions about what they are and how they’re best used. 

Grades aren’t meant to be a definitive ranking system of how good players are. Instead, they’re structured subjective estimates of how well a player executed their task on each play. Every play is graded by trained analysts on a scale from -2 to +2: -2 is a disastrous play, +2 is elite execution, and 0 is a neutral outcome. These raw grades are then normalized to a 0–100 scale. As an example, the range from 60 to 80 runs roughly from an average starter to a Pro Bowl caliber player, with anything underneath qualifying as replacement level or worse. As with AV, a cornerback’s score isn’t directly interchangeable with a guard’s, but in either case a high grade indicates sustained above-average per-play performance at that position.
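The two-stage structure above can be illustrated with a toy rescaling: average the per-play grades, then map that average onto the 0–100 scale so a neutral player lands near 60. To be clear, this mapping is purely my invention for illustration; PFF’s actual normalization is proprietary and considerably more involved.

```python
def normalize_grades(play_grades):
    """Toy mapping from per-play grades (-2..+2) to a 0-100 season grade.

    An all-neutral player lands at 60 (roughly an average starter);
    consistently positive plays push toward Pro Bowl territory.
    """
    avg = sum(play_grades) / len(play_grades)
    return max(0.0, min(100.0, 60 + avg * 20))
```

The point of the sketch is structural: the season grade is an aggregation of hundreds of small per-play judgments, not a single holistic score assigned after the fact.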

One benefit of PFF grades is that they capture a level of granularity that box scores don’t. For example, dropped interceptions are counted against a quarterback much like actual interceptions, and deep passes dropped by the receiver still register as positive plays for the quarterback. This level of detail, which PFF’s team tracks outright via statistics like turnover-worthy plays and big-time throws, is absolutely valuable.

A clear limitation of PFF grades comes from their inherent subjectivity. Because graders don’t actually know each team’s playbook in detail or exactly what players are supposed to do on every play, their guesses won’t always attribute the success or failure of a play correctly. The other issue is transparency: even as a premium subscriber to PFF, I only see the final compiled statistics and the game grades that come with them. For a metric built primarily on per-play evaluation, the lack of per-play transparency is frustrating.

The way I see it, PFF grades are not necessarily useful for evaluating the statistical impact of a given player, but combined with the detailed statistics they come with, these grades can shine a spotlight on why a player’s final production may or may not reflect how well they actually fulfilled their responsibilities on the field. They’re best taken as a largely useful representation of the ‘eye test’ – and an important counterbalance to the occasional over-attribution problem in more traditional and advanced statistics. Through four games, EPA per play would tell you that Drake Maye has produced at a Top 5 quarterback level, but PFF grades are more conservative. The truth is likely somewhere in the middle.

What’s Next? 

Although AV, PAV, nflWAR, and PFF grades all emerged at different times, each has a role to play in the construction of a future universal ‘value’ metric. AV took a trial-and-error approach to encapsulating what most football fans already know and assume, while PAV attempted to reposition AV from a roster management perspective.

On the other hand, nflWAR represented a huge conceptual leap by anchoring player value in actual play-by-play data rather than box score end results, but even its authors admit their scope was limited. Lastly, PFF grades, for all their controversy among hardcore football fans, remain one of the only ways that fans can consistently evaluate players at obscure positions where box score data isn’t available.

In 2025, no single number has done for football what WAR did for baseball. But with more advanced metrics, richer positional data, and modern machine learning, a foundation of sorts is finally there. In next week’s piece, I’ll begin my own contribution to this tradition by sketching a personal blueprint for a new value metric in pro football.

Published by EdwinBudding

Anokh Palakurthi is a writer from Boston who is currently pursuing his master’s degree in business analytics at Brandeis University. In addition to writing weekly columns about Super Smash Bros. Melee tournaments, he also loves writing about the NFL, NBA, movies, and music.
