How Data Science Came to Dominate Fantasy Football

Anecdotes ruled fantasy football conversations. Then the math nerds crashed the party.
Hal Koss
May 21, 2020
Updated: May 22, 2020

In the summer of 2013, I took advice from Matthew Berry, the most popular fantasy football analyst in the world, for the final time.

In my fantasy league that year, I drafted running back Steven Jackson, whose career was in decline, because Berry convinced me he would bounce back with a monster season. To help explain why, Berry told a story on his podcast about how Jackson wouldn’t stay out late or party during the offseason, but he’d eat clean and get plenty of rest in order to stay in tip-top shape for the season ahead.

This offhanded anecdote about Jackson’s monk-like attitude snowballed into an unfalsifiable narrative in my head. Surely, Jackson’s laser focus and chip on his shoulder would give him an edge on the field, helping him exceed statistical expectations. Being privy to this insight, I thought I had been given an edge too. So I took Jackson as the 10th running back in my draft. By season’s end, though, he didn’t even finish in the top 30.

For years, that’s what analysis looked like in most corners of fantasy football. You watched games, you read sports news, you looked at box-score stats, you trusted your gut — then you made predictions about players. Hopefully more right ones than wrong ones.

But in recent years, that’s started to change. Fantasy football is shifting from a sports fan’s hobby to a data nerd’s career. And it’s mostly due to the success of the people who infiltrated the industry with advanced metrics, sophisticated tools and the belief that the numbers tell a more accurate story than the naked eye.

 

The Early Days of Fantasy Football Analysis

To succeed in fantasy football, you have to try to forecast the future. What players will score the most points this season? This week? These are the questions you need to answer. But it’s insanely difficult. Which is why the market of websites, magazines and analysts claiming to offer the most accurate projections is so competitive.

In the early days of fantasy football, “projections were fine,” Josh Hermsmeyer, football writer and analyst, told me. But they weren’t good. “It was a bunch of folks who were very news-driven,” Hermsmeyer said. “It was who was able to collect, synthesize and apply the news in the quickest fashion, and waiver-wire pick-ups — that seemed to be the edge.”

“It was basically looking at player averages and hoping that was going to be predictive of future performance,” said Keith Goldner, vice president of data science at FanDuel. “All that was really around was what you found in the box score.”

Even so, past performance is no guarantee of future results, especially when it’s fallible humans (and biased sports fans, no less) who are trying to separate the signal from the noise. We were grasping at straws.

“People would talk about ridiculous and random things, and elevate them to a level of importance that they did not deserve,” Hermsmeyer said. Pull-quote-worthy tidbits (“This guy’s in the best shape of his life,” “He’s due for a sophomore slump,” “He’s got bad hands,” “He’ll be extra motivated since he’s in a contract year”) would pepper news stories and drastically alter a player’s fantasy draft position, despite the absence of data-based evidence.

Since everyone had access to the same player news and the same basic counting stats (yards, touchdowns, receptions, etc.), in competitive fantasy leagues, sometimes the only edge available to you was luck.

“As sports reporting evolved — you can find out everything, any which way — that gap in base information really tightened up,” said Michael Leone, data scientist at SportsGrid.

“The things that were helping people win were not very good moats,” Hermsmeyer added. “Everyone knows you’ve got to keep up on the news. It’s not an edge anymore.”

So, where do you find that edge? For many, it was hidden, deep in the data.

“The edge in fantasy sports, a lot of times, is taking that data and information and being able to parse out what’s meaningful, what’s not meaningful, and make projections and derive actionable information from that,” Leone said. “I think that’s why it leans more toward math people in recent years.”

 

A scene in "Moneyball" (2011) depicting traditional baseball scouts and a data analyst evaluating players differently.

Toward Data-Driven Analysis

What factors led to fantasy football’s gradual shift from subjective analysis to a more data-driven approach? One is the mainstream acceptance of data in sports decision-making, which started in baseball a few years prior.

In 2002, the Oakland Athletics baseball team took a data-based approach to building a winning team on a shoestring budget. This method, known as Moneyball, helped sports leagues’ front offices embrace the powers of data for personnel decisions and game planning. It was only a matter of time before that way of thinking trickled into fantasy sports as well. “It’s a slow shift as people start seeing the value of this sort of data and its capabilities,” Goldner said.

Baseball, and fantasy baseball by extension, is easier to quantify than football, mainly because of the sheer difference in the size of the data sets: the Major League Baseball regular season is 162 games, while the National Football League’s regular season clocks in at only 16. A smaller sample size typically means harder-to-pinpoint trends and less accurate projections.
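A rough way to see the gap: if each game were an independent draw with the same spread, the uncertainty in a per-game average would shrink with the square root of the number of games. The Python sketch below assumes exactly that, and the per-game standard deviation of 8 fantasy points is a made-up value for illustration only.

```python
import math

def standard_error(per_game_sd: float, games: int) -> float:
    """Standard error of a per-game average over `games` independent games."""
    return per_game_sd / math.sqrt(games)

PER_GAME_SD = 8.0  # hypothetical spread, for illustration only

mlb_se = standard_error(PER_GAME_SD, 162)  # 162-game baseball season
nfl_se = standard_error(PER_GAME_SD, 16)   # 16-game football season
ratio = nfl_se / mlb_se                    # how much noisier the shorter season is

print(f"MLB: {mlb_se:.2f}  NFL: {nfl_se:.2f}  ratio: {ratio:.2f}")
```

Under these simplifying assumptions, a football season average is roughly three times noisier than a baseball one, whatever spread you plug in.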

That didn’t stop fantasy football enthusiasts from trying.

“For probably three years, I did research on just what mattered in fantasy football, what helped you win. I built models to quantify and objectively look at it,” Hermsmeyer said.

A few others on the fringes of the fantasy football community, often armed with data science backgrounds instead of sports writing resumes, did the same.

And they started winning.

“It was a virtuous cycle for the folks who had those abilities,” Hermsmeyer said.

Websites such as Pro Football Focus, numberFire, Football Outsiders, RotoViz, PlayerProfiler and a few others emerged to help assist fantasy football players who wanted deeper insights than what news blurbs and box scores had to offer.

Even so, it was initially difficult for these sites to make deep inroads into the larger fantasy football world, an industry that generates billions of dollars in revenue. Plus, old habits (or entrenched perspectives) die hard.

Established fantasy football personalities took notice of the approach taken by Hermsmeyer and others, but they didn’t take a shine to it. They would say things like, “Well, Josh, it doesn’t really matter if you’re right or not. I mean, this is entertainment,” Hermsmeyer recalled. “And I was like, ‘Yeah, but I want to win my league.’”

The rise of daily fantasy sports (DFS) also helped accelerate the datafication of fantasy football. With DFS, which went mainstream around 2015, the fantasy football landscape became much more popular, more competitive and rife with opportunities to win lots of money.

“As daily fantasy has grown, you’ve seen people who are trying to do this sort of thing for a living,” Goldner said. “That just fosters innovation. As sports betting is legalized, and grows, you’re just going to see more and more innovation and steps forward.”

 

The Data That Gives an Edge

What sort of data are these fantasy analysts getting their hands on that helps them regain that edge — that helps them win?

“The biggest thing in the fantasy football community is everything around usage,” Goldner said. “Almost all fantasy points are: The more you play, the more [points] you’re going to accumulate. So if you can predict usage and evaluate usage, that’s a huge factor.”

Leone views the analysis of usage, or opportunity, as the first wave of data-forward thinking in fantasy football: “One of the biggest leaps that occurred in fantasy football specifically is focusing on opportunities,” he said.

Targets, for example, were perhaps the most immediately useful metric for measuring opportunity — specifically for wide receivers and tight ends. Targets account for how many times a pass was thrown in a player’s direction, whether it was completed or not.

Targets were not always recorded on stat sheets, probably because only completed catches were seen as useful. On the field, maybe so. But that’s not the case when you’re trying to predict the future.

A player might be targeted 10 times in a game, say, but only catch three passes. If you’re just looking at the box score, you wouldn’t know how involved that player was in the offensive game plan; you wouldn’t be aware of his opportunity, or potential, to score.

“The second wave,” Leone said, “is looking at the quality of those opportunities.”

Because if you can properly assess the quality of a player’s opportunity, you just might be able to forecast his future performance.

That’s what Hermsmeyer thinks. He pioneered a metric known as WOPR (pronounced “whopper”), considered by many to be the most predictive statistic currently used for evaluating receivers in fantasy football. The metric is used to determine which down-on-their-luck players people should “buy low” on, because their opportunity speaks louder than their past performance.

WOPR, EXPLAINED

WOPR, or weighted opportunity rating, is a metric used to capture a receiver’s true usage and help predict his future fantasy football performance by combining and properly weighting his target share and air yard share. Target share measures the percentage of all team passes directed at that particular receiver. Air yard share measures the percentage of all team air yards directed at that particular receiver. Air yards are the distance the ball travels in the air past the line of scrimmage to the point of the catch or, on an incompletion, to the intended target.
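As a rough illustration, the combination can be sketched in Python. The weights used here (1.5 on target share, 0.7 on air yard share) are the ones commonly cited for WOPR; they are not stated in this article, so treat them, along with the made-up receiver and team totals, as assumptions.

```python
from dataclasses import dataclass

# Commonly cited WOPR weights; assumed here, not given in the article.
TARGET_SHARE_WEIGHT = 1.5
AIR_YARD_SHARE_WEIGHT = 0.7

@dataclass
class ReceiverGame:
    name: str
    targets: int       # passes thrown his way, caught or not
    air_yards: float   # air yards on those targets

def wopr(player: ReceiverGame, team_targets: int, team_air_yards: float) -> float:
    """Weighted sum of a receiver's target share and air yard share."""
    target_share = player.targets / team_targets
    air_yard_share = player.air_yards / team_air_yards
    return (TARGET_SHARE_WEIGHT * target_share
            + AIR_YARD_SHARE_WEIGHT * air_yard_share)

# Hypothetical game: 10 of the team's 40 targets, 120 of its 300 air yards.
receiver = ReceiverGame("Receiver A", targets=10, air_yards=120.0)
score = wopr(receiver, team_targets=40, team_air_yards=300.0)
print(f"{receiver.name}: WOPR = {score:.3f}")
```

Note that nothing in the calculation depends on how many passes were actually caught, which is the point: it measures opportunity, not results.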

This way of thinking is advantageous for people who compete in DFS. In that format, participants have the ability to pick the same players. So you’ll notice a lot of people using a player who recently had two or three high-scoring games. But since the trick to winning big in DFS is to score points while differentiating your lineup as much as possible, you shouldn’t simply consider who’s played well lately. You should consider who’s underperformed relative to their high opportunity.

That’s where metrics like WOPR come into play. “You take advantage of your opponents who are looking at three bad weeks,” Hermsmeyer said, because “the data’s saying: ‘No, this guy has been getting the targets. He’s a big part of the offense. Just been unlucky. He’s eventually going to have a game.’”

The key to panning for gold, it turns out, is avoiding recency bias.
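One simple way to act on that idea is to screen for players whose recent opportunity is high but whose recent output is low. The Python sketch below is only an illustration: the players, the per-game WOPR and fantasy-point figures, and both cut-offs are hypothetical, and a real screen would tune them against historical data.

```python
from statistics import mean

# Hypothetical three-week samples: (WOPR, fantasy points) per game.
recent_games = {
    "Receiver A": [(0.70, 6.1), (0.66, 4.8), (0.72, 7.0)],
    "Receiver B": [(0.35, 18.2), (0.30, 21.5), (0.33, 16.0)],
    "Receiver C": [(0.68, 19.4), (0.71, 22.3), (0.65, 17.9)],
}

MIN_WOPR = 0.60    # assumed cut-off for "a big part of the offense"
MAX_POINTS = 10.0  # assumed cut-off for "bad weeks"

def buy_low_candidates(games):
    """Return players with high average opportunity but low average output."""
    picks = []
    for name, rows in games.items():
        avg_wopr = mean(w for w, _ in rows)
        avg_points = mean(p for _, p in rows)
        if avg_wopr >= MIN_WOPR and avg_points <= MAX_POINTS:
            picks.append(name)
    return picks

candidates = buy_low_candidates(recent_games)
print(candidates)
```

Here only Receiver A is flagged: Receiver B scored well on limited opportunity, and Receiver C's production already matches his usage.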

 

Getting Granular

To anticipate the season (or weekend) ahead, many of these data-minded fantasy football analysts use tools to sift through years of highly detailed statistics, pull out the most important ones, and use them to construct models and algorithms.

“We break stuff down on a play-by-play level. We want to get to know the players as best as we can,” Goldner said. “Not all yards are created equal. If it’s 3rd and 15, the defense is willing to give up 10 yards. But if it’s 3rd and 1, they don’t want to give up two yards. There are situations where getting two yards can actually be better than giving up 10 yards in terms of what defense is willing to do. So you have to take those sorts of situational analyses into consideration to properly identify who the most efficient players are.”
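A minimal sketch of that situational idea, in Python, is a "success" flag that depends on down and distance rather than raw yardage. The thresholds below follow one common public convention (45 percent of the needed yards on first down, 60 percent on second, all of them on third or fourth); they are an assumption for illustration, not FanDuel's method.

```python
# Share of yards-to-go a play must gain to count as a "success", by down.
# Assumed thresholds from a common public convention, not from the article.
SUCCESS_THRESHOLD = {1: 0.45, 2: 0.60, 3: 1.00, 4: 1.00}

def is_successful(down: int, yards_to_go: int, yards_gained: int) -> bool:
    """True if the gain meets the required share of yards-to-go for this down."""
    return yards_gained >= SUCCESS_THRESHOLD[down] * yards_to_go

# The two situations from the quote above.
ten_on_third_and_15 = is_successful(down=3, yards_to_go=15, yards_gained=10)
two_on_third_and_1 = is_successful(down=3, yards_to_go=1, yards_gained=2)
print(ten_on_third_and_15, two_on_third_and_1)
```

Under this rule the 10-yard gain on 3rd and 15 counts as a failure, while the two-yard gain on 3rd and 1 counts as a success.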

Goldner says his team at FanDuel is building models to predict how players are going to perform, and building tools on top of that. The technologies they use vary, depending on what team members are comfortable using.

“For [DFS], we have optimization algorithms built in Python, because there are some good packages to do that sort of thing, so that you can take the constraints (as far as salary projections and positional roster slots) and feed it into an optimization algorithm to make your performance as good as possible,” he said.
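To make the shape of that problem concrete, here is a toy lineup optimizer in Python. It is a brute-force stand-in for the integer-programming packages a real system would likely use, and everything in it (the player pool, salaries, projections, the simplified roster and the salary cap) is hypothetical.

```python
from itertools import combinations, product

# Hypothetical pool: (name, position, salary, projected points).
PLAYERS = [
    ("QB1", "QB", 8000, 22.0), ("QB2", "QB", 6500, 18.5),
    ("RB1", "RB", 9000, 21.0), ("RB2", "RB", 6000, 14.0),
    ("RB3", "RB", 4500, 10.5),
    ("WR1", "WR", 8500, 19.0), ("WR2", "WR", 7000, 16.5),
    ("WR3", "WR", 5000, 12.0), ("WR4", "WR", 4000, 9.0),
]
ROSTER_SLOTS = {"QB": 1, "RB": 1, "WR": 2}  # assumed simplified roster
SALARY_CAP = 27000                          # assumed cap

def best_lineup(players, slots, cap):
    """Try every roster that fills the slots; keep the best one under the cap."""
    by_pos = {pos: [p for p in players if p[1] == pos] for pos in slots}
    groups = [combinations(by_pos[pos], n) for pos, n in slots.items()]
    best, best_points = None, float("-inf")
    for combo in product(*groups):
        lineup = [p for group in combo for p in group]
        salary = sum(p[2] for p in lineup)
        points = sum(p[3] for p in lineup)
        if salary <= cap and points > best_points:
            best, best_points = lineup, points
    return best, best_points

lineup, points = best_lineup(PLAYERS, ROSTER_SLOTS, SALARY_CAP)
print([p[0] for p in lineup], points)
```

Exhaustive search only works because the pool is tiny. With a full slate of players the number of combinations explodes, which is why dedicated optimization packages are the practical choice.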

Leone, who serves as a data scientist for a sports-betting company, has been spending the offseason building a model to better determine the play-calling tendencies of offenses. He uses a data set to run a similarity score on every play, for every team, to determine the likelihood that a team would pass or run on any play.

“Instead of using a pure run/pass rate, which could be heavily skewed by the context, now we are able to have a metric that is context-adjusted,” Leone said. “As we save more detailed data, there is more opportunity to understand the possible error in projections for fantasy football and try to project ranges of outcomes for players.”
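The article does not describe how Leone's similarity score is computed, so the Python sketch below swaps in a plain nearest-neighbor approach to show what "context-adjusted" can mean: estimate the pass probability for a situation from the most similar past plays. The play history, the three features and the scaling factors are all hypothetical.

```python
import math

# Hypothetical historical plays: (down, yards_to_go, score_margin, was_pass).
HISTORY = [
    (1, 10, 0, False), (1, 10, 3, False), (1, 10, -7, True),
    (2, 8, 0, True), (2, 3, 7, False), (2, 9, -3, True),
    (3, 1, 0, False), (3, 2, 3, False), (3, 12, -7, True),
    (3, 15, 0, True), (3, 9, 7, True), (3, 14, -10, True),
]
# Assumed scales so that no single feature dominates the distance.
SCALES = (1.0, 5.0, 7.0)

def distance(a, b):
    """Scaled Euclidean distance between two (down, to_go, margin) situations."""
    return math.sqrt(sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, SCALES)))

def pass_probability(situation, history, k=3):
    """Share of the k most similar past plays that were passes."""
    ranked = sorted(history, key=lambda play: distance(situation, play[:3]))
    nearest = ranked[:k]
    return sum(play[3] for play in nearest) / k

third_and_long = pass_probability((3, 15, 0), HISTORY)
third_and_short = pass_probability((3, 1, 0), HISTORY)
print(third_and_long, third_and_short)
```

A raw pass rate over this sample would assign the same number to every situation, while the neighbor-based estimate separates 3rd and long from 3rd and short.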

A strictly data-driven approach hasn’t totally overtaken the world of fantasy football analysis — nor should it. “It’s a tool in the decision-making process, as opposed to just the tool,” Goldner said.

Like any new way of thinking that’s introduced into a larger intellectual discourse, it’ll take time before it’s embraced by the establishment. Especially since many fantasy football enthusiasts are insistent that there’s too much qualitative evidence in the game, which numbers alone can’t account for.

And data can only get us so close to predicting the future. We can only get so accurate. At some point, there’s just the unknowable.

“I think the improvements that were made initially were fairly significant, and, at this point, you’re just making tiny incremental improvements,” Goldner said. “But that’s kind of the whole point of analytics in general — if you can improve your success rate by half a percent, that can have a huge impact. That can be the difference between making the playoffs and not.”
