Thinking Probabilistically

Xandamere

When we think about how some future “thing” is going to play out, human nature is to try to pick an exact outcome. This is a normal human tendency and it applies everywhere in our lives (not just in DFS), but it’s a tendency we should work to overcome in order to be more effective thinkers, predictors, and yes, DFS players.

If you want an easy example of how this plays out outside of DFS, just watch CNBC for a while. You’re bound to see some financial analysts on there who give specific predictions of where they think the market will end up by the end of the year – “we see the S&P ending 2021 8% above where it is today” and such.

We also see it in DFS. Just look at any projection system, and the single most important number in most people’s minds is the median point projection. Davante Adams is projected for 22 points this week while Tyreek Hill is projected for 21, so if their salaries are identical, Adams is clearly the better play! He’s projected for a whole extra point!

The problem is that this kind of thinking is just flat-out wrong. We reward predictors who get things right: if the S&P does in fact end 2021 8% higher than it was on the date of the prediction, that analyst becomes known as the analyst who “called” the S&P for 2021, which leads to more recognition, more TV appearances, and so on. But specific exact outcomes are very rarely predictable. In reality, there is a whole range of outcomes for any particular event.

Coin Flip

Let’s look at the simplest example of probability: a coin flip. Assuming the coin is fairly weighted and flipped, there is a 50/50 chance of heads or tails. You can call heads or tails on any given flip, but getting that call right doesn’t mean you’re an exceptional predictor of coin flipping; it just means that what you predicted would happen did in fact happen, as it should 50% of the time. But if you were asked to predict how many times the coin would land heads if we flipped it 10,000 times, what number would you predict? The “right” number is clearly 5,000 (half the time). It’s unlikely the count would land exactly on 5,000 (variance), but any intelligent prediction about the outcome of this experiment should be very close to 5,000. We’ll set the coin flip example aside and come back to see how it applies to DFS later.
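To put rough numbers on the coin flip, here is a minimal Python sketch that computes the exact binomial probabilities for 10,000 fair flips. The function name and the 4,900 to 5,100 window are illustrative choices, not something from the original example.

```python
from math import comb, sqrt

N_FLIPS = 10_000  # number of fair coin flips, as in the example above


def prob_heads_between(lo: int, hi: int, n: int = N_FLIPS) -> float:
    """Exact probability that a fair coin lands heads between lo and hi times (inclusive)."""
    favorable = sum(comb(n, k) for k in range(lo, hi + 1))
    # Python integers are exact, so nothing is rounded until this final division.
    return favorable / 2**n


p_exact = prob_heads_between(5_000, 5_000)  # hitting 5,000 on the nose
p_close = prob_heads_between(4_900, 5_100)  # landing within 100 of 5,000
std_dev = sqrt(N_FLIPS * 0.5 * 0.5)         # binomial standard deviation, sqrt(n * p * (1 - p))

print(f"P(exactly 5,000 heads)  = {p_exact:.4f}")
print(f"P(4,900 to 5,100 heads) = {p_close:.4f}")
print(f"standard deviation      = {std_dev:.0f} heads")
```

The exact count of 5,000 comes up less than 1% of the time, while a count within 100 of 5,000 comes up more than 95% of the time. Predicting a range is a far better bet than predicting a single number.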

Modeling an entire range of outcomes can be awfully difficult, and it’s far easier to just predict a single central outcome: the median or the mean. In DFS, that’s exactly what projections are: projected median outcomes. This is why in tournaments we often talk about targeting “ceiling” instead of “median.” Ceiling doesn’t have a fixed definition in DFS, but it’s probably something like the 75th-90th percentile outcome (Pro Tip – Utilize OWS GPP Ceiling Tool).

Range of Outcomes

In the world of statistics, there’s what’s called a “normal” distribution, which shows how a range of outcomes can be mapped in many cases. Normal distributions are, well… pretty normal. They occur quite often. Athletic outcomes don’t map to a normal distribution all of the time (in fact, they certainly don’t), but it’s still a useful framework for thinking about ranges of outcomes. Picture a bell-shaped curve: a single hill that peaks at the average and tapers off symmetrically on both sides.

Most of the results are clustered around the middle, which makes sense. If we have Derrick Henry projected for 20 fantasy points, a large percentage of the time he’s going to land in a range of something like 15-25 fantasy points. If you have an event that follows a normal distribution, you know the average (the peak of the hill), and you know that about 68% of the time any given event will be within one standard deviation of the average. About 95% of the time it will be within 2 standard deviations, and 99.7% of the time it will be within 3 standard deviations. If Henry’s median projection is 20 points with a standard deviation of 4, that would mean 68% of the time he would score between 16 and 24 points and 95% of the time he would score between 12 and 28 points.

So the question, of course, is “what’s the standard deviation of outcomes for a football player?” That’s a harder one to answer, because we can’t just model all football players together, or even all players from a certain position together. Those players have different levels of skill, offenses that run differently, different defensive matchups, etc. And if we just looked at a range of outcomes for one player, we would have a very small sample of data: not enough to model that player’s range of outcomes and standard deviation with any level of confidence.
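The 68/95/99.7 rule is easy to check with Python’s standard library. This is a small sketch using the hypothetical Henry numbers from the text (median 20, standard deviation 4); the helper `prob_within` is a name made up for illustration, and the normal shape itself is an assumption rather than a claim about real fantasy scoring.

```python
from statistics import NormalDist

# Hypothetical projection from the text: median 20 points, standard deviation 4.
henry = NormalDist(mu=20, sigma=4)


def prob_within(dist: NormalDist, k: float) -> float:
    """Probability of landing within k standard deviations of the mean."""
    upper = dist.mean + k * dist.stdev
    lower = dist.mean - k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)


for k in (1, 2, 3):
    low = henry.mean - k * henry.stdev
    high = henry.mean + k * henry.stdev
    print(f"{low:.0f} to {high:.0f} points: {prob_within(henry, k):.1%}")
```

Changing `sigma` is a quick way to see how much a wider or narrower range of outcomes moves those bands for the same median projection.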

So did I just walk you through all of this only to say “sorry, this doesn’t really apply to sports and thus it’s useless in DFS”? Of course not! While a normal distribution is not a perfect model for DFS, it’s still a pretty useful one. We may not be able to calculate it with incredible precision, but we can still use it to start thinking about plays as a range of outcomes rather than just a single projection point.

Thinking about ranges of outcomes is what we should be doing because it acknowledges reality; we don’t (and can’t) know exactly how many points a player is going to score in a given game. What we can estimate with a pretty good degree of accuracy is what a reasonable range of outcomes for a player looks like. I tend to think of things as being within an 80% confidence interval, which means I don’t worry about trying to think about what the bottom 10% or top 10% outcomes are. The bottom 10% are things like “he gets hurt early in the game,” while the top 10% are things like “he has multiple 50+ yard touchdown plays,” which are basically impossible to predict. Those things happen, and when they do we just have to hope we’re on the right side of variance.
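As a rough illustration of that 80% band, the sketch below trims the bottom 10% and top 10% off a normal distribution. The median of 20 and standard deviation of 4 are hypothetical numbers carried over from the earlier Derrick Henry example, and the variable names are made up for this sketch.

```python
from statistics import NormalDist

# Hypothetical player: median projection of 20 points, standard deviation of 4.
player = NormalDist(mu=20, sigma=4)

# Ignore the bottom 10% and top 10% of outcomes, keeping the middle 80%.
floor_80 = player.inv_cdf(0.10)    # 10th percentile score
ceiling_80 = player.inv_cdf(0.90)  # 90th percentile score

print(f"Middle 80% of outcomes: {floor_80:.1f} to {ceiling_80:.1f} points")
```

Under these assumptions the middle 80% runs from roughly 15 to 25 points, which lines up with the “something like 15-25” intuition for a 20-point projection.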

Probabilistic Thinking With Ownership

Where this starts to get useful in DFS is when we combine thinking probabilistically with thinking about player ownership. Let’s say we have Davante Adams and Tyreek Hill at the same salary again and we’re considering which one to roster. As in the previous example, Adams is projected for 22 points and Hill for 21. Let’s assume for the sake of argument that both players’ outcomes follow a normal distribution, that both have a standard deviation of 5 fantasy points, and that their scores are independent of each other. How often will Hill outscore Adams? The answer is about 44% of the time (if you want to see the math, you can use this Normal Distribution Calculator; make a copy and play with it). We’re often faced with situations like this in DFS: “If player X projects better but will be more highly owned, while player Y projects a bit lower but comes with low ownership… what should I do?” Thinking probabilistically, in terms of ranges of outcomes, lets us approach these situations thoughtfully and objectively. If Adams projects for 30% ownership and Hill projects for 10%, then absent all other factors (such as correlation within a given roster, salary, cumulative ownership on the roster, and leverage), Hill is the better tournament play: he will outscore Adams about 44% of the time, but you’re getting him at just one-third of Adams’ ownership.
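For anyone who would rather see the head-to-head math in code, here is a minimal sketch using the projections and ownership figures from the example. It adds one assumption the example leaves implicit: the two scores are treated as independent, which is what subtracting two `NormalDist` objects does. Under that assumption the result comes out a little over 44%. The variable names are illustrative only.

```python
from statistics import NormalDist

# Projections from the example: same salary, standard deviation of 5 points each.
adams = NormalDist(mu=22, sigma=5)
hill = NormalDist(mu=21, sigma=5)

# Distribution of (Hill's score - Adams' score).
# Assumption: subtracting NormalDist objects treats the two scores as independent.
margin = hill - adams

# Hill outscores Adams whenever the margin is above zero.
p_hill_wins = 1 - margin.cdf(0)

# Projected ownership from the example.
adams_own, hill_own = 0.30, 0.10
ownership_ratio = hill_own / adams_own

print(f"Hill outscores Adams: {p_hill_wins:.1%} of the time")
print(f"Hill's ownership relative to Adams': {ownership_ratio:.0%}")
```

The gap between those two printed numbers is the whole argument: Hill wins the head-to-head nearly as often as Adams, but far fewer rosters have him.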

The important takeaways here:

Recognize that every player on any given day has a range of reasonable outcomes. No player is a lock to smash, no player is a lock to fail.
On any given day we have no real ability to foresee where within that range of outcomes a player’s actual score will land. We could argue the RANGE of the projection is wrong, of course, but there’s still a range, and where a player ends up within that range is out of our hands.
Range of outcomes is a powerful tool to utilize when thinking about roster construction. It’s right up there with point projection and ownership as one of the key factors you should consider.
Thinking about the range of outcomes combined with ownership is the most important bit of the secret sauce.