# Betting on opinion polls

You may have heard of PredictIt, the political futures market. It allows users to make and take bets on a wide variety of political outcomes. Several markets ask you to predict the presidential job approval ratings at a future date. The site typically relies on a poll aggregator such as FiveThirtyEight or (more commonly) RealClearPolitics. You can bet anywhere from 1 to 99 cents on the outcome falling inside or outside a certain range, and you win a dollar if you’re right.

Actually, you don’t have to bet on the most likely outcome to make money in the long term. You only have to find people offering bets with the wrong odds. For example, say I’m offered the following game: Pay 1 cent to play, then shuffle a deck of cards and pick one. If it’s the ace of spades, I win a dollar. I can play as many times as I want. Even though I expect to lose on any given round, in the long run I’ll profit: a win comes once every 52 rounds on average, so it costs me only about 52 cents to win a dollar. In other words, the game is underpriced relative to its odds.
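To make the arithmetic concrete, here is a minimal sketch of the card game's expected value. The function name `expected_profit` is my own choice for illustration; the numbers come straight from the game described above.

```python
from fractions import Fraction

def expected_profit(price, payout, p_win):
    """Expected profit per play: pay `price` every round, collect `payout` with probability `p_win`."""
    return p_win * payout - price

price = Fraction(1, 100)   # 1 cent to play
payout = Fraction(1)       # win one dollar
p_ace = Fraction(1, 52)    # chance of drawing the ace of spades

ev = expected_profit(price, payout, p_ace)

# Average number of plays per win is 1 / p_ace = 52, so the average cost of one win is:
expected_cost_per_win = price / p_ace

print(f"expected profit per play: ${float(ev):.4f}")
print(f"average cost to win $1:   ${float(expected_cost_per_win):.2f}")
```

The expected profit per play is positive even though any single round is most likely a loss, which is the whole point: the price (1 cent) is below the fair price (1/52 of a dollar, a bit under 2 cents).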

Thus, the way to play the game is not to predict the most likely outcome, but rather to calculate the probabilities of all outcomes. Start with historical data scraped from RealClearPolitics, covering about a year ending on February 13, 2018.

We can look at the distribution of daily changes, which appears to be approximately normal. Here I’ve plotted a normal fit over the histogram of steps.

Actually, a random walk with normally distributed steps (also called a “Gaussian random walk”) has some nice properties. If the steps are independently sampled from $N(\mu,\sigma^2)$, then the total change $T$ steps later is sampled from $N(\mu T,\sigma^2 T)$. That is, it’s the variance which grows linearly with time, not the standard deviation. The mean also grows linearly, as you might expect for a drift process. (Here we assume Markovian behavior; that is, that the system has no memory other than its current state.) Armed with this knowledge, we can plot a distribution on outcomes a fixed time later, say 100 days.
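As a rough sketch of that scaling rule, the snippet below builds the distribution of the rating $T$ days ahead from a per-day mean and standard deviation. The starting value, `mu`, and `sigma` here are made-up placeholders, not the values fitted from the RealClearPolitics data, and `walk_distribution` is a name I picked for illustration.

```python
import math
from statistics import NormalDist

def walk_distribution(current, mu, sigma, T):
    """Distribution of a Gaussian random walk T steps from now.

    The mean drifts by mu * T and the variance grows to sigma^2 * T,
    so the standard deviation grows like sqrt(T).
    """
    return NormalDist(mu=current + mu * T, sigma=sigma * math.sqrt(T))

# Hypothetical inputs, for illustration only (not fitted values).
current = 41.0   # today's approval rating, in percent
mu = 0.0         # mean daily change
sigma = 0.2      # standard deviation of the daily change

dist_100 = walk_distribution(current, mu, sigma, T=100)

# Central 95% interval for the rating 100 days out.
low, high = dist_100.inv_cdf(0.025), dist_100.inv_cdf(0.975)
print(f"mean {dist_100.mean:.2f}, stdev {dist_100.stdev:.2f}")
print(f"95% interval: {low:.2f} to {high:.2f}")
```

Note that going from 1 day to 100 days multiplies the standard deviation by 10, not by 100.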

If you’ve made it this far, you’re probably wondering whether each step is really sampled independently from the rest. To test this assumption, we should calculate the autocorrelation function,
$\mathrm{E}\Big[\Delta(t)\,\Delta(t+\tau)\Big]$
as a function of τ, where $\Delta(t)$ is the daily change in approval rating on day $t$.

The autocorrelation drops by an order of magnitude between τ = 0 and τ = 1, and stays there. What this tells us is that the underlying system has little memory other than its current value, which is what the Markov assumption requires.
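For anyone who wants to repeat the check, here is a minimal sketch of the autocorrelation estimate, averaging the product of steps τ days apart. Since I can't reproduce the scraped data here, it runs on synthetic independent steps with a made-up standard deviation. The `center` option (subtracting the mean step first) is a common variant that I've added; it is not part of the formula above.

```python
import random

def autocorrelation(steps, max_lag, center=False):
    """Estimate E[step(t) * step(t + lag)] for lag = 0..max_lag.

    With center=True the sample mean is subtracted first (an autocovariance).
    """
    n = len(steps)
    if center:
        mean = sum(steps) / n
        steps = [s - mean for s in steps]
    return [
        sum(steps[t] * steps[t + lag] for t in range(n - lag)) / (n - lag)
        for lag in range(max_lag + 1)
    ]

# Synthetic stand-in for a year of daily changes (hypothetical sigma = 0.2).
random.seed(0)
synthetic_steps = [random.gauss(0.0, 0.2) for _ in range(365)]

acf = autocorrelation(synthetic_steps, max_lag=10)
for lag, value in enumerate(acf):
    print(f"lag {lag:2d}: {value:+.5f}")
```

For truly independent steps, every lag beyond zero should hover near zero relative to the lag-0 value, which is just the mean squared step.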

Finally, the question is how to place your bet. On PredictIt, the outcomes are binned, so we should integrate the normal distribution over each bin to get probabilities. From there we can choose the event with the most favorable odds, and use the Kelly strategy to decide how much to bet.
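Here is a rough sketch of that last step. The bin edges, prices, and forecast distribution are all hypothetical placeholders, and the helper names are mine. For a contract that costs `price` dollars and pays one dollar, the Kelly fraction works out to $(p - \text{price})/(1 - \text{price})$, where $p$ is our estimated probability. This sketch considers only buying "yes" contracts and ignores any fees.

```python
from statistics import NormalDist

def bin_probabilities(dist, edges):
    """Probability mass in each bin: below edges[0], between consecutive edges, above edges[-1]."""
    cdfs = [0.0] + [dist.cdf(e) for e in edges] + [1.0]
    return [hi - lo for lo, hi in zip(cdfs, cdfs[1:])]

def kelly_fraction(p, price):
    """Fraction of bankroll to stake on a contract costing `price` that pays $1 with probability p.

    Net odds are b = (1 - price) / price, so Kelly gives (p - price) / (1 - price).
    Returns 0 when the bet has no edge.
    """
    return max(0.0, (p - price) / (1.0 - price))

# Hypothetical forecast and market, for illustration only.
forecast = NormalDist(mu=41.0, sigma=2.0)
edges = [39.0, 40.0, 41.0, 42.0, 43.0]            # bin boundaries, in percent
prices = [0.10, 0.15, 0.25, 0.25, 0.15, 0.10]     # made-up contract prices, in dollars

probs = bin_probabilities(forecast, edges)

for i, (p, price) in enumerate(zip(probs, prices)):
    edge = p / price - 1.0   # expected return per dollar staked
    print(f"bin {i}: p = {p:.3f}, price = {price:.2f}, "
          f"edge = {edge:+.1%}, Kelly fraction = {kelly_fraction(p, price):.3f}")
```

Bins where the estimated probability is below the price get a Kelly fraction of zero, meaning there is nothing worth buying there.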

In a later post, I should use historical data to examine the performance of this approach.

(Edit: If you’re interested, here is a similar post from 2017 by Keyon Vafa that I found sometime after writing this.)