# Betting on opinion polls

You may have heard of PredictIt, the political futures market. It allows users to make and take bets on a wide variety of political outcomes. Several markets ask you to predict the presidential job approval ratings at a future date. The site typically relies on a poll aggregator such as FiveThirtyEight or (more commonly) RealClearPolitics. You can bet anywhere from 1 to 99 cents on the outcome falling inside or outside a certain range, and you win a dollar if you’re right.

Actually, you don’t have to bet on the most likely outcome to make money in the long term. You only have to find people offering bets with the wrong odds. For example, say I’m offered the following game: Pay 1 cent to play, then shuffle a deck of cards and pick one. If it’s the ace of spades, I win a dollar. I can play as many times as I want. Even though I expect to lose on a given round, in the long run I’ll profit because it only takes me about 52 cents to win a dollar. In other words, the game is underpriced relative to its odds.
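The arithmetic for the card game above is worth making explicit. A minimal sketch (all values are from the example itself):

```python
# Expected profit of the card-draw game described above:
# pay 1 cent per play, win $1.00 (100 cents) on the ace of spades.
cost = 1            # cents paid per play
payout = 100        # cents received on a win
p_win = 1 / 52      # one ace of spades in a 52-card deck

# Expected profit per play, in cents.
ev = p_win * payout - cost
print(f"expected profit per play: {ev:.3f} cents")  # positive, so play forever
```

The expectation is about +0.9 cents per play, which is the "underpriced relative to its odds" claim in numbers.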

Thus, the way to play the game is not to predict the most likely outcome, but rather to calculate the probabilities of all outcomes. Start with historical data scraped from RealClearPolitics, covering about a year and ending on February 13, 2018.

The distribution of daily changes looks approximately normal. Here I’ve plotted a normal fit over the histogram of steps.

A random walk with normally distributed steps (also called a “Gaussian random walk”) has some nice properties. If the steps are independently sampled from $N(\mu,\sigma^2)$, then the total change T steps later is sampled from $N(\mu T,\sigma^2 T).$ That is, it’s the variance which grows linearly with time, not the standard deviation. The mean also grows linearly, as you might expect for a drift process. (Here we assume Markovian behavior; that is, that the system has no memory other than its current state.) Armed with this knowledge, we can plot a distribution on outcomes a fixed time later, say 100 days.
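The linear-in-variance scaling is easy to verify by simulation. A sketch with made-up step statistics (the `mu` and `sigma` here are illustrative, not fitted to the approval data):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, T = 0.01, 0.3, 100      # illustrative step mean/std, horizon in days
n_walks = 100_000

# Simulate many independent Gaussian random walks of T steps each.
steps = rng.normal(mu, sigma, size=(n_walks, T))
totals = steps.sum(axis=1)         # total change after T steps

# The totals should be distributed as N(mu*T, sigma^2 * T):
print(totals.mean())   # close to mu * T = 1.0
print(totals.var())    # close to sigma^2 * T = 9.0, i.e. std grows only as sqrt(T)
```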

If you’ve made it this far, you’re probably wondering whether each step is really sampled independently from the rest. To test this assumption, we should calculate the autocorrelation function,
$\mathrm{Exp}\Big[\Delta(t)\Delta(t+\tau)\Big]$
as a function of $\tau$, where $\Delta$ is a daily change in approval rating.

It drops an order of magnitude between τ = 0 and τ = 1, and stays there. What this tells us is that the underlying system has little memory other than its current value. In fact, this defines the Markov assumption.

Finally, the question is how to place your bet. On PredictIt, the outcomes are binned, so we should integrate the normal distribution over each bin width to get probabilities. From there we can choose the event with the most favorable odds, and use the Kelly strategy to decide how much to bet.
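Both steps can be sketched in a few lines. The forecast parameters, bin edges, and market price below are all illustrative stand-ins, not real PredictIt numbers:

```python
from math import erf, sqrt

def norm_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

# Hypothetical forecast for the approval rating on settlement day.
mu, sigma = 41.5, 1.8

# PredictIt-style bins; integrate the normal density over each one.
bins = [(40.0, 40.9), (41.0, 41.9), (42.0, 42.9)]
probs = [norm_cdf(hi, mu, sigma) - norm_cdf(lo, mu, sigma) for lo, hi in bins]

def kelly_fraction(p, price):
    """Kelly bet fraction for a contract costing `price` dollars that
    pays $1 if the event occurs, given our probability p."""
    b = (1 - price) / price              # net odds received on a win
    return max(0.0, (b * p - (1 - p)) / b)

# Suppose the market prices the middle bin at 15 cents while we assign
# it probability probs[1]: bet this fraction of the bankroll.
f = kelly_fraction(probs[1], 0.15)
```

The `max(0.0, ...)` clip just means "don’t bet" when the market price already exceeds your probability.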

In a later post, I plan to use historical data to examine the performance of this approach.

(Edit: If you’re interested, here is a similar post from 2017 by Keyon Vafa that I found sometime after writing this.)

I’ve been meaning to learn the math behind stock trading for a while, but I’ve found it’s hard to find quality information. Most of the stuff online is (1) non-technical, (2) trying to sell you something, or (3) both. So I decided to collect my own notes on modern portfolio theory (MPT). Here’s the pdf: Notes on Itô calculus and quantitative trading.

The information comes from various lecture slides and articles. I didn’t put specific references in there, since it’s standard, textbook stuff. Just search for any piece you’d like more information about.

Here is a rough outline:

• How to select stocks, given their risk and return statistics
• How to model risk and return in the first place

The first part takes the “Minimum Variance” approach due to Markowitz. To model stock prices, I give an overview of Itô calculus (one form of stochastic calculus) and geometric Brownian motion (GBM). This is the model used by the Black-Scholes formula for pricing derivatives.
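Geometric Brownian motion is simple to simulate from its closed-form solution, $S_t = S_0 \exp\big((\mu - \sigma^2/2)t + \sigma W_t\big)$, which follows from Itô’s lemma. A sketch with illustrative drift and volatility (not calibrated to any real stock):

```python
import numpy as np

rng = np.random.default_rng(2)
S0, mu, sigma = 100.0, 0.07, 0.2       # illustrative annual drift and volatility
T, n_paths = 1.0, 200_000              # one-year horizon

# Terminal value of the driving Brownian motion, W_T ~ N(0, T).
W_T = rng.normal(0.0, np.sqrt(T), size=n_paths)

# Exact GBM solution at time T (note the Ito correction -sigma^2/2).
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)

# Sanity check: E[S_T] = S0 * exp(mu * T).
print(S_T.mean(), S0 * np.exp(mu * T))
```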

I suspect that a simple index fund might beat a portfolio selected with this recipe. In the future, I’d like to test on historical data and find out if there really is an advantage to picking your own stocks.

# Delivering bad news in O(N) time

I’ve invented a way to break bad news gradually, instead of all at once. Suppose there is a big question—“Are you breaking up with me?” or “Do I have hepatitis, Doc?” The traditional algorithm is decidedly O(1); the girlfriend or the doctor simply says “yes” or “no,” and the news is broken. It would be nice if we could delay the news, so that the answer became gradually more clear as time passed. Here’s a procedure to do just that.

At each timestep, the doctor (say) flips a coin and hides the outcome from the patient. If it is heads, he simply says “heads.” If it is tails and the patient has hepatitis, he says “heads.” If it is tails and the patient does not have hepatitis, he says “tails.”
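The protocol above is three lines of logic. A sketch (function and variable names are mine, not from the post):

```python
import random

def doctor_reports(has_hepatitis, n_steps, rng=random):
    """Simulate the doctor's announcements over n_steps coin flips."""
    responses = []
    for _ in range(n_steps):
        coin = rng.choice(["heads", "tails"])
        if coin == "heads":
            responses.append("heads")       # honest report
        elif has_hepatitis:
            responses.append("heads")       # tails + bad news -> say "heads"
        else:
            responses.append("tails")       # tails + good news -> say "tails"
    return responses
```

A patient with hepatitis only ever hears “heads”; a healthy patient hears “tails” the first time the hidden coin lands tails.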

Let’s analyze this from the patient’s point of view, supposing that both answers start out equally likely in his mind. That is, $\mathrm{P(hep)}=\mathrm{P(no\ hep)}.$ Suppose there have been N timesteps. If the doctor ever says “tails,” then the patient knows he’s in the clear. So the interesting question is how the patient’s degree of belief changes when the doctor has said “heads” every time for N timesteps.

Using Bayes’s theorem and some algebra, you can show that $\mathrm{P}(\mathrm{hep}|N) =\Big[\mathrm{P}(N|\textrm{no hep})+1\Big]^{-1}.$ (Here we used $\mathrm{P}(N|\mathrm{hep})=1$, since a hepatitis patient hears “heads” every time.) In order to get N “heads” responses given no hepatitis, the coin would have to land heads-up N times, which has probability $(1/2)^N.$ After a line of algebra, we get

$\mathrm{P}(\mathrm{hep}|N) =\frac{2^N}{2^N+1}.$

This approaches 100% as N tends toward infinity, which is what we expected. On the other hand, if the patient doesn’t have hepatitis, we expect a “tails” to come up after only two timesteps on average.
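The closed form can be checked directly against Bayes’s theorem with exact arithmetic. A sketch (the function name is mine):

```python
from fractions import Fraction

def p_hep_given_heads(n):
    """Posterior P(hep | N straight 'heads'), assuming equal priors."""
    p_heads_given_hep = Fraction(1)            # hepatitis patient always hears heads
    p_heads_given_no_hep = Fraction(1, 2) ** n # fair coin must land heads n times
    # Bayes with P(hep) = P(no hep) = 1/2; the priors cancel.
    return p_heads_given_hep / (p_heads_given_hep + p_heads_given_no_hep)

# Agrees with the closed form 2^N / (2^N + 1) for every N checked:
assert all(p_hep_given_heads(n) == Fraction(2**n, 2**n + 1) for n in range(50))
```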