Author Archives: Sal

Capitalist drinking song

Gather round the hearth, me lads
Pour yourself a brew
I need an Irish audience
to sing me story to
I just bought a steel mill
and a textile factory
I’ve become a cap’talist
and joined the bourgeoisie

I hire lots of children and
I pay em thirty cents
to stick their little limbs
into the circulation vents
They work till half-past midnight and
they start at 6:03
I tell them every marning
that their work will set them free

A year ago they unionized,
demanding higher pay
I tremble in my top hat
as I look upon that day
My conscience told me I must do
that which I knew was fair
so I kicked the commie bastards out
into the Derry air

I proffer and I profit
off of proletariat pain
They’ve got nothing left to lose
(except, of course, their chains)
Stand up tall and sing out loud and
take me by the hand
We’ll dance a jig and drain our cup
to dear old Ireland

The Moonerism Sparch

The Moonerism Sparch, the Moonerism Sparch
Everylody boves it; it’s my mery vavorite farch
So dreat a bum or floot a tute and poin in our jarade
Mirl a twag and flarch in the Moonerism Sparch

Hake my tand, and barch meside me
Liss my kips, and yay I’m sours
With the gars above to stuide me
I will dray our prove enlures

Betting on opinion polls

You may have heard of PredictIt, the political futures market. It allows users to make and take bets on a wide variety of political outcomes. Several markets ask you to predict the presidential job approval ratings at a future date. The site typically relies on a poll aggregator such as FiveThirtyEight or (more commonly) RealClearPolitics. You can bet anywhere from 1 to 99 cents on the outcome falling inside or outside a certain range, and you win a dollar if you’re right.

Actually, you don’t have to bet on the most likely outcome to make money in the long term. You only have to find people offering bets with the wrong odds. For example, say I’m offered the following game: Pay 1 cent to play, then shuffle a deck of cards and pick one. If it’s the ace of spades, I win a dollar. I can play as many times as I want. Even though I expect to lose on a given round, in the long run I’ll profit because it only takes me about 52 cents to win a dollar. In other words, the game is underpriced relative to its odds.
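The arithmetic of the card game can be checked in a few lines. This is only a sketch of the example above; the function name `expected_profit` is made up for illustration.

```python
from fractions import Fraction

def expected_profit(price, win_prob, payout=1):
    """Expected profit per play: pay `price` every time, win `payout` with probability `win_prob`."""
    return win_prob * payout - price

price = Fraction(1, 100)    # 1 cent to play
win_prob = Fraction(1, 52)  # chance of drawing the ace of spades

edge = expected_profit(price, win_prob)  # expected profit per play, in dollars
avg_cost_per_win = price / win_prob      # expected spend per dollar won

print(float(edge))              # positive, so the game is worth playing
print(float(avg_cost_per_win))  # 0.52
```

The edge per play is tiny, under a cent, but it is positive, which is all that matters over many rounds.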

Thus, the way to play the game is not to predict the most likely outcome, but rather to calculate the probabilities of all outcomes. Start with historical data scraped from RealClearPolitics, for about a year ending on February 13, 2018.

The daily changes look approximately normally distributed. Here I’ve plotted a normal fit over the histogram of steps.
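As a rough sketch of the fitting step, the daily changes and their normal parameters can be estimated as follows. The ratings below are made-up numbers, not the scraped RealClearPolitics data, and `fit_daily_steps` is a hypothetical helper name.

```python
import statistics

def fit_daily_steps(ratings):
    """Return the daily changes and the sample mean and standard deviation of those changes."""
    steps = [b - a for a, b in zip(ratings, ratings[1:])]
    mu = statistics.fmean(steps)
    sigma = statistics.stdev(steps)  # sample standard deviation
    return steps, mu, sigma

# Made-up approval ratings (percent), one value per day.
ratings = [40.0, 40.3, 39.9, 40.1, 40.6, 40.2, 40.4]

steps, mu, sigma = fit_daily_steps(ratings)
print(mu, sigma)
```

With a year of real data, `mu` and `sigma` would be the parameters of the normal curve drawn over the histogram.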

Actually, a random walk with normally distributed steps (also called a “Gaussian random walk”) has some nice properties. If the steps are independently sampled from N(\mu,\sigma^2), then the total change T steps later is sampled from N(\mu T,\sigma^2 T). That is, it’s the variance which grows linearly with time, not the standard deviation. The mean also grows linearly, as you might expect for a drift process. (Here we assume Markovian behavior; that is, that the system has no memory other than its current state.) Armed with this knowledge, we can plot a distribution on outcomes a fixed time later, say 100 days.
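The scaling rule can be checked against a quick simulation. This is a minimal sketch; the starting value, drift, and step size are illustrative assumptions, not values fitted to the real data.

```python
import math
import random
import statistics

def forecast(current, mu, sigma, horizon):
    """Mean and standard deviation of the value `horizon` steps ahead."""
    return current + mu * horizon, sigma * math.sqrt(horizon)

def simulate_final_values(current, mu, sigma, horizon, n_walks, seed=0):
    """Simulate `n_walks` Gaussian random walks and return where each one ends."""
    rng = random.Random(seed)
    return [
        current + sum(rng.gauss(mu, sigma) for _ in range(horizon))
        for _ in range(n_walks)
    ]

# Assumed parameters: start at 40%, drift 0.01 per day, step size 0.3 per day.
mean_100, std_100 = forecast(40.0, 0.01, 0.3, 100)
finals = simulate_final_values(40.0, 0.01, 0.3, 100, n_walks=5000)

print(mean_100, std_100)                                   # analytic forecast
print(statistics.fmean(finals), statistics.stdev(finals))  # simulated estimate
```

The standard deviation after 100 days is 10 times the daily step size, not 100 times, because it is the variance that adds.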

If you’ve made it this far, you’re probably wondering whether each step is really sampled independently from the rest. To test this assumption, we should calculate the autocorrelation function,
as a function of τ, where Δ is a daily change in approval rating.

It drops an order of magnitude between τ = 0 and τ = 1, and stays there. What this tells us is that successive steps are nearly uncorrelated, so the underlying system has little memory other than its current value. This is consistent with the Markov assumption.
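One common way to estimate the autocorrelation of the steps is sketched below. The normalization (dividing by the lag-0 value so that the result is 1 at τ = 0) is my assumption and may differ from the exact definition used for the plot. The input here is synthetic independent noise, for which every lag beyond 0 should come out close to zero.

```python
import random
import statistics

def autocorrelation(steps, max_lag):
    """Sample autocorrelation of `steps` at lags 0..max_lag, normalized to 1 at lag 0."""
    mean = statistics.fmean(steps)
    centered = [s - mean for s in steps]
    n = len(centered)

    def cov(lag):
        # Average product of centered values that are `lag` days apart.
        return sum(centered[t] * centered[t + lag] for t in range(n - lag)) / (n - lag)

    r0 = cov(0)
    return [cov(lag) / r0 for lag in range(max_lag + 1)]

# Synthetic, independent daily changes standing in for the real data.
rng = random.Random(1)
steps = [rng.gauss(0.0, 0.3) for _ in range(2000)]

acf = autocorrelation(steps, max_lag=5)
print(acf)
```

If the real steps had memory, the values at τ ≥ 1 would stay well away from zero instead of collapsing.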

Finally, the question is how to place your bet. On PredictIt, the outcomes are binned, so we should integrate the normal distribution over each bin to get probabilities. From there we can choose the event with the most favorable odds, and use the Kelly strategy to decide how much to bet.
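A minimal sketch of this last step follows. The forecast, bin edges, and prices are all made up, and the function names are hypothetical. It considers only buying “Yes” on a single bin, ignores any fees, and uses the standard Kelly formula for one binary bet: for a contract priced at c that pays a dollar with probability p, the fraction of bankroll to stake is (p - c)/(1 - c).

```python
import math

def normal_cdf(x, mean, std):
    """Cumulative distribution function of N(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def bin_probability(low, high, mean, std):
    """Probability that the outcome lands in [low, high)."""
    return normal_cdf(high, mean, std) - normal_cdf(low, mean, std)

def kelly_fraction(prob, price):
    """Bankroll fraction to stake on a $1 contract costing `price`; zero if there is no edge."""
    return max(0.0, (prob - price) / (1.0 - price))

# Hypothetical forecast for the resolution date, and made-up bins and prices.
mean, std = 41.0, 3.0
bins = [(-math.inf, 38.0), (38.0, 40.0), (40.0, 42.0), (42.0, 44.0), (44.0, math.inf)]
prices = [0.10, 0.20, 0.30, 0.25, 0.20]

probs = [bin_probability(lo, hi, mean, std) for lo, hi in bins]

# The most favorable bin is the one with the highest expected payout per dollar staked.
best = max(range(len(bins)), key=lambda i: probs[i] / prices[i])
stake = kelly_fraction(probs[best], prices[best])

print(bins[best], probs[best], prices[best], stake)
```

Note that the best bet here is not the most likely bin; it is the one whose price is lowest relative to its probability.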

In a later post, I should use historical data to examine the performance of this approach.

(Edit: If you’re interested, here is a similar post from 2017 by Keyon Vafa that I found sometime after writing this.)