Which Model is the Fairest of Them All?

The tl;dr:

-Models output predictions for each state

-Models also forecast median margins of victory; it is better to use these to compare one model to another

-When looking at national forecasts, the relationship between popular vote MoV and win probability is distorted by the electoral college

Probabilistic forecasts only get to live in one reality

Yesterday, pollster Patrick Ruffini posted a facially reasonable take on the state of election forecasting:

This is a subtweet of the epic Nate Silver – G. Elliott Morris debates on model uncertainty (they were going at it hard last night); Morris’ model has Biden at 90%, Silver’s model isn’t out yet but is expected to be much lower, perhaps near 70%. Ruffini is effectively saying “Chill guys. If Joe wins we’ll never know which of your models was the best because reality only gets to play out once.”

So is there no way to differentiate the skill of a 90% forecast from a 70% forecast?

It’s absolutely true that election modelers are predicting a singular event. The forecast is made, the election happens, and that’s that. We don’t have the luxury of peering into multiple parallel universes to see what happens in 100 of them to see whether Biden really did win 90% of the time or if it was only 70% of the time.

Yet hope is not lost. Whatever future timeline we end up in, we should nonetheless be able to say something about which forecast was better. If a forecast implies Biden is extremely likely to win and he only barely wins, that forecast was worse than the one that said it would be close. Consider 2016: several models had Clinton at 98% or better, Nate Silver’s had her only in the high 60s and low 70s. Suppose that instead of Trump, Clinton had barely held on in MI/WI/PA and won. Were the 98% forecasts better? They’d have a better Brier score! But no, they would have still sucked. Why?

There’s a relationship between win probability and margin of victory

This is intuitively obvious but worth exploring. Let’s take Georgia. If Georgia is a true toss-up, then we shouldn’t be expecting either Joe or Trump to win by 5 points. Maybe sometimes that happens, but more often than not the margin of victory is under 1.5 points either way. Maybe Trump is slightly favored there (not sure I agree, but let’s say he is), but there’s enough uncertainty that we can’t say for sure that he’ll win. Maybe he’s worth 60c. Here’s a quick and dirty MSPaint diagram of what’s going on with margin of victory in a 60/40 race, as an example:

We have some sort of median forecast margin of victory (Republican by 1.5 points), but we’re uncertain! Polls have margin of error, so do polling averages, and so does our forecast. The most likely outcome is R+1.5 but it’s not the only outcome. The full probability distribution is shown with the example bell curve above (note that I have no idea how accurate this distribution is). We get the win probability from how much of this probability distribution lies on the R side of the spectrum and how much lies on the D side. Here, at -1.5 points, about 60% of the area under the curve belongs to the R candidate while 40% belongs to the D candidate.
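To make the area-under-the-curve idea concrete, here is a minimal sketch that assumes the margin distribution is a normal bell curve (the real shape is unknown, as noted above). The function names and the normal assumption are mine, not any modeler's actual method. It backs out the spread that makes an R+1.5 median line up with a 60% win probability, then recomputes the win probability from that spread.

```python
from statistics import NormalDist

def win_probability(median_margin, sigma):
    """Share of a normal margin distribution that lies above zero (a win)."""
    return 1.0 - NormalDist(mu=median_margin, sigma=sigma).cdf(0.0)

def implied_sigma(median_margin, win_prob):
    """Spread (in points) that pairs a median margin with a win probability."""
    return median_margin / NormalDist().inv_cdf(win_prob)

# Assumed example from the text: the favorite leads by 1.5 and is worth 60c.
sigma = implied_sigma(1.5, 0.60)
print(f"implied spread: {sigma:.2f} points")
print(f"win probability at +1.5: {win_probability(1.5, sigma):.0%}")
print(f"win probability at a dead heat: {win_probability(0.0, sigma):.0%}")
```

Under that assumption the implied spread comes out just under 6 points, which is why a lead of a point or two only moves a state a little off 50/50.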

Models are in the business of generating these probability distributions and as a result they all tend to output both win probabilities and margins of victory (others go further and suggest turnout numbers, vote shares, and so on), which allows us to compare the two measures systematically:

In this plot, each point is a particular state forecast (these span all states included in my 8/1 Contest). On the x-axis is the forecast margin of victory (from losing by 30 points to winning by 30 points) while the y-axis shows the win probability. There are two data types shown here: modelers are shown in dark orange, and individual contest entrants (humans) are shown as grey points.

Let’s take just the modelers for a moment. You can see that if they think the margin of victory is near zero, they think that the race is a toss-up (near 50% win probability). And once the margin gets out to +4.5 points or so, the race is already 75%. Note that I’m including all modelers here in dark orange and yet there’s comparatively little spread overall (perhaps at a later date I’ll break them down individually to see what if any differences there are). And while 538’s model isn’t included here (it hasn’t been released yet), the overall curve very closely tracks what their 2018 midterm model showed:

The X-axis range is wider here; but the overall curve is very close to what’s shown above.

Margins > win probabilities for scoring an election forecast

Since it so happens that all modelers agree on the general relationship between win probability and margin of victory (more specifically: so far no modeler has a substantially steeper or flatter curve than any other), it’s fairly straightforward to simply compare them on median forecast margin. Whoever gets closest did the best! (And this is why margins are the dominant scoring factor for my contest). This is wonderfully intuitive: if your forecast is calling for a close race, then the race ought to be close. If it’s a blowout, your forecast probably sucked.

What about national forecasts?

How does margin relate to the overall chances to win the presidency? Thanks to the electoral college, this is not as straightforward. Voters of different parties are not evenly distributed in all the states and as 2016 illustrated one party can thus lose the popular vote while winning the electoral college. A 1.5 point loss in the popular vote for Trump is probably an 80% win chance for him.

Further, this electoral college edge moves from election to election (as individual states drift one way or the other). We can’t know that Joe by 2.1 is the 50/50 line like it was in 2016: the line may well have drifted up to Joe by 3 (as suggested below). Dave Wasserman has argued in the past that it’s even conceivable that Trump can win some elections where he loses the popular vote by 4-5 points.

Where are we now? Based on all contest entrants and modelers, the 50/50 line is somewhere around 3.1 points (where the linear fit to the data below crosses the y-axis; despite this curve being almost entirely generated by human contest entrants, G. Elliott Morris’ model has 3.2 as the 50/50 line):

Translating this into win probabilities across the whole range is a bit trickier – I’d love to see a modeler put out a simple chart of pop vote margin vs win probability (perhaps Nate Silver will give us these data). But roughly speaking: +3 points is probably around 50/50, +4-5 is 60-70, +6-7 is 75-85%, +8-9 is high 80s low 90s, and +10 and up gets to be academic.
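As a rough illustration of that ladder, the sketch below shifts a normal curve so that 50/50 sits at a 3 point popular vote lead. The 4 point spread is my own assumption, picked by eye so the output lands near the ranges above; it is not taken from any model.

```python
from statistics import NormalDist

TIPPING_POINT = 3.0  # popular vote margin treated as the 50/50 line (from the text)
SIGMA = 4.0          # assumed spread in points, chosen by eye, not fitted to data

def national_win_probability(pop_vote_margin,
                             tipping_point=TIPPING_POINT, sigma=SIGMA):
    """Chance of winning the electoral college at a given popular vote margin."""
    return NormalDist(mu=tipping_point, sigma=sigma).cdf(pop_vote_margin)

for margin in range(3, 11):
    print(f"+{margin}: {national_win_probability(margin):.0%}")
```

Moving TIPPING_POINT up or down is the quickest way to see how much a drifting electoral college edge changes what a given national lead is worth.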

To sum up

Yes, we’ll be able to tell 70% and 90% forecasts apart. A 90% national forecast is going to have much closer TX/GA/IA/OH margins than a 70% national forecast. A 90% national forecast implies a much higher popular vote margin than a 70% forecast, and so on. So fear not, we will know whether Nate Silver’s model is the champion this year or whether The Economist’s can dethrone it (or if any other modeler beats them both). And if you’d like your shot at beating them all, start thinking about your entry into my next contest (entries will open August 25th and close September 1)!

The Predictions Are In

For the first round of my 2020 Prediction Contest, we had an amazing 98-entry turnout (93 of whom revealed themselves publicly and are prize-eligible). Thank you to all who entered, and best of luck! You can view everyone’s entries here (some typo correction was required to fix where people left off a minus sign or other mistake here and there; raw entries are here).

Contest Entrants Think Trump is Going to Lose

Of the 98 entries, only six said that the Democratic candidate was under 50c to win the presidency. In fact, the median predicted probability for the Democratic candidate (hereafter Biden, for brevity) among contest entrants was 87%; notably, the median probability given by the models out there is also 87% (though Nate Silver has yet to release his and based on his fights with G. Elliott Morris it’s probably not going to come in very high).

This confidence in Biden (he is “Likely” to win the presidency, in the parlance of verbal handicappers) isn’t shared by the markets. Both BetFair and PredictIt price Joe in the low 60s, suggesting that the presidency “Leans” Biden (the price ranges on PI depending on the market you’re looking at). Even Silver thinks that’s too low: “I don’t think people realize how dumb and sometimes even irrational the prices are at political betting markets as compared to almost every other type of market (which is not to say other markets are always rational, either).” and “Too low on Biden.”

He’s right! And he’s right to note in that thread that often the highest-profile markets are the most mispriced. Let’s talk about it.

The Markets Are Wrong in Two Ways

  1. Joe Biden is underpriced (by at least 10-15c).
  2. The markets are inconsistent, often wildly.

And it’s this second error that’s the most interesting to me. We know, for instance, that one of the reasons Trump trades higher in the main USPREZ market is that that’s the first place casuals go to plunk their money down on their man (and there are more Trump bettors than there are Joe bettors).

Yet the markets also have Biden comfortably ahead in MI (75), WI (70), and PA (73) which Trump needs to win to have any chance. These numbers are all still underpriced according to the models we have, but they nonetheless imply a higher win probability for Joe than do the main markets. Other markets have even more striking discrepancies: the electoral college market has Biden winning the EC 75% of the time! The popular vote market has Biden getting more than 4.5% (enough to win 95% of the time) at 71c!

In the latter two cases, some of that is internal overpricing (the markets add up to way over 100c) caused by long-shot bias, dart-throwing, and market making overwhelming the neg-riskers. Still, it’s notable that people are willing to pay something like 9c that Biden wins Alabama (lol) while others are willing to pay 7c that Trump wins California (looool). Even aside from the long-shot biased states, Biden is worth well over 60c overall based on his price in MI/WI/PA/FL/NC/AZ. So what happens next?

Something to watch over the coming weeks, particularly with the back-to-back conventions, is whether Biden’s overall win odds increase or whether the state markets simply get Trumpier. My intuition is the latter (already, the price in places like FL/IA/GA has moved about 4c to Trump this week despite moving only 1-2 points his way in the models) as dumb money begins venturing out from the main attraction. Anyone who saw 85c Clinton in California in 2016 knows that things can always get stupider…

Contest Entrants Were More Comfortable with the Main States

When you look at the range of predictions for MoVs for a given state, a tighter range indicates that most of the entrants have a good idea what a particular state will be. Georgia, for example, has a reasonably tight spread. People know that Georgia is going to be close.

Georgia’s MoV with modelers (orange), handicappers (yellow), current polling averages (blue), PredictIt (teal), and contest entrants (black: mean/median; white: individual entries).

Sure you have one person firing a +10 for Joe in GA (@StockJabber, who made Blue Tsunami picks), but by and large everyone knows that GA is going to be within 5 points of a dead heat either way. Contrast that with ME-02:

ME-02’s MoV with modelers (orange), handicappers (yellow), current polling averages (blue), PredictIt (teal), and contest entrants (black: mean/median; white: individual entries).

Most of the points are clustered again from -5 to 5; yet there are quite a few that now extend further out. Overall, the district is still priced as a toss-up by contest entrants (as it is by the modelers, the markets, and the handicappers), but the spread suggests, to me, that a lot of entrants were simply guessing. ME-02 isn’t talked about as much as places like GA or TX or FL; it’s not a place where everyone knows the 2016 margin off the top of their heads.

You can also see this if you compare forecasted national MoV and a given state’s MoV. In better known places, there’s a clear tight linear relationship between what someone said the national MoV would be and what they said the state MoV is. In the lesser known states, things are noisier.
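One way to put a number on "tight" versus "noisy" is a plain Pearson correlation between each entrant's national MoV and their state MoV. The sketch below uses made-up entries purely for illustration; none of these numbers come from the contest data, and the two state lists are hypothetical stand-ins for a well-known state and a lesser-known one.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical entries: each position is one entrant's set of predictions.
national_mov = [4, 6, 8, 10, 12]
well_known_state_mov = [-3, -1, 1.5, 3, 5]   # tracks the national call closely
lesser_known_state_mov = [2, -6, 9, -1, 3]   # looks more like guessing

print(f"well-known state:   r = {pearson(national_mov, well_known_state_mov):.2f}")
print(f"lesser-known state: r = {pearson(national_mov, lesser_known_state_mov):.2f}")
```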

What do I take away from this? Well, if the contest entrants (who are sharper in aggregate than the markets overall based on their pricing) are this uncorrelated for places like ME-02 and MT, then that’s where I’d expect to see softer pricing and nice opportunities in the next few months (and perhaps especially on election night as well). Something to keep in mind…

More TK

There’s a lot more to say about the contest (including the winprob-MoV curves and how I’m imputing winprobs from verbal handicappers and so on) but this blog is already long enough so I’ll save some for later. In the meantime, check out this thread, and hope to see you all in the second contest in one month!

2020 Election Prediction Contest

Please come win my money!


No purchase necessary! Fill out the form above with your predictions on the presidential election in 2020 and whoever has the best predictions will win $125 from me 🙂

***UPDATE: Due to a generous contribution from anoland, the prize is now $250 per contest at minimum! (He did not receive any special insight into anything for his contribution and remains prize-eligible; I’m the only one who can see entries while they’re being submitted and am thus prize-ineligible)***

First contest: Entries due 11:59:59 pm eastern time, August 1.

What is this?

This is a public contest to see who can best predict the outcome of the 2020 presidential election overall and in 23 key states. To participate, you must provide a win probability and margin of victory/defeat for the Democratic candidate in each of 23 states plus overall national win probability and electoral college/popular vote margins of victory.

Whoever makes the best predictions wins my $125 (or, if I win, whoever the runner-up is gets my $125).

Wait, why are you doing this?

For 2018, I maintained a spreadsheet that showed a simple comparison between predictions for various modeling organizations, betting markets, and expert handicappers (like Cook Political). I want to do something similar this year, but this time also incorporate individual predictions from all you sharps that follow me and read this blog, plus anyone else in election twitter that happens upon this contest. And I figure in order to get people to do it, I ought to offer a prize! In addition,


Yep, for bragging rights you’ll get to see your entry next to all the modelers and everyone else’s that participates (all predictions will be revealed after entries close). If you’d like to be eligible for the prize, you’ll need to give me your twitter @handle, otherwise your entry will be anonymous. Once the final results are in (I will be waiting until the official FEC report comes out in early 2021 to resolve and pay out the contest), you’ll get to find out if you beat the markets, the handicappers, the average of your peers, or everyone!

When are entries due?

There will be four contests; this post is appearing ahead of the first. Entries for the first contest are due by 11:59:59 pm Eastern time, August 1. Subsequent contests will launch at later times and will be due on September 1, October 1, and November 1. Each contest is separate, so you can enter all four and theoretically win $125 from each. (And by running multiple time points, I’ll hopefully get some fun data for how all of you are or aren’t changing your predictions over time).

How do you determine who wins?

The best predictor wins. In this case, that means the entrant who earns the most points wins. How do you earn points? Because entrants will be making two kinds of predictions (win probability [winprob] and margin of victory [MoV]), each will be scored separately, then combined at the end with four times the weight given to MoV points. (Click here for the scoring rules with an example). Here are the formulae:

  • Winprob predictions: if Dem candidate wins, points = predicted probability; if Dem candidate loses, points = -predicted probability.
    • EXAMPLE: You predict the Dem candidate at 85% to win the presidency and 40% to win Texas. The Dem does win the presidency, but loses Texas. You are awarded 85 points for the first prediction and lose 40 points for the second prediction.
    • Your goal therefore is to maximize winnings and minimize losses, and if you want you can try assigning 99s and 1s to everything where you think the Dem wins or loses – but if you’re wrong you’ll be wrong bad. In theory simply offering an honest probability is the best strategy.
    • All winprob prediction points are summed, giving you the Winprob Subtotal.
  • MoV predictions: points = 10 * (5 – a*abs(actual – prediction))
    • a = an adjustment factor. For all % point MoVs (all but one of them), a = 1. For the Electoral College MoV, a = 0.18 (this is roughly 100/538 in order to normalize things).
    • abs() is the absolute value function.
    • The formula says that you get more points the closer you are to the actual margin of victory, with a maximum score per prediction of 50 points [10*(5-0)]
    • It also says that if you’re more than five points off the actual margin of victory (or 28 electoral votes for the EC MoV) you will lose points for that prediction (the further you’re off, the more points you’ll lose).
    • All MoV prediction points are summed, giving you the MoV Subtotal
  • Final points = 4*MoV Subtotal + Winprob Subtotal
    • Yes, the overall score is weighted more heavily to MoV predictions. This is deliberate to avoid people trying to win the contest by spamming 99s and 1s for winprob predictions and hoping to get lucky.
    • Whoever has the most points wins.
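Here is a short sketch of the scoring rules above in code. The function names are mine, and the MoV numbers in the example (a D+8 call against a D+4.5 result, a 100 electoral vote call against a 74 vote result) are hypothetical; the winprob example is the 85%/40% one from the rules.

```python
EC_ADJUSTMENT = 0.18  # adjustment factor for the Electoral College MoV

def winprob_points(predicted_pct, dem_won):
    """Gain the predicted probability if the Dem wins, lose it otherwise."""
    return predicted_pct if dem_won else -predicted_pct

def mov_points(predicted, actual, a=1.0):
    """points = 10 * (5 - a * abs(actual - prediction)); at most 50 per call."""
    return 10 * (5 - a * abs(actual - predicted))

def final_points(winprob_subtotal, mov_subtotal):
    """MoV points carry four times the weight of winprob points."""
    return 4 * mov_subtotal + winprob_subtotal

# Winprob example from the rules: 85% on the presidency (Dem wins),
# 40% on Texas (Dem loses).
winprob_subtotal = winprob_points(85, True) + winprob_points(40, False)

# Hypothetical MoV calls: 3.5 points off in a state, 26 electoral votes off.
mov_subtotal = mov_points(8.0, 4.5) + mov_points(100, 74, a=EC_ADJUSTMENT)

print("Winprob Subtotal:", winprob_subtotal)
print("MoV Subtotal:", round(mov_subtotal, 1))
print("Final points:", round(final_points(winprob_subtotal, mov_subtotal), 1))
```

A miss of exactly five points (or about 28 electoral votes) scores zero, and anything wider goes negative, which matches the break-even described above.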


To be eligible for the prize, you must submit all predictions as well as your twitter @handle via this form (same as above). One entry per contest per entrant. Your twitter account will need to have been created prior to July 1, 2020. If I receive multiple entries from the same entrant, I will reach out to that entrant to find out which entry is theirs; otherwise the first entry made by an entrant will be the one that counts. If you find an entry on the list in your name and did not make an entry, please contact me and I will remove your name from that entry.

There is no fee for entry and you can enter anonymously (your prediction will still appear, but with no name attached). Anonymous entries will not be prize-eligible.

State of the General Election: June 2020

Where things are

Joe Biden holds a commanding lead. It’s not inconceivable that it will fade nor is it impossible that Trump can win. But if the election were held tomorrow Joe Biden would win in an utter rout with Trump defeats in IA, TX, and GA quite plausible if not probable. Joe winning with 400 electoral votes would be a real possibility. Hell, there was a (completely trash) poll out of Arkansas today with Joe trailing Trump by only two points with a 45% share of the electorate (the AR market hit 13c for Dems at one point today, lol).

RCP has Joe up by over 8 points, and with an impressive 49.8% of the national electorate (Clinton was up by 5-6 at this point in 2016 with a 44% share – she would peak post-primary in RCP at 7.9% and 47% share in mid-August; her highest share would be 49% shortly after Access Hollywood). While we wait for Nate Silver to release 538’s official average (soon, apparently), I’ve constructed a simple polling average of all unique polls from the past two weeks from their expertly-maintained database: Joe is up by 8.4 with a 49.4% share while Trump’s share which was in the 43s pre-corona is now mired in the 40s and 41s and sliding further recently.
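For anyone who wants to roll their own, a simple average like this can be sketched in a few lines. Everything below is illustrative: the polls are made up, and reading "unique" as "keep only each pollster's most recent poll in the window" is my assumption for this sketch, not a description of how the numbers above were computed.

```python
from datetime import date, timedelta
from statistics import mean

def simple_average(polls, as_of, window_days=14):
    """Unweighted average of each pollster's latest poll inside the window."""
    cutoff = as_of - timedelta(days=window_days)
    latest = {}
    for poll in polls:
        if cutoff <= poll["end"] <= as_of:
            kept = latest.get(poll["pollster"])
            if kept is None or poll["end"] > kept["end"]:
                latest[poll["pollster"]] = poll
    biden = mean(p["biden"] for p in latest.values())
    trump = mean(p["trump"] for p in latest.values())
    return {"biden": biden, "trump": trump, "margin": biden - trump}

# Hypothetical polls, not real data.
polls = [
    {"pollster": "Pollster A", "end": date(2020, 6, 10), "biden": 50.0, "trump": 41.0},
    {"pollster": "Pollster A", "end": date(2020, 6, 17), "biden": 49.0, "trump": 42.0},
    {"pollster": "Pollster B", "end": date(2020, 6, 15), "biden": 51.0, "trump": 40.0},
    {"pollster": "Pollster C", "end": date(2020, 5, 20), "biden": 47.0, "trump": 44.0},
]

average = simple_average(polls, as_of=date(2020, 6, 20))
print(average)
```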

Trump’s share of the electorate slipped during the peak of corona, and it has further eroded during the protests. Biden’s share has waxed and waned and is currently dangerously close to exceeding the 50% line.

Where things are going – short term

Despite Joe’s commanding lead and most modelers having him 70-80% to win in November (I price Joe at 71c), the big markets are reluctant to go that far. We’ve drifted from 50/50 to 57/43 before running into some more Trump demand or reluctance to push further. There are a couple reasons for this: there is a widespread perception that “the polls” were wrong in 2016 (they were, on the whole, very good with the notable exception of those that didn’t weight by education in the upper midwest); and there’s the sense that the recent slate of big Joe polls is only so big because of the events they’ve coincided with. Once nationwide protests fade, or the virus does, or the news cycle changes as it has in the past, so too should Joe’s lead shrink. So why buy in now if he might be cheaper later?

I don’t entirely disagree with this line of thinking; in general you make more money walking with the herd of brain-dead zombies than fighting them and sitting on underwater shares for months. But there are big problems here nonetheless worth mentioning:

(1) Joe is worth more than 58c right now even if he were leading by “only” 6 points – he’s probably not worth less until his lead falls below 4 points;

(2) There won’t be that many polls in the back half of June because so many pollsters went hard in early June, and early July will also likely be dry with the holiday, so people hoping for a quick Trump flip might be waiting for a bit;

(3) If we’re coming into August and convention season with Joe +6.5, he will likely be in the 60s even after factoring in MAGA zombie stupidity;

(4) There are other places where you might want to make this short term play like IA, OH, and TX where the bettors are much more aggressive for Biden than they are in the overall market – after all, Joe is probably losing these states if he’s +5-6 nationally and even the diehards know it.

Even with mean reversion in the polls, Trump faces an uphill struggle

His success in 2016 was rooted in many things ranging from the late October Comey revelations to macro trends in manufacturing to an ability to exploit racism to Hillary Clinton being an utterly reviled figure on the right (and not particularly loved by the left). So while it certainly came as a shock that he did win, it’s worth noting that he only eked it out.

What factors could play to his advantage now?

“Trump has barely gone negative on Joe, the polls will move as his campaign spends money nuking Joe over China.” Joe is not Hillary, nor will any attempt to “define” him make him Hillary. Trump will try, but will not succeed to the degree he did in 2016.

“Maybe the economy will rebound sharply, and voters will be sensitive to the first derivative.” Or maybe it won’t? Or maybe they won’t? Or maybe this whole line of reasoning is desperately speculative what-iffery until we actually see how the polls move?

“Polling error still exists; many polls are still not weighting by education and it’s probable the national popular vote – electoral college gap has grown. Trump need only be within 3-4 points to win. He could even win down 5.” This is somewhat true. He certainly is at least 50/50 for any election where he’s only down two points, for instance. But he has to get to that point, which means he’s gotta start polling around 45-47% share and he’s not shown the ability to stay at that range yet. It’s also worth pointing out here that Clinton’s pop vote advantage owed in significant part to the distribution of Hispanic voters in CA/TX/etc and that Joe is not doing as well as she did with these voters, despite being up 8 nationally.

“White people voted for Trump at least in part because of a fear of replacement by brown and Black communities and a perceived loss of status associated therewith. This hasn’t changed.” Yes. I agree that this is probably Trump’s best and only play down the stretch. It’ll be back to “caravans” and “thugs” and “The Snake” and other such simple plays to racism. It’s his most potent weapon and it’s where his mind goes naturally anyway.

The virus isn’t going away either

I wrote in early April:

The United States will fail to develop a testing/tracing program needed to contain the virus by the end of May.  The curve will bend, but new cases will continue throughout the summer.  Because the virus spread so readily and from asymptomatic or minimally symptomatic people, our testing regime will only catch cases that make it to the doctor’s office.  Even if we implemented extremely widespread testing, it won’t stop the virus.

And this isn’t too far off the mark. Our testing is quite a bit better now, but tracing lags far behind in most places. And there is no national strategy past “idk, everyone figure it out for yourself”.

Despite the persistence of the virus, quarantine fatigue has set in, coinciding with reopenings, mass protests, and a cultural reluctance among places that were relatively spared initially to adopt practices like mask-wearing. As a result, hospitalizations and cases are rising throughout the South and West (increased deaths will follow in a few weeks). If any of these states start seeing big numbers (>10k cases per day, >250 deaths per day), Trump is due for another poor spell of virus news. (Someone will get the virus at one of his rallies at some point too; the media is absolutely champing at the bit to trample him on this story).

So this is going to be a boring blowout election then?

Yeah, probably. Stuff can and will happen. I’d be mildly surprised if we didn’t have at least one period of panic over Joe trailing Trump in polls or something (I’m penciling in early September, immediately post-RNC, for one likely time point. There will probably be a bit of hand-wringing before the first debate as well). Joe could get the virus (he’s not having people he meets with tested), in which case there will be utter pandemonium in the markets. So could Trump, if someone’s cough at one of his indoor rallies finds the perfect air circulation pattern. Maybe someone will figure out why Trump was spirited away to Walter Reed last fall, or maybe Joe will suffer a health scare like Hillary’s pneumonia. Given major shifts to vote-by-mail in the fall and the virus, perhaps the entire election itself will become a legal clusterfuck.

But given what we know about the race right now, our strong Bayesian prior has to be that despite the coming ups and downs, Joe is a favorite to beat Trump, and that a Joe blowout is just as if not more likely than a Trump win.

Preparing for the General Election

Somewhere out there amidst the slow-motion quasipocalypse around us exists the beginnings of a general election campaign between (as yet presumptive) Democratic nominee Joe Biden and the “I’m not a doctor” incumbent President Trump.  What’s going to happen?  Who knows!  But as I’ve said before (most recently on a panel at a political prediction conference that Flip Pidot put together), I tend to trade these things by reacting to the present and making plans for how possible future events could influence market prices.  For example, in the short term, I’m wondering if the next set of Trump v Biden polls will continue to push Joe higher in the various state markets (and the winner market).

But for now let’s set all the intervening drama between now and November 3 aside and skip ahead to the fun part: election night.

The Motivation

I kind of suck at elections.  There, I said it.  Not to say that I lose money on them, just that I frequently don’t make as much as I should, and it’s easily the softest part of my game.  I have a good overall sense of political geography, but not the fine-grained understanding where I know basically what every county in a given state is doing.  And on election nights, it REALLY helps to know that stuff.  If you’re playing a turnout or margin of victory market (or even a winner market when it’s close), you’ve got to be able to estimate quickly and reliably how much vote is remaining and how that vote might fall.  This requires work, and often for big multi-contest elections I never do enough of it and end up just sort of clicking buttons.  This post is hopefully the first in a short series detailing what steps I’m taking to shore up my play in this arena.

The Goal

Wouldn’t it be great to be able to read the results coming in from 7:30 – 8:30 on election night and be able to predict how states that report later will break?  Is it possible to take extremely early results from, say, random counties in Kentucky/Vermont/West Virginia/Indiana that report first and extrapolate something useful from them?  Basically, can I build a better, faster, and more profitable version of The Needle?

The Hypothesis

There’s a lot that goes into live-forecasting an election.  For one, you need to be able to treat absentee/early vote differently from election day vote.  You also need to know how homogeneous the precincts are in each county.  You need a sound method for modeling uncertainty around your prediction.  You need to have scrapers working for all the various results providers (NYT/AP, DDHQ, and CNN/Edison) and a way to select the freshest results among them.  But after all that, at a basic level you need some way to model how one county’s reported results influence what another county might report.  Is this possible?  In order for it to work, my hypothesis is that counties that are similar to each other demographically, by size, and in terms of past electoral history ought to continue to behave like each other in elections going forward.

Exploring the data

So far, I’ve not systematically approached this idea – my goal in the early going is to build a rough spreadsheet that I can use to check my ideas before more rigorously testing them.  (And if it doesn’t work at all, at least I learn some more political geography on the way).  What I’ve done is to gather data on each county in the country (demographic and political), then simply do some linear regression on a bunch of variables to see if I can’t find, for any given county, the most similar counties to it in the nation.  This is what that looks like:

Juneau County Wisconsin part one.png

Juneau County Wisconsin part two.png

The top row of the spreadsheet shows where I’ve typed in a target county: Juneau County, Wisconsin.  The rest of the spreadsheet then does some vlookup() magic to pull out all the counties that are closest to Juneau County across several dimensions.  Here, these are shifts in margin and turnout from 2012-2008, 2008-2004, 2004-2000; total dem share and total turnout in 2012, 2008, 2004, and 2000 (I use log(turnout)); and racial demographics (fraction white/black/native american/asian/other/latino).  In this example, I am NOT using any information about 2016 – yet you can see that the average of the ten closest counties to Juneau County ends up doing a pretty good job predicting the swing in Juneau from 2012 to 2016.  (Note I’m also not doing any weighting on physical distance of counties, yet this emerges naturally from the data – so far turning off distance as a factor doesn’t seem to make a huge difference but I’ll have to explore it further).
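For readers who would rather see the idea in code than in a spreadsheet, here is a minimal sketch of a closest-counties lookup. It is not the spreadsheet's actual method: I'm assuming z-scored features and plain Euclidean distance, and the county names, feature values, and swings are all hypothetical placeholders.

```python
from math import sqrt
from statistics import mean, pstdev

def standardize(rows):
    """Z-score each feature column so no single variable dominates distance."""
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sds = [pstdev(col) or 1.0 for col in columns]  # guard against zero spread
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]

def nearest_counties(target, features, k=10):
    """Names of the k counties closest to the target in feature space."""
    names = list(features)
    z = dict(zip(names, standardize([features[n] for n in names])))
    dist = {n: sqrt(sum((a - b) ** 2 for a, b in zip(z[n], z[target])))
            for n in names if n != target}
    return sorted(dist, key=dist.get)[:k]

def predicted_swing(target, features, swings, k=10):
    """Average the later swing of the target's k nearest neighbors."""
    return mean(swings[n] for n in nearest_counties(target, features, k))

# Hypothetical counties: [dem share, log(turnout), fraction white], pre-2016 only.
features = {
    "target":  [0.52, 4.0, 0.95],
    "rural_1": [0.51, 4.1, 0.94],
    "rural_2": [0.53, 3.9, 0.96],
    "urban_1": [0.70, 5.6, 0.55],
    "urban_2": [0.68, 5.5, 0.60],
}
swings = {"rural_1": -0.20, "rural_2": -0.24, "urban_1": 0.02, "urban_2": 0.01}

print(nearest_counties("target", features, k=2))
print(round(predicted_swing("target", features, swings, k=2), 2))
```

Adding per-feature weights inside the distance sum would be the natural place to experiment with the different weighting schemes.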

Here are some other sensible groupings it makes:

Hamilton County Ohio.png
Hamilton County, Ohio is home to Cincinnati. The spreadsheet finds Jefferson County, KY as its closest relative (home to Louisville, about 100 miles from Cincinnati) along with a lot of other midwestern/southern cities, interesting as Cinci is sort of half-midwestern and half-southern.


Buffalo County SD.png
Buffalo County, SD, is a Native American majority county, and the spreadsheet correctly pulls out a bunch of other Native American counties as a result (one thing I didn’t know about 2016 was how much worse Hillary did in these counties compared to Obama).

Cobb County Ga.png
Without consideration of 2016, the spreadsheet struggles to find the closest counties to this growing suburban Atlanta county.  Still, not terrible.

Pasco County FL.png
The 2016 shift in Pasco, FL isn’t entirely predicted by the 2016 shift in its nearest neighbors determined by data from 2012 and earlier. (a red county that went hard red in 2016 – I’ll always remember it because Steve Schale tweeted something about how poorly things were going there and it was the first time I realized I was going to lose a lot of money on Hillary Clinton).

Sometimes it doesn’t work

As you can see with Cobb (GA) and Pasco (FL), the most similar counties based on info from 2012 and earlier don’t necessarily shift to the same extent as the target county.  This is particularly true for bigger Obama-Trump counties in the upper midwest (smaller ones, like Juneau (WI), are predicted well as shown above).  Here, for instance, is Macomb County, MI.  First are shown the most similar counties excluding all 2016 information, and next what the most similar counties are with 2016 included.

Macomb County MI 2012.png

Macomb County MI 2016.png
Obviously if you include 2016 numbers you get a much better neighborhood of similar counties.

So what’s next

Well, it’s cool that it sort of kind of works.  But it also fails to pick up on the magnitude of some of the 2012-2016 swing for both Obama-Trump midwestern counties and, to some extent, for suburban counties everywhere.  I can try adding additional weighting schemes, or maybe simply adding midterm data (though if I’m doing national correlations I’d rather not have different inputs in different states).  Perhaps there’s some additional category I can consider weighting on (education comes to mind – maybe I can find some measure of urban/suburban/exurban/rural?).

Finally, finding the closest counties is just a test of the overall approach.  In principle, counties that are quite distant from a given county on any of these metrics could still provide useful input.  We shall see!

Corona Predictions

First, some disclaimers:

I’m definitely not an epidemiologist.  In fact, I’m in that dangerous category where I know just enough about a subject to get me in trouble (if you’d like to read an actual epidemiologist’s take, here’s Lipsitch et al on the topic of what comes next).  I’m also partial to doomerism even in the good times, and the three weeks (years?) of social distancing certainly haven’t ameliorated that tendency.  While I’m in the business of making predictions (literally) that doesn’t necessarily mean that I’m good at it or that the skill in one domain (politics) transfers to others.  But, you know, screw it.  Here are some bleak predictions and let’s hope I’m wrong:

The outbreak in general:

1. The United States will fail to develop a testing/tracing program needed to contain the virus by the end of May.  The curve will bend, but new cases will continue throughout the summer.  Because the virus spreads so readily and from asymptomatic or minimally symptomatic people, our testing regime will only catch cases that make it to the doctor’s office.  Even if we implement extremely widespread testing, it won’t stop the virus. Look at South Korea with their top-line testing program – the outbreak is still trickling along.  Look at Singapore, where the government just announced another lockdown because despite their best-in-the-world contact tracing program they were unable to identify the source of half their new infections.  Look at Hong Kong, with their efficient use of centralized quarantine that will, if ever, only be haphazardly implemented here and would probably raise constitutional issues.  Look at China, where extremely severe measures were needed to contain an outbreak on the scale of New York (that New York is not taking) and where still life has not returned to normal as dozens of cases continue to sporadically emerge (both imported and community spread).  You want to tell me America is going to do better?  Nope.

2. Some cities that are spared in the initial wave in the United States will become hot-spots later.  Rural and small-town communities will see periodic outbreaks that may briefly overwhelm local hospitals (as happened in Albany GA).  Look at the incredibly diffuse spread of the virus in the United States.  It is not going away until half of us have been infected or a vaccine emerges a year from now.

3. By the end of August, there will be estimates based on serological surveys that up to 30 million Americans have been infected.  This is not nearly enough to confer herd immunity and we will still be quite vulnerable to a second wave.

4. There will be a second wave in which peak deaths/day nationwide exceed 250. It may not be as severe as the first as metro areas institute lockdowns and closures earlier.  Though it’s worth noting the second wave in 1918 hit much harder than the first.

5. In fact, some cities may see multiple waves.

6. The timing of the second wave will depend on the seasonality of the virus and how much restrictions are relaxed and how much better we get at contact tracing, but if it’s anything like 1918, the peak hits literally on election day.  This will cause chaos.

7. Alternatively, it could be barely beginning in late October/November.  Everyone will go to the polls nonetheless, which will accelerate its spread, leading effectively to a nation on lockdown again and a canceled Thanksgiving and Christmas.

8. You will continue to see examples of people violating social distancing norms in various places.  People will begin to turn on each other over this.  Others will get tired of the distancing.  Regardless, our uneven compliance will only be enough to slow the spread but never enough to stop it.  Remember too that many people still need to work and are still using mass transit every day, etc.  The poorest neighborhoods in NYC are the hardest hit so far.  Continued iterations of lockdown and relaxation will persist throughout the year.

9. Sports ain’t happening.  Sorry, sports bettors.  No NBA.  No MLB.  And yes, the NFL is going to see a much shortened season due to the fall wave.  Quite possible there will not be an NFL season at all.

10. People will not want to fly any time soon.  Airlines will require another bailout (and sooner rather than later).  This will be controversial and will collide with the political season.  A major US carrier may go bankrupt.

11. Greenhouse gas emissions will decline.

12. The iterative stop-and-start of social distancing will deepen the severity of the global recession (and prolong it).  Additional stimulus will be required.  At least one more package will be passed.  It too will be inadequate.  Suffering, in the United States, will be far greater than it needs to be as a result.

13. By the end of the pandemic in the second half of 2021, nearly all Americans will know of a friend, co-worker, or family member who was infected, and at least half will know someone who died.


14. In the short term, Trump will continue to oscillate between desperately wanting this to be over quickly and sullenly yielding to the medical professionals who tell him that it will not be.  Expect more of “it would have been a lot worse if I weren’t so amazing” mixed with “we can’t let the cure be worse than the disease” + “hydroxychloroquine will be the miracle drug that saves us all”.  Trump of course will not take action to open things up prematurely – he will instead pass that off to the governors (why take responsibility, ever?).

15. By the end of May, preliminary results from randomized control trials of hydroxychloroquine will show no improvement over standard of care on most measures. Nonetheless, a conservative blog will misleadingly write up this study by saying something like “70% of patients taking hydroxychloroquine showed improvement!” (not mentioning the equal success of standard of care alone).  Trump will happily retweet this.  For him, hydroxychloroquine is a win/neutral play.  Either something comes out of it that he can spin into a giant “I told you so and the media was wrong” or nothing comes out of it and he ignores it and Fox News pretends it never happened.

16. Trump will float the idea of having a big Fourth of July parade / party / celebration (of him) again this year.  It won’t end up happening.

17. Through the end of August, the CDC will not relax its guidance that large gatherings are to be avoided.  The political conventions, if they take place, will lack all energy and will look very different (normal people will not care).

18. If the conventions happen, expect them to prominently feature doctors, nurses, and EMTs.  I actually think the Trump campaign could be very effective here.

19. There will be no rallies, town halls, or normal campaigning.  What’s happening in the primary race right now (and for campaigns across the country) will not stop because the virus will not stop.

20. Trump and Biden will not shake hands at the debates.

21. Trump will be very upset that the virus is depriving him of that which he’s been looking forward to most as president: Running for re-election and speaking before large crowds.  Don’t think this will be to his detriment however.  His narcissism will not let him avoid the spotlight.  He will call into every TV show every day if that’s what it takes to get his fix.  He will be better at this than Joe.

22. Don’t make the mistake of assuming that Trump will be blamed for the poor economy – he won’t be.  If he takes on water, it will be because people discover how poorly his administration handled the initial outbreak – or future missteps.

23. It’s possible (but not yet probable) that a member of Congress or candidate for Congressional office dies of COVID.

24. Coronavirus will have both an enormous and a small political impact.  It’s all everyone will talk about during the campaign but at the end of the day it will only act to bring into relief all the pre-existing macro political trends.  City-dwellers rich and poor will hate Trump for messing up the initial response.  Suburban voters will find it another example of why they don’t like Trump.  Rural voters will wonder whether it had to be as big of a deal as it was.  White non-college voters who lose their jobs are an interesting group though I suspect they’ll continue trending Trumpward regardless.

25. Coronavirus will definitely, however, make the race more a referendum on Trump than a contrast election (though it was always going to be that way).  In fact, the small impact it does have in moving the electorate (juicing suburban swing?) could be enough that Joe simply wins a landslide.

How all of this is wrong

The bullish case almost certainly involves the emergence of a truly effective treatment that shortens hospitalizations and dramatically reduces the death rate.  Rapid testing could be deployed such that all health care workers and visitors to nursing homes / prisons / hospitals are tested before entry in order to protect the vulnerable even if the virus continues to spread slowly elsewhere.  There are others that have written big plans about re-opening the economy pre-vaccine and so forth that we can look to, and they all rely on the technological or pharmacological dei ex machina that could certainly happen.  But they haven’t happened yet.


Beware the Nevada Market

Since I’ve committed myself to pricing out each state this primary season, I have about six days left to figure out what I think is going to happen in Nevada.  And yet I do not know what is going to happen in Nevada.  And I particularly do not think the market fully understands itself either.  Here are some things that make me uneasy:

No polls

We’ve had, like, one.  I don’t know how many more we’ll get or how high-quality they’ll be.  The state is apparently quite expensive to poll, and with South Carolina and Super Tuesday around the corner, some pollsters may be keeping their powder dry.  The overall situation leaves the political bettor trying to impute Nevada trends from national polls or other random crappy state polls and it just feels thin.

There’s a debate

Even if we had polls, there’s a debate three days ahead of time.  A ton of New Hampshire voters decided late, and broke for Amy and Pete after the debate.  I’m not sure there’s a similar dynamic about to play out in Nevada, but I’m not really sure there won’t be either.  And I am really skeptical we’ll get a last minute poll measuring any such debate movement (as we did for New Hampshire).

Bernie v Joe?

I was sold on a Bernie v Joe fight in Iowa and didn’t really get one.  Is this the time?  Or is Pete going to out-organize and out-consolidate the rural white vote, resulting in Bernie coasting due to a moderate split?  Does Bernie win the Hispanic vote and, if so, by how much?  Does Warren play spoiler at all in any precincts, or does she miss threshold (to Bernie’s benefit)?

It’s a caucus

They’re using a Google Forms system this time!  Which should be better, except that maybe it won’t be because some people don’t know how to use an iPad.  That said,  I’m skeptical we get another true debacle but obviously the possibility cannot be foreclosed on.  Remember precinct leaders, for tiebreaks we don’t flip coins, we draw cards!

So Bernie’s worth 80c then?

I cannot imagine paying 80c for anyone to win six days out given the uncertainty at play here.  That said, someone has to do it, and I can’t really argue that Bernie isn’t the favorite.  Although, when you really think about it, what is winning?  How do we even define a win?  Who won Iowa again?  Which leads me to my final point:

Remember the first rule of PredictIt

Read the rules, kids.

New Hampshire and Betting Strategies

Live free or die, traders.

It’s probably Bernie

Unlike Iowa, where multiple candidates all had plausible paths to victory, in New Hampshire things seem relatively straightforward.  Bernie Sanders will win.  Unless there’s a polling error underestimating the turnout of moderate voters, who coalesce behind Pete to put him over the top, or maybe Amy Klobuchar once in a blue moon.  I don’t see tremendous herding in the polls (maybe some), and I don’t really respect a 5-point polling error as much as the models and markets seem to (but that may just be me).  So here are my prices alongside 538’s and PI’s:

NH prices.png
538/PI prices as of 9:00am on 2/10.  For 538 I’m using probability of winning the highest share of the vote (unlike highest number of pledged delegates which I used for Iowa) in order to match what the PI market is measuring.

You can see above that I’ve gone ahead and resolved the Iowa market for Pete for my scorekeeping (which I can fix), because it seems likely that barring a full recount (as of this writing, unclear this is going to happen), Pete will take the edge in both SDE (what the PI market resolves on) and pledged delegates (what 538 was forecasting).  So who’s doing the best so far?

538 > jipkin > PI (so far*)

(*assuming Pete holds the pledged delegate/SDE lead in Iowa.)

Yep, shorting Bernie ended up (pending) being the correct play!

post IA jipkin538PI kellys and briers.png

Both 538 and I agreed that Bernie was overpriced, but I was a bit more bullish.  So on average, 538’s Brier score is a little bit lower than mine.  Because Briers are squared error, they tend to compress differences between forecasts.  You can see the difference in performance a bit more clearly by looking at how much money you’d have made playing a buy-and-hold strategy based on applying the Kelly criterion using the odds provided by either me or 538.  Using this strategy, you would have bet YES on Joe, Pete, and Liz and NO on Bernie (no matter which forecast you chose), and net come out ahead either $917.17 or $723.56.  (And again, the Iowa market is not done yet – this could all flip if Bernie requests and wins a recount).
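If you want to play along at home, here is a rough Python sketch of the two yardsticks above: the Brier score, and Kelly sizing for a binary contract that pays $1. The forecast probability in the example is hypothetical, and the sketch ignores PredictIt's fees and position limits, so don't expect it to reproduce the dollar figures in the table.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

def kelly_bet(p, price):
    """Kelly side and bankroll fraction for a $1 binary contract, ignoring fees.

    p     -- your probability that the contract resolves YES
    price -- market price of YES, strictly between 0 and 1
    """
    if p > price:
        # YES costs `price` and pays 1: f = (p - price) / (1 - price)
        return "YES", (p - price) / (1 - price)
    if p < price:
        # NO costs (1 - price) and pays 1: f = (price - p) / price
        return "NO", (price - p) / price
    return "NONE", 0.0

# Hypothetical: the market has a candidate at 67c, your forecast says 40%.
side, fraction = kelly_bet(0.40, 0.67)

# If that candidate then loses, a 40% forecast scores 0.16 on this contract
# and a 30% forecast scores 0.09: a small gap for a sizable disagreement.
score_40 = brier_score([0.40], [0])
score_30 = brier_score([0.30], [0])
```

Kelly sizing scales with how far your probability sits from the market price, which is why it separates two forecasts more visibly than their Brier scores do.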

Other Betting Strategies…

Of course, playing a pure Kelly-based buy-and-hold strategy on 538’s forecast or my own isn’t the only way to go.  For fun, I’ve added a bunch of other betting strategies to compare (all assume you have $850 to spend per contract):

Betting strategy overview post iowa.png

Bet the favorite – Just buy whatever’s over 50c on the theory that PI’s prices are always too cautious.  If Bernie is 67c, you pay 67c.  If Joe is 25c, you pay 75c for some NO.  Did not, uh, work out too well in Iowa.

“Safe Money” – What happens if you only bet against the things the market says are probably not going to happen, i.e. buy NO whenever it’s priced at 90c or more?  You end up losing a lot of money when Pete wins Iowa is what happens.

BLIND DEGEN – What happens if you max literally anything under 10c regardless of its true value?  Congrats on the $6k my friend.

“VALUE” DEGEN – What if you’re into degenerate bets, but temper them by only placing them on sub-10c contracts which 538 says are worth at least 6c more?  Easy $7.7k, shipped.

“The Market is Wrong” – Ah, you’re a contrarian I see.  Well fine, go ahead and literally take the opposite side of the market in every contract.  If Bernie is 67c, we pay 33c for NO.  If Joe is 25c, we buy Joe at 25c.  And if Pete is 9c, we buy Pete at 9c and hey would you look at that there’s $6.7k.

Inverse the Live Ones – A sophisticated contrarian!  For the trader who’s too wise to bet against everything, because they don’t want to lose money betting on trash.  This trader only bets against any contract that’s between 10c and 90c.  And does ok here, losing a max on Joe but winning more than that on Bernie NO.

jipkin’s Actual Profit – Finally, there’s the money I actually made playing the way I play.  Which, honestly, happens to be my second-best PI market of all time on account of all the craziness that’s happened (currently up $4.3k).  I’m curious to see how well I do versus these other strategies throughout the entire primary season!
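To make the mechanics of a few of these buy-and-hold strategies concrete, here is a small Python sketch. It reuses the example prices from the descriptions above (Bernie 67c, Joe 25c, Pete 9c), assumes $850 spent per contract, and ignores fees. The real market had more contracts and different fills, so the totals will not match the table.

```python
BUDGET = 850.0  # dollars spent per contract

def profit(side, yes_price, resolved_yes, budget=BUDGET):
    """Profit from putting `budget` on one side of a $1 contract, ignoring fees."""
    cost = yes_price if side == "YES" else 1 - yes_price
    shares = budget / cost
    wins = resolved_yes if side == "YES" else not resolved_yes
    return shares - budget if wins else -budget

def bet_the_favorite(yes_price):
    """Buy whichever side is priced over 50c."""
    return "YES" if yes_price > 0.5 else "NO"

def market_is_wrong(yes_price):
    """Take the opposite side from the favorite in every contract."""
    return "NO" if yes_price > 0.5 else "YES"

def blind_degen(yes_price):
    """Buy any side priced under 10c; otherwise sit the contract out."""
    if yes_price < 0.10:
        return "YES"
    if 1 - yes_price < 0.10:
        return "NO"
    return None

def run(strategy, prices, winner):
    """Total profit across all contracts for one strategy."""
    total = 0.0
    for name, yes_price in prices.items():
        side = strategy(yes_price)
        if side is not None:
            total += profit(side, yes_price, name == winner)
    return total

# Example prices taken from the strategy descriptions, not a full market snapshot.
prices = {"Bernie": 0.67, "Joe": 0.25, "Pete": 0.09}
totals = {
    "favorite": run(bet_the_favorite, prices, "Pete"),
    "market_is_wrong": run(market_is_wrong, prices, "Pete"),
    "blind_degen": run(blind_degen, prices, "Pete"),
}
```

With a longshot winning, the favorite strategy ends up in the red while both contrarian strategies cash the 9c contract, which is the same qualitative pattern as in the table.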

What I did right and wrong in IA

Even though I made quite a bit of money in the Iowa market ($3k over the week), I’m more or less feeling “meh” about it.  There were numerous spots where I made subpar decisions, and I think realistically I should be up around $12k right now if I’d been sharper.  Here’s a non-exhaustive list of what I think I did relatively poorly:

  • I was correct about when the site would lag and trading would become unbearably slow, and playing lightly throughout the day on Monday was correct.  However, this meant I was basically sitting doing nothing all night when the results never came…
  • I should have bet against Joe Biden hard on the first satellite results. Even though at the time I knew they didn’t mean that much, I also knew they would move the markets, which they did.
  • I didn’t have the fastest results site for the initial few results we did have, and thus missed the beginnings of the Pete boom.  There was still time to hop aboard and I should have done so at least gingerly.
  • Later in the night, when it was coming down to competing data from Pete and Bernie, it was clear that Pete probably had the edge in SDE and while I did bet his numbers, I should have bet harder (the site performance didn’t help).
  • I should have gone into the first results update holding more than like 80 shares of Pete.  Probably should have held 2k shares total at least… it was clearly mispriced at that point (70/30 for Bernie) but unfortunately my loss aversion was too strong to make the obvious bet.  I reasoned that with how much volume was in the order book I’d be able to scoop shares after the update, so there was little point in buying ahead of time.  However, this did not account for site performance basically preventing that play and for the fact that the IDP site didn’t update in my region until like 10 minutes after it updated in other regions.
  • On the false IDP update where they gave delegates to Patrick and Steyer instead of Bernie and Warren, I should have noticed this long before anyone on twitter did.  I made money on this flip, but it should have been insanely more (could have been nearly doubled at 6-7c instead of 2k 8c shares).
  • I did catch the Bernie 30->75c flip, but should have bought for the bounce back to 60c and then back up to 75c.  Was not trading in the zone here!
  • Later in the week, I should have been flipping a bit more at a time and setting more aggressive price targets.

If you’d like to review my trades, you can find them here!  Here’s what the realized profit over time looks like:

IA cumulative profit over time.png

The Usual Disclaimer

Please do your own research before trading.  I obviously have positions in the markets I discuss here and you should therefore consider everything I write to be fundamentally biased.  None of what I write should be considered specific financial advice.

Jumbled Thoughts on a Jumbled Mess

At some point on Thursday night, February 6th, I was actually so exhausted from trading all week that for the first time in my life I simply did not want to trade anymore.  It was that kind of a week.  My brain is still an addled mess, as, it seems, the race for the Democratic nomination is as well.  I’ll give my thoughts on NH specifically tomorrow, but for now here is what I can make of the bigger picture:

(Note of caution: literally every take I wrote in this post from November turned out to be wrong, so reader beware).


PI state of the race.png
PredictIt’s handy map of market prices.

Say it with me folks!  President.  Bernie.  Sanders.

Okay, I don’t really buy this map.  I mean, he’s not sweeping the entire south, right?  (And we’ll get to Bloomberg in a bit).  But here we are.

The Bernie Domino Theory was that Bernie would win IA, NH, and NV, putting so much pressure on the others that he might even win SC or (at a minimum) sweep the big prizes on Super Tuesday.  I never put much stock in it, but I missed the scenario where Bernie winning the first three also coincided with Joe Biden failing miserably, and that is seeming more and more like the world in which we live.  If it’s a weak Joe, Bloomberg, Pete, and Klobuchar sticking around through Super Tuesday and mucking up the moderate lane, I don’t see how Bernie doesn’t cruise this thing (individual states will be another matter).

Joe Needs Nevada

At least he needs second place there.  That said, I do wonder how much fun would be had in the markets if he goes 4th -> 4th -> 4th -> and then 1st in South Carolina.  But if he doesn’t show any sign of strength by Super Tuesday, he will be gone shortly thereafter.  Speaking of which,

Where will Joe’s support among black Americans migrate if he collapses?

Young black voters have shown plenty of willingness to support Bernie or Liz, but the older, more moderate core of black voters have simply not abandoned Joe at all.  The biggest question hanging over the race right now is whether or not this support starts to migrate elsewhere, and to whom if so.  The betting markets right now seem to think Mayor Stop-n-Frisk will get his fair chunk (and I don’t entirely disagree, he did win substantial black support in his mayoral runs, and these voters may argue to themselves that he’s the best shot at beating Trump).  But I wouldn’t be surprised if a meaningful slice went to Bernie as well.

Liz running out of room

What would happen if Bernie and Pete tied for first in Iowa while Liz and Joe were a distant third and fourth?  Here’s what I wrote in December:

If A/B is Bernie/Pete and C/D is Joe/Liz – wew.  Liz is very close to done for and Joe is in trouble but not out of it – turbulent waters for him in Southern state markets.  Bernie would see 40c in the overall market in this world.

And yeah.  She’s just boxed in here.  Bernie is too strong for her to eat into his supporters.  Pete is too strong for the moderate educated whites to give her enough support in NH (she’s also losing some of that “wine track” support to Klobuchar now).  She may come out of NH and IA with “respectable thirds” which don’t matter for much.  Maybe she even organizes to another third in NV.  But at some point the money is going to dry up and you have to ask whether it’s worth the embarrassment of losing your home state on Super Tuesday, or whether you simply drop and endorse Bernie to put him over the top.

For fun, here’s what I wrote about Liz way back in December of 2018:

Her path – Put out lots of policy proposals thinking they’ll matter and drop out on March 11.


Ok.  I mean…. ok.  That’s how I feel about this.  Sure.  Why not.  Obviously, no one knows if you can actually just buy the nomination but Bloomberg is certainly giving it a go.  In addition to spamming a bajillion ads, he’s also been very strategically accumulating superdelegate endorsements from House members whose campaigns he boosted in 2018.  His strategy is basically correct – he’s playing the game as he should given who he is and his resources.

But like, really?  The Democrats just pick the billionaire?  It’s hard for me to see him as doing anything other than winning like a handful of delegates on Super Tuesday, “suspending” his campaign so he doesn’t have to reveal his financials, then playing kingmaker or hoping to win a contested convention.


Pete is not dead, apparently:

He hasn’t and will not catch on with black voters.  He’s fading nationally (about to be surpassed by Bloomberg).  He retains some equity in IA, but not enough.  To me it feels like the tide is turning against the former mayor of South Bend, Indiana.  That said, he and Joe have the run of Iowa with two weeks to go due to the impeachment trial.  Maybe he pulls something out, but I’ve started to convince myself that the undecideds simply aren’t going to break his way.  I think he drops after NH, and now that I’ve committed to that publicly, he’ll probably go on to win the nomination or something.

Welp, I guess he’s going to go on to win the nomination now!  (Seriously though, what on earth is his path beyond hoping for a contested convention?  Would be quite hilarious if he wins NH, I suppose).

Game Time

Welp, it’s today!

Last week, I promised one final update on 538 vs PI, along with my own prices.  Here they are:

Iowa Final Prices.png
All numbers based on odds to win the most delegates, per PI rules.  “Final” means as of 11:59 on 2/2.  The PI lines will continue to move throughout the day – currently up to 71c for Bernie! (Will share the full spreadsheet later and make it prettier over time)

As you can see, I come down closer to 538 than PI.  I think PI is simply showing far too little respect for uncertainty in both polling and the inherently wonky nature of the caucuses themselves.  [That said, you’ll note I’ve conveniently priced myself between the two (just how it worked out, I promise) so I’m guaranteed to beat either 538 or PI.]

Betting Strategies

In addition to accuracy measures, one of the things I’ll be doing with these predictions is comparing how various betting strategies using them work out.  More on this in the aftermath, but basically I’ll be comparing using my numbers or 538’s to bet on PI, simply betting the favorite, always betting the underdog, going full degenerate, etc.  These will all be “buy and hold” strategies based on getting in at the last moment, and I’ll compare their results to my own actual play (decidedly NOT buy-and-hold).

My plan

Speaking of which, what is my betting strategy?  Well, as you’ve learned or will learn reading this blog, I’m relatively bad at predicting the future.  But I’m relatively good at reacting to the present.  So my strategy revolves around watching the results and making moves as I feel appropriate.  Some elections flip-flop around (the fun ones), some break hard early and never look back, etc.  Here’s what I’ll be looking for:

Who is winning in early rural county precincts?  This is where a Pete/Amy surprise would show up first – and a Joe loss.

Is Bernie winning the urban counties as expected, or is Warren close or leading?  This is where a potential Warren upset would start to brew.

Who is winning the satellite caucuses?  If it’s Joe by a mile and he’s in a tight race with Bernie for the precinct caucuses… then it’s Joe’s night.

How much is Bernie crushing the first alignment?  This is the “popular vote” and I think a valuable source of information on how much enthusiasm Bernie can really inspire from his supporters this cycle.

When will PredictIt crash? Be forewarned – the instant a big flip starts to happen (let’s say it looks like Joe is going to win), everyone will try to trade all at once not only in the Iowa market, but in 50-odd markets site-wide.  The website has never survived this in the past, and therefore I have to be pessimistic we’re fully operational throughout the duration tonight.  This is also why I plan to be holding virtually no sizable positions from 8:00pm eastern on in any of the live contracts.

Will there be money in the MoV market?  This is usually where the big money on election nights comes from.  There are some other potential sources tonight too – if Joe wins IA for instance, Bloomberg will die site-wide (and Bloomberg moons if Joe is a weak fourth).

Good luck!