Election pollsters put their methods to the test – and turnout is the key

Let’s find out. EPA/Andy Rain

Trust in election forecasting is probably as low as it has been since 1948, when political polling suffered possibly its worst ever humiliation. In that year’s US presidential election, the 8-1 underdog Harry Truman defied all predictions to defeat his Republican challenger, Thomas Dewey. “Dewey Defeats Truman” screamed the now-infamous newspaper headline printed before the actual votes were counted.

The methods used to forecast elections have multiplied since then – and grown more sophisticated. Besides opinion polls, forecasters use statistical models, some of which weight polls by past performance and allow for such things as late tactical vote switching or a late swing to the incumbent candidate or party. Over the past 20 years or so, betting markets and prediction markets have also grown increasingly popular as a way to capture the state of the race, as has the idea of combining forecasts derived from different methodologies into a single aggregated forecast.
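The idea of aggregation can be sketched in a few lines. This is purely illustrative, with hypothetical figures and a simple inverse-error weighting scheme – not the method any particular forecaster uses:

```python
# Illustrative sketch (hypothetical figures): combining forecasts from
# different methods into one aggregate, weighting each method by its
# past accuracy (lower historical error -> higher weight).

def combine(forecasts, past_errors):
    """Inverse-error weighted average of point forecasts."""
    weights = [1.0 / e for e in past_errors]
    return sum(f * w for f, w in zip(forecasts, weights)) / sum(weights)

# Forecast Conservative lead (points) from three hypothetical sources:
# a poll average, a statistical model, and a betting market.
forecasts   = [1.0, 2.0, 4.0]
past_errors = [3.0, 2.0, 1.5]   # mean absolute error at past elections

print(f"Aggregated forecast lead: {combine(forecasts, past_errors):.1f} points")
```

Here the betting market gets the largest weight because its assumed past error is smallest, so the aggregate sits closer to its forecast than a simple average would.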

Until 2015, the conventional wisdom was that all these techniques were – leaving aside rare exceptions, such as the UK’s 1992 general election – part of a golden age of election forecasting. The question wasn’t whether polls, prediction and betting markets, and forecasting models were good – but which was the best of a good bunch.

In attempting to answer this question, I was fortunate enough to gain access to data on every trade made in the two leading betting/prediction exchanges in the world over hundreds of separate state and national elections spanning a period of almost a decade.

By applying state-of-the-art econometric analysis to this data, my co-author James Reade and I were able to establish that the betting/prediction markets outperformed or matched other forecasting methodologies on key metrics. Still, polls turned out to be quite good as well, especially when known biases were eliminated from the raw published figures.

Missing the mark

Then there came the polling debacle that was the UK’s 2015 general election. From the dissolution of parliament on March 30 right up until the very eve of the election, held on May 7, scarcely any polls had the Conservative and Labour parties separated by more than three or four percentage points. The great majority had the parties within a point or two of each other, and in no consistent direction. The prospect of a hung parliament appeared very strong.

Sophisticated forecasting models based on the polls concurred, as did the betting markets – but in the end, the Conservatives came out more than six points ahead of Labour.

The poll that shocked a nation: election night, 2015. David Holt via Flickr, CC BY-SA

Numerous theories emerged to explain the discrepancy between the polls and the result, with much of the focus on who actually turned out to vote. The conventional wisdom today is that poll samples were unrepresentative – notably, the Labour-supporting younger people captured in the polls were far more likely to vote than younger Labour supporters in general.

As a result, turnout among Labour supporters was overestimated relative to that among Conservative supporters. “Lazy young Labour” voters, it seems, left the election in the hands of “motivated old Tories”.

Showing up

This issue of unexpected levels of turnout among certain demographics has since been used to explain the outcome of both the Brexit vote and the election of Donald Trump. In both cases, pollsters called it wrong by underestimating turnout among various groups – in particular less-educated voters, who usually turn out in relatively low numbers but who proved decisive in their support for Brexit and Trump. This surge of votes from those who usually don't vote was unexpected – and ultimately decisive.

So what about the present election? Have the lessons been learned? How much confidence can we have in the current forecasts?

Well turned-out. PA/Chris Radburn

This time, there’s noticeably less consensus among the various polls, in large part because different pollsters are using very different models to predict turnout. Depending on the assumptions pollsters apply, raw data can yield very different results.

Some pollsters will be adjusting stated voting intentions to align with the actual turnout seen among various demographics in 2015. Others will be reluctant to assume that relative turnout will hold so steady – after all, the turnout gap between voters under 25 and those over 65 was only 23 percentage points at the 2010 election, but widened to 35 points in 2015.

If pollsters knew how many people in each age group will vote, they could be far more confident of producing an accurate estimate of the state of the race. Pollsters who assume that younger voters will turn out in greater force than in 2015 are likely to project a better vote share for Labour relative to the Conservatives, and vice versa.
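The arithmetic behind this can be made concrete. The sketch below uses hypothetical support and turnout figures – not real polling data – to show how the same raw responses yield different projections under different turnout assumptions:

```python
# Illustrative sketch (hypothetical numbers): the same stated support
# produces different projected vote shares depending on the turnout
# assumed for each age group.

def projected_share(support, turnout, pop_share):
    """Turnout-weighted vote share for one party across age groups."""
    votes = sum(s * t * p for s, t, p in zip(support, turnout, pop_share))
    voters = sum(t * p for t, p in zip(turnout, pop_share))
    return votes / voters

# Two bands: under-25s and everyone else (all figures hypothetical).
labour_support = [0.60, 0.35]   # stated Labour support per band
pop_share      = [0.12, 0.88]   # each band's share of the electorate

low_youth  = projected_share(labour_support, [0.43, 0.70], pop_share)
high_youth = projected_share(labour_support, [0.60, 0.70], pop_share)

print(f"Labour share, 2015-style youth turnout: {low_youth:.1%}")
print(f"Labour share, youth-surge assumption:   {high_youth:.1%}")
```

Because under-25s lean towards Labour in this toy example, raising their assumed turnout lifts Labour's projected share – which is exactly why pollsters' turnout models drive so much of the spread between published polls.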

On June 9, we will have a much better idea of these turnout figures – but we should also have the full results, so they won’t matter. Except, of course, for pollsters hoping to improve their methods for next time.