An Interval Measure of Election Poll Accuracy

 

By

Joseph Shipman, PhD

Director of Election Polling

SurveyUSA

 

And

Jay H. Leve

Editor

SurveyUSA

 

Abstract: Measures of election poll accuracy, starting with Mosteller’s pioneering work in 1949, have always involved “point estimates” of error, using a single set of predicted vote percentages. A known drawback of any such measure is that scholars must make an assumption about how to allocate “undecided” voters, or choose to ignore the undecided voters. A new measure is proposed, which uses an “interval estimate” to account for every possible allocation of undecided voters, rather than choosing to ignore the undecided voters or allocate them arbitrarily. Advantages and disadvantages of this measure are discussed.

 

Contents

 

1.        Six Pollsters In Search Of An Arbiter

2.        The “Undecided” Issue

3.        The Work Of Mosteller

4.        Comparison Of 6 Mosteller Measures

5.        Mapping Each Measure

6.        Relative Accuracy Vs Absolute Accuracy

7.        Traugott’s Measure

8.        A New Concept: Interval Estimation

9.        A New Measure: Median Spread Error

10.     Advantages And Disadvantages

11.     The “Undecided” Issue Revisited

12.     Conclusion

 


1.       Six Pollsters In Search Of An Arbiter

 

Consider an election between two candidates, Smith and Jones, with no write-ins allowed. The outcome of this election is a pair of numbers which sum to 100%, for example: Smith 60%, Jones 40%.

 

 

This is about as simple as an election can get.

 

Now suppose that the day before the election, six different pollsters release polls, all with the same field period.

 

The 6 polls are as follows:

 

 

             Smith  Jones  Undecided
Pollster #1     60     32          8
Pollster #2     54     36         10
Pollster #3     61     36          3
Pollster #4     54     40          6
Pollster #5     56     36          8
Pollster #6     57     43          0

Actual Vote     60     40        n/a

 

The following graph plots each pollster’s projection, with Smith on the X axis and Jones on the Y axis, and the actual outcome (Smith 60, Jones 40) at the center-point of the graph.



Which poll was the most accurate?

 

The point of this paper is: this seemingly simple question is remarkably difficult to answer.

 

Let's ask the pollsters themselves who did best. Here is what they say.

 

Pollster #1: “My poll was the most accurate. I was the only one to get the winner's vote total exactly right.”

 

Pollster #2: “My poll was the most accurate. Smith won by a 3:2 margin, and only my poll got that ratio exactly right. If my undecideds are taken out, the rest of my sample voted 60 to 40 for Smith, a bullseye.”

 

Pollster #3: “Well, you didn't take out undecideds, did you? My poll was the most accurate: I was off by 1 point for Smith, and by 4 points for Jones, so my average error was 2.5 points per candidate, which was better than everyone else.”

 

Pollster #4: “Wait a minute. You shouldn't be looking at how many points off you were, you should look at percentage error. I underestimated Smith by 10% of his actual vote (he got 60% and I said 54%), and I was exactly right on Jones, so my average error was 5% of each candidate's vote, which was better than everyone else. My poll was the most accurate.”

 

Pollster #5: “Nobody cares about the individual candidate vote predictions, they care about the margin of victory. Smith won by 20 points, and I was the only one who said he'd win by 20, so my poll was the most accurate.”

 

Pollster #6: “You didn't really say he'd win by 20, because you didn't say what 8% of the voters were going to do – it’s only a 20-point margin if your undecideds happen to split 50-50. That's not a real prediction. I made a real prediction, and I was within 3 points for both candidates, and all the rest of you were further away for one or both candidates. My poll was the most accurate.”

 

Hmmm.

 

2.       The “Undecided” Issue

 

A big part of the difficulty is accounting for “undecided” voters. If all the pollsters had reported predictions for Smith and Jones that summed to 100%, the comparisons would be easier, because all the points in the graph would fall on a single line rather than being spread in two dimensions. But allocating undecided voters is not a mathematical exercise; rather, it is subject to caprice and whim. With the benefit of hindsight, Pollster #2 self-servingly recommends “proportional allocation,” awarding the undecideds to the candidates in proportion to their support among decided respondents, while Pollster #5 self-servingly prefers “equal allocation,” where each candidate gets half the “undecided” voters. But neither said so before the votes were counted.
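The two allocation rules can be sketched in a few lines of Python. This is only an illustration: the function name `allocate_undecided` and its `method` argument are hypothetical, and proportional allocation is assumed here to split the undecideds in proportion to each candidate's polled support (for Pollster #2 that ratio happens to equal the actual 60:40 result).

```python
def allocate_undecided(smith, jones, undecided, method):
    """Return (smith, jones) after assigning the undecided share.

    Assumption: "proportional" splits undecideds by polled support,
    "equal" gives each candidate half of them.
    """
    if method == "proportional":
        decided = smith + jones
        return (smith + undecided * smith / decided,
                jones + undecided * jones / decided)
    if method == "equal":
        return (smith + undecided / 2, jones + undecided / 2)
    raise ValueError(f"unknown method: {method}")


# Pollster #2 reported 54 / 36 with 10 undecided.
print(allocate_undecided(54, 36, 10, "proportional"))  # (60.0, 40.0)
print(allocate_undecided(54, 36, 10, "equal"))         # (59.0, 41.0)

# Pollster #5 reported 56 / 36 with 8 undecided.
print(allocate_undecided(56, 36, 8, "equal"))          # (60.0, 40.0)
```

Each pollster lands exactly on the actual result only under the rule that pollster prefers, which is why the choice of rule cannot be left until after the votes are counted.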

 

This is not to say that there is no such thing as an undecided voter, nor to deny that predicting some elections may involve greater uncertainty than others because voter preferences are less established. But for the purposes of evaluating the accuracy of polls, we must compare the polls with the actual election outcomes, where there are no “undecideds.”

 

It is possible to take the attitude of Pollster #6, that vote predictions should sum to 100%. If the poll detects a high degree of indecision among the potential voters, this can be reported separately, in the same way a “Margin of Error” is reported as a separate index of the reliability of a prediction. But as long as the usual practice is to report “undecideds” as a subgroup of the electorate of a particular size, measures of poll accuracy must deal with this.

 

3.       The Work Of Mosteller

 

Following the 1948 presidential election, a commission was formed to study the failure of polls to predict Truman's reelection. The resulting book, The Pre-election Polls of 1948: Report to the Committee on Analysis of Pre-Election Polls and Forecasts, by Frederick Mosteller et al., was published by the Social Science Research Council (New York, 1949). In this book, eight different ways to measure the accuracy of a pre-election poll were proposed (referred to in this paper as “Mosteller 1” through “Mosteller 8”).[1]

 

Though it has been more than 50 years since Mosteller proposed his measures, they remain the “default” methods by which pre-election polls are evaluated.

 

Six of the eight Mosteller measures depend on predicted vote percentages. All six are “error measures,” where a smaller score indicates a smaller error and, as such, a more accurate poll. The six measures are defined as follows:

 

Mosteller 1: The difference in percentage points between the winner's predicted and actual proportion of the total votes cast.

 

Mosteller 2: The difference in percentage points between the winner's predicted and actual proportions of the votes received by the top two candidates.

 

Mosteller 3: The average deviation in percentage points between predicted and actual returns for each candidate (without regard to sign).

 

Mosteller 4: The average percentage error (averaging the deviations from 100 percent of the ratio of predicted to actual proportion).

 

Mosteller 5: The (unsigned) difference of the oriented differences between predicted and actual percentage results for the top two candidates.

 

Mosteller 6: The maximum observed difference between predicted and actual percentage results for any candidate.
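For a two-candidate race, the six definitions above can be read as the following Python sketch. The function name `mosteller_errors` and the argument layout are hypothetical, and the formulas are one interpretation of Mosteller's wording rather than his own notation; under that interpretation the results agree with the error scores tabulated in Section 4.

```python
def mosteller_errors(pred, actual):
    """Return (M1, ..., M6) for a two-candidate race.

    pred and actual are (winner, runner_up) percentages of the total vote.
    The formulas are an interpretation of the verbal definitions.
    """
    pw, pl = pred
    aw, al = actual
    # M1: winner's predicted vs. actual share of all votes.
    m1 = abs(pw - aw)
    # M2: winner's predicted vs. actual share of the top-two vote.
    m2 = abs(100 * pw / (pw + pl) - 100 * aw / (aw + al))
    # M3: mean absolute deviation across candidates, in points.
    m3 = (abs(pw - aw) + abs(pl - al)) / 2
    # M4: mean absolute deviation as a percentage of each actual share.
    m4 = 100 * (abs(pw / aw - 1) + abs(pl / al - 1)) / 2
    # M5: error in the margin between the top two candidates.
    m5 = abs((pw - pl) - (aw - al))
    # M6: largest absolute deviation for any candidate.
    m6 = max(abs(pw - aw), abs(pl - al))
    return tuple(round(float(m), 2) for m in (m1, m2, m3, m4, m5, m6))


actual = (60, 40)
print(mosteller_errors((60, 32), actual))  # (0.0, 5.22, 4.0, 10.0, 8.0, 8.0)
print(mosteller_errors((54, 36), actual))  # (6.0, 0.0, 5.0, 10.0, 2.0, 6.0)
```

Note that only Mosteller 2 removes the undecideds before comparing; the other five measures use the percentages exactly as the pollster reported them.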

 


 

4.       Comparison Of 6 Mosteller Measures

 

Before we try to parse these definitions, it is helpful to see what the six error measures have to say about the polls discussed above. An asterisk marks the lowest error for each measure:

Pollster  Smith  Jones  M1 Error  M2 Error  M3 Error  M4 Error  M5 Error  M6 Error
P1           60     32     0.00*     5.22      4.00     10.00      8.00      8.00
P2           54     36     6.00      0.00*     5.00     10.00      2.00      6.00
P3           61     36     1.00      2.89      2.50*     5.83      5.00      4.00
P4           54     40     6.00      2.55      3.00      5.00*     6.00      6.00
P5           56     36     4.00      0.87      4.00      8.33      0.00*     4.00
P6           57     43     3.00      3.00      3.00      6.25      6.00      3.00*
Actual       60     40     0.00      0.00      0.00      0.00      0.00      0.00
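Reading down each error column shows that every measure is minimized by a different poll. The short Python sketch below makes that pattern explicit. The dictionary layout and the function name `best_per_measure` are illustrative only; the scores are copied from the table above.

```python
# Error scores copied from the table above (columns M1..M6).
errors = {
    "P1": (0.00, 5.22, 4.00, 10.00, 8.00, 8.00),
    "P2": (6.00, 0.00, 5.00, 10.00, 2.00, 6.00),
    "P3": (1.00, 2.89, 2.50, 5.83, 5.00, 4.00),
    "P4": (6.00, 2.55, 3.00, 5.00, 6.00, 6.00),
    "P5": (4.00, 0.87, 4.00, 8.33, 0.00, 4.00),
    "P6": (3.00, 3.00, 3.00, 6.25, 6.00, 3.00),
}


def best_per_measure(errors):
    """Map each measure name to the pollsters with the lowest error on it."""
    n_measures = len(next(iter(errors.values())))
    best = {}
    for i in range(n_measures):
        low = min(scores[i] for scores in errors.values())
        # Keep every pollster tied for the lowest score.
        best[f"M{i + 1}"] = [p for p, s in errors.items() if s[i] == low]
    return best


print(best_per_measure(errors))
# {'M1': ['P1'], 'M2': ['P2'], 'M3': ['P3'], 'M4': ['P4'], 'M5': ['P5'], 'M6': ['P6']}
```

Each of the six pollsters is the sole winner on exactly one measure, which is the arithmetic behind the six competing claims in Section 1.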

 

The following chart ranks each pollster’s performance from best (a ranking of 1) to worst (a ranking of 6), using each measure. The yellow highlight indicates the best pollster for each given measure:

 

Pollster

Smith

Jones

M1 Rank

M2 Rank

M3 Rank

M4 Rank

M5 Rank

M6 Rank

P1

60

32

1

6

4

5

6

6

P2

54

36

5

1

6

5

2

4

P3

61

36

2

4

1

2

3

2

P4

54

40

5

3

2

1