An Interval Measure of Election Poll Accuracy
By Joseph Shipman, PhD
Director of Election Polling, SurveyUSA
And Jay H. Leve
Editor, SurveyUSA
Abstract: Measures of election poll accuracy, starting with Mosteller’s pioneering work in 1949, have always involved “point estimates” of error, using a single set of predicted vote percentages. A known drawback of any such measure is that scholars must either make an assumption about how to allocate “undecided” voters or ignore them. A new measure is proposed, which uses an “interval estimate” to account for every possible allocation of undecided voters, rather than ignoring them or allocating them arbitrarily. Advantages and disadvantages of this measure are discussed.
Contents
1. Six Pollsters In Search Of An Arbiter
2. The “Undecided” Issue
3. The Work Of Mosteller
4. Comparison Of 6 Mosteller Measures
5. Mapping Each Measure
6. Relative Accuracy Vs Absolute Accuracy
7. Traugott’s Measure
8. A New Concept: Interval Estimation
9. A New Measure: Median Spread Error
10. Advantages And Disadvantages
11. The “Undecided” Issue Revisited
12. Conclusion
1. Six Pollsters In Search Of An Arbiter
Consider an election between two candidates, Smith and Jones, with no write-ins allowed. The outcome of this election is a pair of numbers which sum to 100%, for example: Smith 60%, Jones 40%.
This is about as simple as
an election can get.
Now suppose that the day
before the election, six different pollsters release polls, all with the same
field period.
The 6 polls are as follows:
| | Smith | Jones | Undecided |
|---|---|---|---|
| Pollster # 1 | 60 | 32 | 8 |
| Pollster # 2 | 54 | 36 | 10 |
| Pollster # 3 | 61 | 36 | 3 |
| Pollster # 4 | 54 | 40 | 6 |
| Pollster # 5 | 56 | 36 | 8 |
| Pollster # 6 | 57 | 43 | 0 |
| Actual Vote | 60 | 40 | n/a |
The following graph plots
each pollster’s projection, with Smith on the X axis and Jones on the Y axis,
and the actual outcome (Smith 60, Jones 40) at the center-point of the graph.

Which poll was the most
accurate?
The point of this paper is:
this seemingly simple question is remarkably difficult to answer.
Let's ask the pollsters who
did the best. Here's what they say.
Pollster #1: “My poll was the most accurate. I was the only one
to get the winner's vote total exactly right.”
Pollster #2: “My poll was the most accurate. Smith won by a 3:2
margin, and only my poll got that ratio exactly right. If my undecideds are
taken out, the rest of my sample voted 60 to 40 for Smith, a bullseye.”
Pollster #3: “Well, you didn't take out undecideds, did you? My
poll was the most accurate: I was off by 1 point for Smith, and by 4 points for
Jones, so my average error was 2.5 points per candidate, which was better than
everyone else.”
Pollster #4: “Wait a minute. You shouldn't be looking at how many
points off you were, you should look at percentage error. I underestimated Smith
by 10% of his actual vote (he got 60% and I said 54%), and I was exactly right
on Jones, so my average error was 5% of each candidate's vote, which was better
than everyone else. My poll was the most accurate.”
Pollster #5: “Nobody cares about the individual candidate vote
predictions, they care about the margin of victory. Smith won by 20 points, and
I was the only one who said he'd win by 20, so my poll was the most accurate.”
Pollster #6: “You didn't really say he'd win by 20, because you
didn't say what 8% of the voters were going to do – it’s only a 20-point margin
if your undecideds happen to split 50-50. That's not a real prediction. I made
a real prediction, and I was within 3 points for both candidates,
and all the rest of you were further away for one or both candidates. My poll
was the most accurate.”
Hmmm.
2. The “Undecided” Issue
A big part of the difficulty
is accounting for “undecided” voters. If all the pollsters had reported
predictions for Smith and Jones that summed to 100%, the comparisons would be
easier, because all the points in the graph would fall on a single line rather
than being spread in two dimensions. But allocating undecided voters is not a
mathematical exercise; rather, it is subject to caprice and whim. With the benefit
of hindsight, Pollster #2 self-servingly recommends “proportional allocation,” awarding
the undecideds to the candidates in proportion to their actual votes, while
Pollster #5 self-servingly prefers “equal allocation,” where each candidate gets
half the “undecided” voters. But neither said so before the votes were counted.
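The two allocation rules can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the proportional rule is assumed here to split the undecideds according to each candidate's share of the decided respondents in the poll itself, which for Pollster #2 coincides with the actual 3:2 ratio.

```python
def allocate_undecided(smith, jones, undecided, smith_share):
    """Give `smith_share` (0..1) of the undecideds to Smith, the rest to Jones."""
    if not 0.0 <= smith_share <= 1.0:
        raise ValueError("smith_share must be between 0 and 1")
    return (smith + undecided * smith_share,
            jones + undecided * (1.0 - smith_share))


def proportional_allocation(smith, jones, undecided):
    # Assumption: split in proportion to the poll's own decided shares.
    return allocate_undecided(smith, jones, undecided, smith / (smith + jones))


def equal_allocation(smith, jones, undecided):
    # Each candidate receives half of the undecided voters.
    return allocate_undecided(smith, jones, undecided, 0.5)


if __name__ == "__main__":
    # Pollster #2 (54, 36, 10 undecided) and Pollster #5 (56, 36, 8 undecided)
    print("P2 proportional:", proportional_allocation(54, 36, 10))
    print("P2 equal:       ", equal_allocation(54, 36, 10))
    print("P5 proportional:", proportional_allocation(56, 36, 8))
    print("P5 equal:       ", equal_allocation(56, 36, 8))
```

Under these assumptions each pollster's preferred rule moves that pollster's own numbers onto the 60-40 outcome, while the other rule does not. The choice of rule, made after the fact, decides who looks best.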
This is not to say that there is no such thing as an undecided voter, nor to deny that predicting some elections may involve greater uncertainty than others because voter preferences are less established.
But for the purposes of evaluating the accuracy of polls, we must compare the
polls with the actual election outcomes, where there are no “undecideds.”
It is possible to take the
attitude of Pollster #6, that vote predictions should sum to 100%. If the poll
detects a high degree of indecision among the potential voters, this can be
reported separately, in the same way a “Margin of Error” is reported as a
separate index of the reliability of a prediction. But as long as the usual
practice is to report “undecideds” as a subgroup of the electorate of a
particular size, measures of poll accuracy must deal with this.
3. The Work Of Mosteller
Following the 1948
presidential election, a commission was formed to study the failure of polls to
predict Truman's reelection. The resulting book, The Pre-election Polls of
1948: Report to the Committee on Analysis of Pre-Election Polls and Forecasts,
by Frederick Mosteller et al., was published by the Social Science Research
Council (New York, 1949). In this book, eight different ways to measure the
accuracy of a pre-election poll were proposed (for short-hand herein and going
forward: “Mosteller 1” through “Mosteller 8”).[1]
Though it has been more than 50 years since Mosteller proposed his measures, they remain today the “default” methods by which pre-election polls are evaluated.
The six Mosteller measures
which depend on predicted vote percentages are all “error measures,” where a
smaller score indicates a smaller error and, as such, a more accurate poll. The
six measures are defined as follows:
Mosteller 1: The difference in percentage points between the
winner's predicted and actual proportion of the total votes cast.
Mosteller 2: The difference in percentage points between the
winner's predicted and actual proportions of the votes received by the top two
candidates.
Mosteller 3: The average deviation in percentage points between
predicted and actual returns for each candidate (without regard to sign).
Mosteller 4: The average percentage error (averaging the deviations
from 100 percent of the ratio of predicted to actual proportion).
Mosteller 5: The (unsigned) difference of the oriented
differences between predicted and actual percentage results for the top two
candidates.
Mosteller 6: The maximum observed difference between predicted
and actual percentage results for any candidate.
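For a two-candidate race, the six definitions above can be read as the following Python sketch. It is one interpretation of the verbal definitions, not Mosteller's own notation: the function name `mosteller_errors` and the constants `POLLS` and `ACTUAL` are hypothetical, shares are raw poll percentages with undecideds left unallocated, and the winner (Smith) is listed first.

```python
from typing import Dict, Tuple

# (winner %, runner-up %); "winner" means the winner of the actual election.
Shares = Tuple[float, float]


def mosteller_errors(predicted: Shares, actual: Shares) -> Dict[str, float]:
    """Six Mosteller-style error measures for a two-candidate race."""
    p_win, p_lose = predicted
    a_win, a_lose = actual
    deviations = [abs(p_win - a_win), abs(p_lose - a_lose)]
    return {
        # M1: winner's predicted vs actual share of the total vote
        "M1": abs(p_win - a_win),
        # M2: winner's predicted vs actual share of the two-candidate vote
        "M2": abs(100 * p_win / (p_win + p_lose)
                  - 100 * a_win / (a_win + a_lose)),
        # M3: average absolute deviation per candidate, in points
        "M3": sum(deviations) / len(deviations),
        # M4: average deviation of predicted/actual from 100 percent
        "M4": 100 * (abs(p_win / a_win - 1) + abs(p_lose / a_lose - 1)) / 2,
        # M5: absolute difference between predicted and actual margins
        "M5": abs((p_win - p_lose) - (a_win - a_lose)),
        # M6: largest absolute deviation for any candidate
        "M6": max(deviations),
    }


# The six hypothetical polls from Section 1 (Smith, Jones) and the outcome.
POLLS = {"P1": (60, 32), "P2": (54, 36), "P3": (61, 36),
         "P4": (54, 40), "P5": (56, 36), "P6": (57, 43)}
ACTUAL = (60, 40)

if __name__ == "__main__":
    for name, poll in POLLS.items():
        errors = mosteller_errors(poll, ACTUAL)
        print(name, {k: round(v, 2) for k, v in errors.items()})
```

Under this reading, Pollster #3's claim of an average error of 2.5 points per candidate corresponds to M3, and Pollster #5's claim about the margin corresponds to M5.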
4. Comparison Of 6 Mosteller Measures
Before we try to parse these definitions, it is helpful to see what the six error measures have to say about the polls discussed above. An asterisk marks the lowest error among the six polls for each measure:
| Pollster | Smith | Jones | M1 Error | M2 Error | M3 Error | M4 Error | M5 Error | M6 Error |
|---|---|---|---|---|---|---|---|---|
| P1 | 60 | 32 | 0.00* | 5.22 | 4.00 | 10.00 | 8.00 | 8.00 |
| P2 | 54 | 36 | 6.00 | 0.00* | 5.00 | 10.00 | 2.00 | 6.00 |
| P3 | 61 | 36 | 1.00 | 2.89 | 2.50* | 5.83 | 5.00 | 4.00 |
| P4 | 54 | 40 | 6.00 | 2.55 | 3.00 | 5.00* | 6.00 | 6.00 |
| P5 | 56 | 36 | 4.00 | 0.87 | 4.00 | 8.33 | 0.00* | 4.00 |
| P6 | 57 | 43 | 3.00 | 3.00 | 3.00 | 6.25 | 6.00 | 3.00* |
| Actual | 60 | 40 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
The following chart ranks
each pollster’s performance from best (a ranking of 1) to worst (a ranking of
6), using each measure. The yellow highlight indicates the best pollster for
each given measure:
|
Pollster |
Smith |
Jones |
M1 Rank |
M2 Rank |
M3 Rank |
M4 Rank |
M5 Rank |
M6 Rank |
|
P1 |
60 |
32 |
1 |
6 |
4 |
5 |
6 |
6 |
|
P2 |
54 |
36 |
5 |
1 |
6 |
5 |
2 |
4 |
|
P3 |
61 |
36 |
2 |
4 |
1 |
2 |
3 |
2 |
|
P4 |
54 |
40 |
5 |
3 |
2 |
1 |