Mark Blumenthal, the “Mystery Pollster,” writing at Pollster.com, implies several things about SurveyUSA’s Pollster Report Cards that I will address here:
Blumenthal wonders whether the measure of poll error that SurveyUSA uses in its 2008 Pollster Report Cards may be cherry-picked to favor SurveyUSA, and whether SurveyUSA’s results might look less accurate, compared to other pollsters, if SurveyUSA had used a different measure of poll error.
Here is the context:
In 1992, SurveyUSA began keeping and publishing side-by-side pollster comparisons. Those are posted prominently on this website.
In 2003, SurveyUSA went further and began publishing pollster scorecards that track SurveyUSA’s election poll performance using every known measure of election poll accuracy.
Shortly after the 2002 Mid-Term Election, SurveyUSA posted to its website an interactive tool that examined every one of SurveyUSA’s final election poll forecasts against every polling competitor, using every known measure of election poll accuracy. Here is what SurveyUSA wrote when the tool was created and first published in 2003:
SurveyUSA did not want to create possible controversy (or apparent bias) by choosing just one way to measure its performance. SurveyUSA was concerned that others might say SurveyUSA had chosen a method of measurement that showcased SurveyUSA results in a favorable light, while deliberately avoiding a different measure that might have showcased SurveyUSA’s results in a less-favorable light. As a result, SurveyUSA created this Interactive Election Scorecard, which includes all possible ways to measure the precision of competing pollsters. The Scorecard lets you look at SurveyUSA’s performance using the measure (or measures) of your choice. SurveyUSA does better using some measures, worse using other measures. The good and the bad is all here, for your inspection and review.
This “Interactive Election Scorecard” (IES) has been “live” on SurveyUSA’s website for 5 years. At no time has an observer been unable to see how SurveyUSA’s election polls stack up against all others, by any measure.
The IES tool is indeed interactive: it is a Microsoft Excel pivot table that lets you define your own criteria for evaluating polling firms. The only way to experience the interactivity is to open the tool. But for those who do not have the time, willingness or know-how to open an Excel workbook across the Internet, the following still images illustrate what you can see with the tool open (what you actually see depends on the criteria you select; the display is customized for every user). Since there are 8 separate measures of pollster accuracy, 8 images follow, one for each measure. In order, the still images are:
- Mosteller 1: Mean Error and Standard Deviation
- Mosteller 2: Mean Error and Standard Deviation
- Mosteller 3: Mean Error and Standard Deviation
- Mosteller 4: Mean Error and Standard Deviation
- Mosteller 5: Mean Error and Standard Deviation
- Mosteller 6: Mean Error and Standard Deviation
- Traugott: Mean Error and Standard Deviation
- Shipman: Mean Error and Standard Deviation
The tool includes a definition for each measure, and background on Frederick Mosteller.
Even when these still images are reduced to thumbnail size, blue colors and red colors can be seen. Where you see the color blue, SurveyUSA has, since its inception, and on average, produced more accurate final election polls than has a particular polling competitor. Where you see the color red, the competitor has, on average, produced more accurate final election polls than has SurveyUSA.
As SurveyUSA wrote in 2003:
This Interactive Election Scorecard is not a marketing document. It does not selectively include only SurveyUSA’s best work, nor does it selectively exclude the competition’s best work. Rather, the Scorecard is exhaustive: every SurveyUSA election poll is included, good and bad, since SurveyUSA began polling in 1992, and every known poll conducted by a competing pollster is included. In this way, the Interactive Election Scorecard is a scientific measurement of SurveyUSA’s performance. You can search the Internet: no other pollster, academic or commercial, publishes a Scorecard similar to this.
When you assemble this much data, some remarkable things happen. Here is just one. On the measure Mosteller 6, three firms outperform SurveyUSA:
- Harris Interactive, which uses the Internet to conduct research.
- Polimetrix, which uses the Internet to conduct research.
- And The Columbus Dispatch newspaper, which uses U.S. mail to conduct research.
SurveyUSA outperforms 44 other firms, including every single one of the traditional “headset operator” telephone pollsters, a number of which have worked for 16 years to mock and marginalize the innovative work done by SurveyUSA.
One clear limitation of the IES is that it examines only contests that SurveyUSA polled. Contests that other firms polled, but SurveyUSA did not, are not included in the IES.
To address this proactively, and to learn more about the accuracy of all polling firms in all statewide contests, SurveyUSA in 2004 created an entirely separate database of all pollsters in all statewide contests, including contests that SurveyUSA did not poll.
The result is a separately maintained ledger, and a separately produced interactive Microsoft Excel pivot table, which is the most complete compendium for election year 2004 that exists, anywhere, to my knowledge.
This, too, has been live on SurveyUSA’s website, and available for public inspection, continuously since November 2004.
Here is what the default display looks like (again: your display can be tailored to look however you want, based on the custom criteria you select; double-click the image to enlarge, and click again to enlarge further):
In this 2004 analysis, SurveyUSA includes 4 measures of poll accuracy, not all 8. To get to the tool, you have to be willing to open this Excel workbook. Those who examine the data will see SurveyUSA is not singularly advantaged by the Mosteller 5 measure, nor is SurveyUSA singularly disadvantaged by some other measure.
Now, let us return to Mr. Blumenthal. In his post, he wonders at what point the differences between one pollster and another become statistically significant. He asks his readers for help. None, to date, has offered any.
The answer is in part, though not entirely, on the “cover page” of the IES. There, SurveyUSA adds up all of the occasions on which SurveyUSA was more accurate than a competitor, all of the occasions on which it was less accurate, and all of the occasions on which the two produced comparable results. SurveyUSA then calculates a “winning percentage” from these sums, the same way the National Football League does: one point for a “win,” one-half point for a “tie,” no points for a “loss.” The number of points is divided by the number of contests.
- If SurveyUSA had competed against Pollster “A” 10 times, and SurveyUSA was better 10 times, comparable zero times and worse zero times, SurveyUSA’s winning percentage would be 1.000. (10 points divided by 10 contests).
- If SurveyUSA had competed against Pollster “B” 10 times, and SurveyUSA was better 5 times, comparable zero times, and worse 5 times, SurveyUSA’s winning percentage would be 0.500. (5 divided by 10).
- If SurveyUSA had competed against Pollster “C” 10 times, and SurveyUSA was better zero times, comparable zero times, and worse 10 times, SurveyUSA’s winning percentage would be 0.000. (Zero divided by 10).
- Obviously, the higher the winning percentage, the better. A score of 0.500 is average. Any score higher than 0.500 is above average. Any score lower than 0.500 is below average.
- The New York Yankees baseball team has an all-time winning percentage of 0.567, after 9,383 wins and 7,162 losses. That’s the best in major league baseball.
- The Dallas Cowboys football team has an all-time winning percentage of 0.578, after 414 wins, 303 losses, and 6 ties. That’s the best in pro football.
- The Boston Celtics basketball team has an all-time winning percentage of 0.587, after 2,794 wins and 1,963 losses. That’s the best in the NBA.
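The winning-percentage arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the formula as stated (one point per win, half a point per tie, nothing per loss, divided by the number of contests); the function name `winning_percentage` is hypothetical and does not come from the IES workbook.

```python
def winning_percentage(wins, losses, ties=0):
    """One point per win, half a point per tie, none per loss,
    divided by the total number of contests."""
    contests = wins + losses + ties
    if contests == 0:
        raise ValueError("at least one contest is required")
    return (wins + 0.5 * ties) / contests


# The three hypothetical pollsters from the examples above
print(f"{winning_percentage(10, 0):.3f}")  # Pollster A: 1.000
print(f"{winning_percentage(5, 5):.3f}")   # Pollster B: 0.500
print(f"{winning_percentage(0, 10):.3f}")  # Pollster C: 0.000

# The Yankees record quoted above: 9,383 wins and 7,162 losses
print(f"{winning_percentage(9383, 7162):.3f}")  # 0.567
```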
After 1,559 competitions, here is SurveyUSA’s winning percentage, according to each of the 8 measures:
- Mosteller 1 = 0.726
- Mosteller 2 = 0.589
- Mosteller 3 = 0.774
- Mosteller 4 = 0.795
- Mosteller 5 = 0.577
- Mosteller 6 = 0.767
- Shipman = 0.727
- Traugott = 0.590
The question a scholar should ask is:
- What are the odds that chance alone could account for the winning percentages shown here?
In the case of Mosteller 5, which is the measure by which SurveyUSA does worst, Mark, the odds that chance alone could account for SurveyUSA having 712 wins, 473 losses and 374 ties are …
- 908 million to 1.
For all 7 other measures, the odds are …
- Greater than 1 billion to 1.
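The post does not say which statistical test produced these odds. As a rough illustration only, one common way to ask whether chance alone could explain a win-loss record is a sign test: set the ties aside and compute the probability of at least that many wins if every remaining contest were a fair coin flip. The sketch below assumes that approach. The function name `sign_test_p_value` is hypothetical, and because both the choice of test and the treatment of ties are assumptions, its result should not be expected to match the figures quoted above.

```python
from math import comb


def sign_test_p_value(wins, losses):
    """One-sided exact binomial tail: P(X >= wins) for
    X ~ Binomial(wins + losses, 0.5).

    Ties are assumed to have been dropped before calling; that is an
    assumption of this sketch, not something the post specifies.
    """
    n = wins + losses
    # Count outcomes with at least `wins` successes among 2**n equally likely ones.
    favorable = sum(comb(n, k) for k in range(wins, n + 1))
    return favorable / 2**n


# Mosteller 5 record quoted in the text: 712 wins, 473 losses, 374 ties
wins, losses, ties = 712, 473, 374
win_pct = (wins + 0.5 * ties) / (wins + losses + ties)
p = sign_test_p_value(wins, losses)

print(f"winning percentage: {win_pct:.3f}")  # 0.577, as listed for Mosteller 5
print(f"one-sided p-value:  {p:.3g}")
print(f"roughly 1 chance in {1 / p:,.0f}")
```

Counting ties as half-wins, or using a two-sided test, would give different odds; the point of the sketch is only the shape of the calculation.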
Here’s what that IES “cover sheet” with all of this math looks like (double-click the image to enlarge it; click it again to enlarge further).
The IES was time-consuming to create and is labor-intensive to maintain. SurveyUSA did not update the IES during off-year 2007, when SurveyUSA polled comparatively few election contests. But we are actively updating it now for 2008. Our intention is to have a new release shortly, after the 03/04/08 presidential primaries, by which time the number of election contests separately polled by SurveyUSA will have grown from 775 (as currently shown) to more than 800.
Blumenthal makes 2 other points:
- That the error SurveyUSA shows for all polling firms, including SurveyUSA, is misleadingly large, according to some, thereby damaging, inadvertently, the credibility of all pollsters.
- That SurveyUSA has created an unfair advantage for itself by being able to release election polls close to an election.
I will address both of these points in future posts.
Jay H. Leve