Sunday, January 31, 2016

Des Moines Register Likely Voters Not Likely

The end does not justify the means. The Des Moines Register poll has predicted a couple dozen elections fairly well, so no one questions the details anymore. They just consider it the Gold Standard and accept whatever it says as a matter of faith.

But math isn't based on faith.

As Jeff Roe points out, despite his clearly partisan motivations, the Des Moines Register called 3019 registered voters and determined that 602 were likely to caucus, which is a 19.9% turnout. Traditionally, turnout is 6%. A 19.9% turnout would yield 386,000 GOP caucus voters, when the all-time high was 131,000. In other words, not very likely.
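For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the figures quoted above. The variable names are mine, and the "implied base" is simply back-calculated from the 386,000 figure rather than taken from any official registration count.

```python
# Figures quoted in the post
contacted = 3019            # registered voters called
likely = 602                # judged likely to caucus
traditional_turnout = 0.06  # traditional turnout cited in the post
projected_voters = 386_000  # caucus-goers implied by the screen, per the post
record_voters = 131_000     # highest GOP caucus attendance, per the post

# Share of contacted voters who passed the likely-voter screen
screen_rate = likely / contacted

# How many times higher the screen rate is than traditional turnout
ratio_to_traditional = screen_rate / traditional_turnout

# Back-calculated assumption: the voter base that makes 19.9% equal 386,000
implied_base = projected_voters / screen_rate

print(f"screen rate: {screen_rate:.1%}")
print(f"ratio to traditional turnout: {ratio_to_traditional:.2f}x")
print(f"implied base of registered voters: {implied_base:,.0f}")
print(f"projection vs. record: {projected_voters / record_voters:.2f}x")
```

The screen rate comes out a bit more than three times the traditional turnout, which is where the "three times more likely" comparison comes from.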

Why would the Des Moines Register allow 602 people to represent Iowa?


There are two ways to make the math honest. One would render the Des Moines Register poll unimportant; the other might bankrupt their polling budget, assuming they even had the manpower to accomplish the task.

At a 6% turnout, only 181 of the 3019 registered voters they contacted would provide the answers for their poll. A standard margin of error for such a small sample is 7.2%.
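The margin of error can be approximated with the textbook formula for a proportion. This is only a sketch: it assumes a simple random sample, the worst case of a 50/50 split, and a 95% confidence level (z = 1.96), none of which is stated in the poll's own methodology. Under those assumptions the result for 181 respondents is roughly 7.3%, within a tenth of a point of the figure quoted above, while the published sample of 602 gets about 4%.

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion p from a
    simple random sample of size n (assumed, not the pollster's method)."""
    return z * math.sqrt(p * (1 - p) / n)

contacted = 3019
assumed_turnout = 0.06

# Respondents left if the likely-voter screen matched a 6% turnout
likely_at_6_percent = round(contacted * assumed_turnout)

moe_small = margin_of_error(likely_at_6_percent)
moe_published = margin_of_error(602)

print(f"likely voters at 6% turnout: {likely_at_6_percent}")
print(f"margin of error, n={likely_at_6_percent}: {moe_small:.1%}")
print(f"margin of error, n=602: {moe_published:.1%}")
```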

Can you imagine the media reporting on a poll with a 7% margin of error?

That would mean that Trump's number is 95% likely to be somewhere between 21 and 35. Cruz would be somewhere between 16 and 30. Rubio would be somewhere between 8 and 22. In that scenario, an analyst could plausibly suggest that Rubio could win the Iowa Caucus.
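Those ranges are easy to reproduce. The sketch below takes the midpoint of each range above as the candidate's topline number (28, 23, and 15, which is my inference from the ranges) and applies a rounded 7-point margin. It then checks which ranges overlap, since overlapping ranges are what make a "Rubio could win" reading possible.

```python
# Toplines inferred as the midpoints of the ranges quoted in the post
toplines = {"Trump": 28, "Cruz": 23, "Rubio": 15}
margin = 7  # rounded margin of error in points, per the post

def interval(point: int, moe: int) -> tuple[int, int]:
    """Lower and upper bound of the range around a topline number."""
    return (point - moe, point + moe)

def overlaps(a: tuple[int, int], b: tuple[int, int]) -> bool:
    """True when two ranges share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]

ranges = {name: interval(pct, margin) for name, pct in toplines.items()}

for name, (low, high) in ranges.items():
    print(f"{name}: {low} to {high}")

print("Rubio overlaps Trump:", overlaps(ranges["Rubio"], ranges["Trump"]))
print("Rubio overlaps Cruz:", overlaps(ranges["Rubio"], ranges["Cruz"]))
```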

The alternative math solution is to greatly increase the number of registered voters reached. For a 6% turnout that means reaching 10,033 registered voters. That would cost the Des Moines Register at least 3 times more money, with at least 3 times as many staff to contact them. Do they have that much staff? And with all the poll fatigue, it might take a lot more calling to reach a given registered voter. So is 4 times the cost and 4 times the callers an unreasonable requirement?
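The scale-up is simple division. This sketch assumes the cost of polling grows in direct proportion to the number of registered voters contacted, which is my simplification; poll fatigue would push the real multiplier higher, as noted above.

```python
target_likely = 602        # likely caucus-goers the poll wants to keep
assumed_turnout = 0.06     # traditional turnout cited in the post
contacted_actual = 3019    # registered voters actually contacted

# Registered voters that must be reached so that 6% of them yields 602
required_contacts = round(target_likely / assumed_turnout)

# Assumed linear cost model: cost scales with the number of contacts
cost_multiplier = required_contacts / contacted_actual

print(f"registered voters to reach: {required_contacts:,}")
print(f"multiple of the actual effort: {cost_multiplier:.2f}x")
```

On these numbers the multiplier lands a little above 3, so the "3 to 4 times" range is a fair description once harder-to-reach voters are factored in.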

The Des Moines Register claims they aren't trying to predict turnout, but that is a screen to keep the discussion away from a huge margin of error or a huge cost of doing correct polling.

The fundamentals of statistical sampling assume your sample represents the general population. Who believes that people judged by the Des Moines Register to be THREE TIMES MORE LIKELY to vote than the general population have views that represent the general population?

On the Democratic side the numbers also show questionable math.

They again accepted 602 (more on that suspicious number later) registered voters as likely to vote in the Democratic caucus, which again produces the 386,000 number of actual Democratic caucus voters, even though 2008's record SMASHING attendance had only 239,000 Democratic caucus voters. A number 1/3 that size (130,000) seems more reasonable. It would match (after inflating for population growth) the 2004 Democratic Caucus attendance, when we had the revolutionary Dean, the eventual nominee Kerry, former House Minority Leader Gephardt from next-door Missouri and many others to bring out voters.

But cutting that much means just 203 likely voters with a 6.9% margin of error. This means analysts would be left guessing whether Clinton's 45 really is a number somewhere between 38 and 52 and Sanders's 42 really is a number somewhere between 35 and 49.

In other words, the analysts and media pundits would dismiss the Des Moines Register poll as just another poll result in a field of dozens of polls.

Or the Des Moines Register could triple or quadruple its staff to try to reach 10,000+ registered voters who would actually answer their phone call.

The most likely result is that the Des Moines Register would not have the resources to contact that many registered voters, so they would either be forced to report a margin of error that is too large, or hope that their credentials keep anyone from looking under the hood at their suspect math.

The last minor point is their choice of EXACTLY 602 people to vote in the GOP caucus and the same exact number in the Democratic caucus. There is 0% reason to believe the size of the two parties' caucuses will be the same. The Democrats have much larger caucuses whenever both parties have no incumbents. This forced choice of tallying the voices of more GOP likely voters than is going to happen is for appearances. They don't want to confuse data-illiterate media analysts and pundits with HONEST MESSY data (which is normal in the real world). If they presented two different margins of error, there might be questions. People might look closer at the data and their methods.

Finally, there are much worse polling methods in use. The CNN one at Christmas time 2011, with 0% independents in its sample, launched the viability of Rick Santorum. That was far worse than this Des Moines Register poll.

This post is mainly to point out that math counts. Changing the math to make things look good to the media does not.