Live Blog for the election night and an analysis of the quality of my predictions after the fact are here
At this point, I am taking the last Suffolk poll as the base for my predictions. It correlates with Nate Silver's poll, and Suffolk is the most recent poll (which matters most in volatile primaries). (I exclude ARG's polls, as their reputation for quality is not good.) That said, I am VERY suspicious about how Suffolk's 2-day rolling average could change from 39.0 to 34.6 to 33.2 and then back up to 37.4 over the past two days; with only 250 voters surveyed per day, the sampling error is VERY large. The news has been bad for Romney in the past few days, so I can't believe that a poll implying at least an 8-point increase for Romney on Monday vs. Sunday truly represents the electorate. However, Suffolk has an excellent polling reputation, and it provides enough detail to let me make adjustments where I feel they are needed (vs. polling averages and aggregation models, which are so complex I have no idea how to analyze the fundamentals).
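To put a rough number on that sampling error, here is a minimal sketch using the textbook margin-of-error formula for a simple random sample. The 95% confidence level (z = 1.96) and the use of 37% as the candidate's share are my assumptions for illustration; Suffolk's own methodology may differ.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate margin of error (as a proportion) for a simple random sample.

    p: candidate's share as a proportion, n: sample size,
    z: 1.96 corresponds to a 95% confidence level (an assumption here).
    """
    return z * math.sqrt(p * (1 - p) / n)

# 250 voters per day, so a 2-day rolling average covers 500 voters.
one_day = margin_of_error(0.37, 250)
two_day = margin_of_error(0.37, 500)

print(f"one day  (n=250): +/- {100 * one_day:.1f} points")
print(f"two days (n=500): +/- {100 * two_day:.1f} points")
```

Under these assumptions a single day's sample is good to only about 6 points either way, and the 2-day average to about 4, so swings of the size quoted above are not far outside what noise alone could produce.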
For reasons detailed here, I also assume that 56% of the electorate will be "Independent", which in New Hampshire is legally called "Undeclared".
BTW, despite what I frequently hear in the media, that Indys are not a factor in South Carolina, SC also has an open primary, the Dems do not have an interesting primary there, and moreover I believe theirs is held at a later date. So Indys should be an underrated factor in SC. But more on that after tonight.
I am also adjusting for age distribution. As detailed here, I believe the Suffolk poll is too heavily weighted towards older voters. I have decided to choose the well-regarded Marist poll for the age distribution in my forecast, though, going by reputation, I am also enticed by the even better regarded Selzer poll. However, since Selzer only polled NH once, over a month ago, I am not sure whether the same quality holds as with their famous Iowa polls.
When the exit polls come out, I will compare them against this to see whether I guessed right. While I am not satisfied that I fully understand the dependency function (given age, what is the chance you will vote for candidate X), due to time constraints I will assume that the most recent Suffolk poll captures this function well enough. For South Carolina I will have more time to analyze this.
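The age adjustment boils down to keeping the "given age, chance of voting for X" numbers from one poll and swapping in the age mix from another. A minimal sketch of that reweighting follows. Every number in it is hypothetical and made up for illustration (none comes from Suffolk or Marist), and the group labels and the function name `reweight` are my own.

```python
def reweight(support_by_group, group_shares):
    """Overall support = sum over groups of share(group) * support(candidate | group)."""
    total_share = sum(group_shares.values())  # normalize in case shares don't sum to 1
    return sum(
        group_shares[g] / total_share * support_by_group[g]
        for g in group_shares
    )

# HYPOTHETICAL numbers, for illustration only.
support_given_age = {"18-44": 30.0, "45-64": 38.0, "65+": 45.0}  # % for one candidate
poll_age_mix = {"18-44": 0.25, "45-64": 0.45, "65+": 0.30}       # mix in the base poll
target_age_mix = {"18-44": 0.35, "45-64": 0.45, "65+": 0.20}     # mix from another poll

before = reweight(support_given_age, poll_age_mix)
after = reweight(support_given_age, target_age_mix)
print(f"age distribution change: {after - before:+.1f} points")
```

The same function would work for the Independent/Undeclared split by passing party registration groups instead of age groups. Applying the two adjustments separately and adding them up ignores any overlap between age and registration.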
Finally, I realize that these variables aren't really additive. There are co-dependencies. Given the larger uncertainties in these polling adjustments, I feel that this mathematical sloppiness is smaller than the inaccuracy of the underlying data and underlying model. I also don't claim accuracy to 3 significant digits. I use a single decimal point mainly because I think the difference between 2nd and 3rd place could be VERY close.
For the 7.4% undecided, I will make a gut-feel call. I give 1 point in total to the candidates who will get 3% of the vote or less. Of the remaining 6.4 points, 1/3 goes to the candidates who may appeal to the values of the voter but who are generally perceived by the voter as having no chance to win (e.g. Paul, Gingrich, Santorum), allocated linearly according to their base polling % (so the candidate polling at 17.6% gets 1/3 * 6.4 * 17.6/(17.6+10.6+9) ≈ 1.0). Of the other 2/3 of that 6.4, I give 2/3 to Huntsman, who has momentum, and 1/3 to Romney, who is perceived as the odds-on winner.
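The arithmetic of that allocation can be sketched as follows. The base numbers 17.6, 10.6 and 9 are the ones quoted above; since the text does not say which of the three candidates holds which number, the sketch keys the shares by base value rather than by name. The variable names are my own.

```python
UNDECIDED = 7.4
MINOR_POOL = 1.0                      # shared by candidates expected at 3% or less
remainder = UNDECIDED - MINOR_POOL    # 6.4 points left to allocate

# One third of the remainder goes to the "values but no chance" candidates,
# split in proportion to their base polling numbers.
no_chance_base = [17.6, 10.6, 9.0]
values_pool = remainder / 3
values_alloc = {
    base: values_pool * base / sum(no_chance_base) for base in no_chance_base
}

# The other two thirds: 2/3 to Huntsman, 1/3 to Romney.
front_pool = remainder * 2 / 3
huntsman = front_pool * 2 / 3
romney = front_pool / 3

for base, extra in values_alloc.items():
    print(f"candidate polling at {base}: +{extra:.1f}")
print(f"Huntsman: +{huntsman:.1f}, Romney: +{romney:.1f}")

total = MINOR_POOL + sum(values_alloc.values()) + huntsman + romney
print(f"allocated in total: {total:.1f}")
```

The final line is a sanity check that the pieces add back up to the 7.4 points of undecided voters.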
I wish I had more time to develop a reasonable method for allocating people who change their preference. Suffolk doesn't provide 2nd choice votes and I didn't have time (like I did for Iowa) to look at the PPP poll which does provide that info. We'll try harder for SC. So I went with my gut, which is informed by watching too much cable news.
So how did I arrive at the numbers at the top?
|Last Suffolk Poll||Candidate||Indy Split Change||Age Distribution Change||Undecided Allocation||Last Minute Candidate Switch||Total|