ETech Polls - Accurately polling single seats for political clients

As Australia heads to the polls on Saturday, it appears the likelihood of a uniform national swing is ever decreasing. As a result, predicting the results in individual seats through single-seat polling is an increasingly important way to determine a party's or candidate's likelihood of success.

Unfortunately, single-seat polling is widely seen as having significant accuracy issues. An analysis of single-seat polls in the 2016 election found that the published polls were so bad that they should have been treated as having a sample size one-sixth of their actual size. This would mean that a nominal margin of error of plus or minus 3% should actually have been plus or minus 8.1%, making the polls effectively useless as a predictor.
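To see why a shrunken effective sample inflates the margin of error so sharply, recall the standard formula for a proportion at 95% confidence. This is a general sketch; the exact figures above depend on the design effect the cited analysis assumed:

```latex
% 95% margin of error for a proportion p with sample size n:
\mathrm{MoE} = 1.96\sqrt{\frac{p(1-p)}{n}}
% If the effective sample is only n/k (a design effect of k),
% the margin of error inflates by a factor of sqrt(k):
\mathrm{MoE}_{\mathrm{eff}} = 1.96\sqrt{\frac{p(1-p)}{n/k}} = \sqrt{k}\cdot\mathrm{MoE}
```

Because the error scales with the square root of the sample size, even a moderate design effect can more than double the quoted margin of error.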

Over the past 18 months, our team in Australia has been evolving and testing our robo-survey systems. From initial technology tests in the Section 44 by-elections, through client-led surveys in the Victorian and New South Wales State elections, we have built a suite of survey technologies and methodologies that we believe deliver more accurate results for clients and overcome many of the issues commonly identified with single-seat polling.

The results speak for themselves, particularly when it comes to measuring the primary votes of the candidates that ended up in the final two-candidate-preferred counts (we've polled many races where one of the final two candidates was not ALP or LNP but an independent or minor-party candidate). On two-candidate preferred, our methodology also delivers strong results; for example, it predicted the result of the Mayo by-election to within 0.3 percentage points.

The New South Wales State Election

Primary Vote Results

In single-seat polls carried out in the 11 days prior to the New South Wales State election and weighted through our automated algorithms, our average absolute error (the difference, positive or negative, between our predicted primary vote and the actual result) for the LNP primary vote was 3.77%, well within the polls' own sample-based margin of error (which averaged 4.5%). Individual poll error generally decreased as polling day approached.

For the Labor primary vote it was just 1.86%, for the Shooters it was 3.95%, and for the Greens it was 1.94%. The average absolute error across all non-LNP two-party-preferred candidates in our polls was just 2.57%.
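For clarity, the average absolute error is simply the mean of the absolute differences between predicted and actual vote shares. A minimal sketch (the seat names and figures below are hypothetical placeholders, not our poll data):

```python
# Average absolute error between predicted and actual primary votes.
# Seat names and figures are hypothetical placeholders.
predicted = {"Seat A": 38.0, "Seat B": 41.5, "Seat C": 35.2}
actual = {"Seat A": 36.1, "Seat B": 44.0, "Seat C": 34.8}

errors = [abs(predicted[seat] - actual[seat]) for seat in predicted]
average_absolute_error = sum(errors) / len(errors)

print(f"Average absolute error: {average_absolute_error:.2f} points")  # 1.60 points
```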

Looking at the graphs below, there are some outliers, mainly caused by under-representative samples and an inability, whether due to budget or time, to go back into the field to improve them. We also believe there was a late swing in New South Wales, with many people changing their minds in the last few days of the campaign, which affects these figures.

[Graphs: primary vote prediction errors by party, New South Wales]

Two-Party Preferred Results - New South Wales

A number of seats polled by our clients during the New South Wales election were three-cornered contests or involved a minor party in the two-candidate preferred. As a result, predicting preference flows required additional questions, or was in some cases not possible to calculate, as the primary vote between two of the three candidates was too close to call.

Where we did calculate two-party-preferred results, the average error was within 1.9 percentage points. We also had a number of seats where the two-party preferred was correct to within half a percentage point, and in every seat where we calculated it we predicted the correct winning candidate.

How We Do It

Improving Sample Representation

One of the challenges of single-seat polls is that a low response rate from particular groups can mean that your sample has to be heavily weighted or scaled up. In the simplest terms, if 15% of a seat's voters are aged between 18 and 34, but only 2.5% of your respondents are, then to "correct" for your under-representative sample you need to scale up, or weight, your results. The response of one 18-34-year-old would then stand in for six people in the final results. In a small sample for a single state seat of, say, five or six hundred people, this could mean a mere handful of respondents, who may not represent their demographic group at all, heavily influencing the results.
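As a sketch of that scaling step (using the illustrative numbers from the example above, not our production weighting code):

```python
# Post-stratification weight for one demographic cell:
# weight = population share / sample share.
population_share = 0.15   # 15% of the seat's voters are aged 18-34
sample_share = 0.025      # but only 2.5% of respondents are

weight = population_share / sample_share
print(weight)  # 6.0 -- one respondent stands in for six voters

# With a 600-person sample, this cell holds only
# 600 * 0.025 = 15 respondents, each carrying six voters' worth
# of influence on the weighted result.
```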

Our survey technologies are built into our Voter.ID system, meaning that a political party or candidate can use their own data (boosted by additional licensed data from our partner Sensis) to target specific households and voters based on their known demographics. Our calling algorithms use response rates from previous surveys in similar seats, combined with the up-to-date voter registration data held by a client, to place calls at the optimum time of day and deliver the most representative sample possible.
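A much-simplified sketch of the scheduling idea (the hours and response rates below are hypothetical, and the production algorithms combine many more signals):

```python
# Pick the calling hour with the best historical response rate
# for each demographic group. All figures here are hypothetical.
historical_response_rates = {
    # demographic group -> {hour of day: observed response rate}
    "18-34": {"17:00": 0.04, "18:30": 0.09, "20:00": 0.06},
    "65+":   {"10:00": 0.22, "17:00": 0.15, "18:30": 0.11},
}

best_call_time = {
    group: max(rates, key=rates.get)
    for group, rates in historical_response_rates.items()
}
print(best_call_time)  # {'18-34': '18:30', '65+': '10:00'}
```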

Our platform also provides clients with response rates broken down by demographic group, so they can judge whether a sample is unrepresentative or needs to be "topped up" for a particular group. We've been surprised to learn that many pollsters in Australia don't provide this breakdown, instead offering only headline weighted/unweighted figures, making it difficult for a client to gauge the potential accuracy of a poll themselves.

Overcoming Demographic Churn

Traditional polling relies on weighting a sample against published demographic data, which can be out of date (and becomes more so the further we move from a census). However, because our clients use their own voter registration data, a sample can be weighted against the exact demographics of currently registered voters in an electorate.
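In weighting terms, the only change is the source of the target shares: they can be computed directly from the client's current roll rather than read from ageing census tables. A sketch, assuming a simple roll extract of age bands:

```python
from collections import Counter

# Hypothetical roll extract: one age band per registered voter.
roll = ["18-34", "35-49", "35-49", "50-64", "65+", "65+", "18-34", "50-64"]

# Target shares come straight from the current roll,
# not from potentially stale published census tables.
counts = Counter(roll)
target_shares = {band: n / len(roll) for band, n in counts.items()}
print(target_shares)  # e.g. {'18-34': 0.25, '35-49': 0.25, ...}
```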

Better Voter Matching

Again, because our clients are running their own surveys using their voter registration data, we can check if someone who answers the phone actually matches a voter on the roll associated with that number.

So, for example, if a landline number is matched to an address containing two registered voters, one male aged 27 and one female aged 29, and the respondent tells us they're male and aged between 30 and 40, then our systems will exclude that response from the results presented to a client. This makes it much less likely that someone from outside the electorate is included by accident.
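A simplified sketch of that screening rule (the field names and matching logic are illustrative, not our actual schema):

```python
# Exclude a response when the respondent's self-reported gender and
# age band match no registered voter at the number's address.
# Field names and age bands here are illustrative only.
roll_matches = [
    {"gender": "male", "age": 27},
    {"gender": "female", "age": 29},
]

def matches_roll(reported_gender: str, age_low: int, age_high: int) -> bool:
    """True if any voter on the roll at this number fits the respondent's answers."""
    return any(
        voter["gender"] == reported_gender and age_low <= voter["age"] <= age_high
        for voter in roll_matches
    )

# Respondent says: male, aged 30-40 -> no match, so the response is excluded.
print(matches_roll("male", 30, 40))  # False
```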

Improving Questions

Thanks to the general lack of transparency around published polls, it's difficult to know exactly what questions are being used to obtain voting intention results. Our approach has been to start from first principles. Through a combination of best-practice research and A/B split testing, we have developed improved survey openers and a standard set of voting intention and squeeze questions that we believe deliver the most accurate results.

In particular, we have adopted a named-candidate-and-party approach, identified in the UK as best practice. Rather than just asking which party someone "usually" votes for, which appears to be a standard question, we specifically ask respondents which of a list of named candidates, each with their party, they will give their number 1, or primary, vote to. Additionally, once ballot paper order is published, we ensure that candidates are presented in the same order as the ballot to account for ballot-order effects.
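Mechanically, that last step just means sorting the response options by published ballot position before the survey script reads them out. A sketch with hypothetical candidates:

```python
# Present candidates in published ballot-paper order to mirror the
# ballot and control for ballot-order effects. Names are hypothetical.
candidates = [
    {"name": "J. Citizen (Independent)", "ballot_position": 3},
    {"name": "A. Smith (ALP)", "ballot_position": 1},
    {"name": "B. Jones (LNP)", "ballot_position": 2},
]

for candidate in sorted(candidates, key=lambda c: c["ballot_position"]):
    print(candidate["name"])
```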

Instant Turnaround and Self-Service

Our robo-survey systems are designed to be self-service where required. As clients use their own data, they can, if they wish, program a survey script and schedule calls for a particular date themselves, or we can support them through the process.

The automated nature of our weighting and prediction systems means that results are available instantly, with no manual intervention where mistakes can be made. If calls complete at 8pm, results are available to a client as soon as the last call ends. This quick turnaround has seen surveys placed into the field and answers delivered within hours of a question arising.

We also present our results online in an interactive format, allowing clients to create their own sub-samples or cross-tabs simply by clicking on answers or demographic groups.