Forget the Census undercount, what matters is bias

If enough people from a particular group don’t complete the Census, it can disrupt the data.

It is fair to say the 2016 Census hasn’t quite gone to plan.

Before Census night on August 9, a significant minority of Australians, including a number of high-profile members of parliament, were concerned about how names and addresses would be used.

Then, of course, there was the night of the Census itself and the now ubiquitous #censusfail.

We won’t know for a while what the impact will be on the quality of the data. There is already speculation that the response rate might be below the expected 98.3%, with some preemptively calling into question the reliability of the data.

The message from the Australian Bureau of Statistics (ABS) and the government about response rates is the right one, though: there is still time to fill out the Census (either online or on paper), the data is still crucial, but the longer people leave it, the less accurate the data will be.

Speculating on who and how many people are going to respond to the Census is a mug’s game. But it is useful to reflect on what the rate of undercount has been in the past, what the undercount might mean for decision making, and what can be done to adjust for it post-Census.

The ghosts of undercounts past

The most important thing to keep in mind throughout this period is that no Census has ever been perfect. There were no halcyon days when everyone filled out their Census on the allocated night, every form was filled out completely, honestly and accurately, and it was collected by ABS staff seamlessly and with no fuss.

In 2001, fresh out of university, I remember walking around the chilly Canberra suburbs as a Census collector. People then were confused about the point of the Census and how their data were to be used.

Some people were late, others were reluctant to hand over their form at all. Data from the 2001 Census ended up being crucial for policy debates over the intervening years.

But the response rate was not 100%.

Fast forward a decade, and the previous Census, in 2011, also missed a large number of people. While the undercount was low nationally, at 1.7%, the key point is that it is not distributed evenly across the population.

In 2011, those who were more likely to be missed were young males. The ABS estimated that 7.8% of males aged 20-24 years were missed from the Census. Indigenous Australians and certain country-of-birth cohorts, in particular China and India, were also over-represented in the undercount.
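
To make the arithmetic concrete, here is a minimal Python sketch of how an undercount rate translates back into an estimated population. It assumes the quoted rates are net undercount rates (people missed as a share of the people who should have been counted). The count of 922 is made up for illustration, and adjust_count is a hypothetical helper, not an ABS method.

```python
def adjust_count(counted, undercount_rate):
    """Estimate the true population from a count and a net undercount rate.

    Assumes undercount_rate = missed / true population, so
    true population = counted / (1 - undercount_rate).
    """
    if not 0 <= undercount_rate < 1:
        raise ValueError("undercount_rate must be in [0, 1)")
    return counted / (1 - undercount_rate)


NATIONAL_RATE = 0.017      # 2011 national undercount quoted above
YOUNG_MALE_RATE = 0.078    # 2011 rate for males aged 20-24 quoted above

counted_young_males = 922  # hypothetical count for one area

# Using the group's own rate recovers roughly 1,000 people.
group_adjusted = adjust_count(counted_young_males, YOUNG_MALE_RATE)

# Using the national rate instead leaves the group short (roughly 938).
national_adjusted = adjust_count(counted_young_males, NATIONAL_RATE)

print(f"adjusted with group rate:    {group_adjusted:.0f}")
print(f"adjusted with national rate: {national_adjusted:.0f}")
```

The gap between the two results is the point: a single national correction factor would still leave young males under-represented.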

How do we know who is missed?

An obvious question to ask is: how do we know who is missed from the Census? To an outsider, it can appear that most of the activity for the Census occurs on the night itself. In terms of people filling out the form, that is certainly the case. But the ABS actually puts a lot of its effort into processing and evaluating the results.

A key part of that evaluation is the Post-Enumeration Survey (PES). Undertaken by trained interviewers, the PES is:

[…] run shortly after each Census, to provide an independent measure of Census coverage. The PES determines how many people should have been counted in the Census, how many were missed, and how many were counted more than once. It also provides information on the characteristics of those in the population who have been missed or overcounted.
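
The logic of checking a count against an independent survey can be sketched with the textbook dual-system (capture-recapture) estimator. This is only an illustration under simplifying assumptions: the numbers are invented, the function name is hypothetical, overcounting is ignored, and the estimation the ABS actually applies to the PES is not described here and is likely to be more elaborate.

```python
def dual_system_estimate(census_count, survey_count, matched):
    """Lincoln-Petersen estimate of the true population of an area.

    census_count: people the Census counted in the area
    survey_count: people the independent follow-up survey found there
    matched:      survey people who also appear in the Census records
    """
    if matched <= 0:
        raise ValueError("need at least one matched person")
    # matched / survey_count estimates the share of people the Census reached,
    # so dividing the Census count by that share estimates the true total.
    return census_count * survey_count / matched


# Invented figures for one small area.
census_count = 950
survey_count = 100
matched = 95

estimated_population = dual_system_estimate(census_count, survey_count, matched)
net_undercount_rate = 1 - census_count / estimated_population

print(f"estimated population: {estimated_population:.0f}")  # 1000
print(f"net undercount rate:  {net_undercount_rate:.1%}")   # 5.0%
```

Because the survey also records characteristics such as age and sex, the same calculation can in principle be repeated within groups, which is how an uneven undercount becomes visible.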


The implications of the undercount

Why are we worried about people not filling out their Census form? Clearly if hardly anyone filled out the form, we’d be in a lot of trouble. But what about if only 75% of people did, or 90% or 95%?

There is no magic percentage above which the Census is useful and below which we should chuck it out and start again. What really matters are the biases.

In some ways, it would be better if the ABS randomly lost a large number of Census forms than if a much smaller but non-random proportion of the population decided not to fill it out or, worse, intentionally gave incorrect information.

We can adjust for undercount, but bias is a bit harder. This is because the Census is not just used to count people, it is used to measure their distribution.
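
A small simulation shows why random loss is less damaging than a smaller, non-random loss. Everything in this sketch is an assumption made for illustration: the population size, the two income levels, and the 20% low-income share are invented, and the biased scenario takes the extreme case where every missed person is low-income.

```python
import random

# Invented population: 20% low-income, 80% higher-income.
LOW, HIGH = 30_000, 80_000
population = [LOW] * 20_000 + [HIGH] * 80_000


def average(values):
    return sum(values) / len(values)


rng = random.Random(2016)  # fixed seed so the run is repeatable

# Scenario A: 25% of forms are lost completely at random.
random_loss = rng.sample(population, k=75_000)

# Scenario B: only 5% of people are missed, but all of them are low-income.
biased_loss = [LOW] * 15_000 + [HIGH] * 80_000

true_mean = average(population)     # 70,000
random_mean = average(random_loss)  # stays close to the true mean
biased_mean = average(biased_loss)  # about 72,105: the country looks richer

print(f"true mean income:          {true_mean:,.0f}")
print(f"after 25% random loss:     {random_mean:,.0f}")
print(f"after 5% non-random loss:  {biased_mean:,.0f}")
```

Scaling the counts back up fixes the headcount in both scenarios, but only in the random one does the resulting picture of incomes match reality.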

If people from low socioeconomic backgrounds are missed, it appears that we are richer than we actually are. If kids are missed from the Census, then we are less likely to invest in the schools and day care centres we need.

If people who are highly mobile don’t fill out their form, we are more likely to think that Australia’s population is spatially stable. If Indigenous Australians are missed, it makes it harder to assess the effectiveness of our policies and target the resources Indigenous Australians need.

If we care about these things, we should continue to encourage people to fill out their Census, using whatever mode they can.

It would be naive to suggest that response rates won’t be affected by the negative publicity and the difficulties some people had. But prematurely predicting response rates is not helpful. We as a society still need people to participate in order to plan, and to hold government to account.