tag:theconversation.com,2011:/uk/topics/censusfail-30048/articles#CensusFail – The Conversation2021-08-10T01:57:34Ztag:theconversation.com,2011:article/1658062021-08-10T01:57:34Z2021-08-10T01:57:34ZWhy it’s unlikely there will be another #Censusfail tonight<figure><img src="https://images.theconversation.com/files/415336/original/file-20210810-23-1nsuofz.jpg?ixlib=rb-1.1.0&rect=49%2C0%2C5472%2C3645&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">
</span> <span class="attribution"><span class="source">Kaitlyn Baker/Unsplash</span>, <a class="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA</a></span></figcaption></figure><p>As the appointed hour for tonight’s census approaches, the question on many lips is: will it go smoothly, or will it be a repeat of the infamous 2016 #Censusfail? </p>
<p>Australians may remember the chaotic 40-hour shutdown suffered by the census website from 7:30pm on census night back in 2016. Fingers of blame were pointed in all directions, and the Australian Bureau of Statistics (ABS) suffered a heavy blow to its reputation. </p>
<p>A <a href="https://www.abs.gov.au/websitedbs/d3310114.nsf/Home/Assuring+Census+Data+Quality">forensic audit</a> later revealed multiple causal factors, not least of which was a series of malicious <a href="https://www.csoonline.com/article/3222095/ddos-explained-how-denial-of-service-attacks-are-evolving.html">“distributed denial of service” (DDoS) attacks</a>. This type of attack aims to paralyse a website by bombarding it with too many requests at once, typically from many machines simultaneously.</p>
<h2>What happened in 2016?</h2>
<p>In essence, the online platform used in 2016 had insufficient built-in safeguards against DDoS attacks. This led to a hardware failure and the ultimate collapse of the system. </p>
<p>It is also possible the large number of legitimate access requests from people simply trying to complete their census contributed to the failure. The ABS later claimed the technology infrastructure was <a href="https://www.smh.com.au/politics/federal/bureau-of-statistics-looks-to-avoid-the-mistakes-of-censusfail-20190101-p50p33.html">inadequate</a> for the job at hand, despite assurances from its provider, IBM.</p>
<p>After the DDoS attacks, system monitors reported what appeared to be an unusually large amount of outbound traffic, which suggested confidential data were being exfiltrated. The ABS shut everything down to prevent further data loss. </p>
<p>It was later found that the unusual outbound traffic reading had been false. There was no loss of confidential data. </p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/drowning-by-averages-did-the-abs-miscalculate-the-census-load-63752">Drowning by averages: did the ABS miscalculate the Census load?</a>
</strong>
</em>
</p>
<hr>
<h2>How will 2021 be different?</h2>
<p>The 2021 census is being coordinated by <a href="https://www.pwc.com.au/">PricewaterhouseCoopers</a>, one of the largest professional services networks in the world. </p>
<p>Moreover, the online platform will run on <a href="https://www.zdnet.com/article/australian-2021-digital-census-to-be-built-on-aws/">Amazon Web Services</a>, by far the largest cloud computing services provider in the world. It is certified to handle “protected workloads”, which means the Australian Signals Directorate has signed off on its trustworthiness to host citizens’ data.</p>
<p>With these choices, the ABS has minimised the risk of a 2016 repeat. </p>
<figure class="align-center ">
<img alt="Hands using a laptop and smart phone at a desk" src="https://images.theconversation.com/files/415352/original/file-20210810-13-58onjv.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/415352/original/file-20210810-13-58onjv.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=401&fit=crop&dpr=1 600w, https://images.theconversation.com/files/415352/original/file-20210810-13-58onjv.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=401&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/415352/original/file-20210810-13-58onjv.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=401&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/415352/original/file-20210810-13-58onjv.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/415352/original/file-20210810-13-58onjv.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/415352/original/file-20210810-13-58onjv.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Protecting citizens’ data is paramount.</span>
<span class="attribution"><span class="source">Christina/Wocintechchat/Unsplash</span>, <a class="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA</a></span>
</figcaption>
</figure>
<p>Also providing advice on creating an all-round secure digital census platform are the <a href="https://www.cyber.gov.au/">Australian Cyber Security Centre</a> and the <a href="https://www.dta.gov.au/">Digital Transformation Agency</a>.</p>
<p>To pay for all of this, the ABS was allocated A$38.3 million over three years in the 2019-20 federal budget.</p>
<h2>Census website opened early</h2>
<p>Because the census website opened on July 28, responses are spread over about two weeks, so there should be less of a traffic spike on census night itself. </p>
<p>From July 28, Australians began receiving letters with their login ID and password. They could log in immediately to complete their censuses. </p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/census-2021-is-almost-here-whats-changed-since-censusfail-whats-at-stake-in-this-pandemic-survey-164784">Census 2021 is almost here — what's changed since #censusfail? What's at stake in this pandemic survey?</a>
</strong>
</em>
</p>
<hr>
<p>There have been informal reports that people have had difficulty logging on because it appeared from the letter that there were spaces in the sequence of nine characters that make up the password. The password was grouped into three lots of three characters on the letter. </p>
<p>But if the spaces are entered, the login fails. There should be no spaces in the password entered into the census website.</p>
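<p>A login form can guard against this kind of confusion by removing whitespace before checking the code. The following is a minimal sketch only: the function name, the nine-character default and the sample code are illustrative assumptions, not a description of how the census website actually handles input.</p>

```python
def normalise_login_code(raw: str, expected_length: int = 9) -> str:
    """Remove all whitespace from a code printed in groups, e.g. 'AB3 9KD 7QZ'.

    Hypothetical helper: the name and the default length are assumptions
    based on the nine-character password described in the letter.
    """
    # str.split() with no arguments splits on any run of whitespace,
    # so joining the pieces drops spaces, tabs and newlines alike.
    compact = "".join(raw.split())
    if len(compact) != expected_length:
        raise ValueError(
            f"expected {expected_length} characters, got {len(compact)}"
        )
    return compact


# A made-up code, typed the way it is grouped on the letter.
print(normalise_login_code("AB3 9KD 7QZ"))
```

<p>Accepting the code with or without spaces means the way it is grouped on the page no longer matters to the person typing it in.</p>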
<h2>What makes a website resilient?</h2>
<p>Resilient websites are those that are better able to withstand attacks in the first place, and — if a failure caused by excessive load or a cyber attack does happen — can recover with a minimum of downtime.</p>
<p>It is no great mystery how to do this. It is a matter of good engineering and ample resources. Around the world, there is a growing number of businesses whose livelihood depends on having a resilient website. Providers of web services like Amazon’s AWS and Microsoft’s Azure must guarantee these high levels of service, to win and keep these clients’ business. </p>
<p>This is the level of resilience the census platform is using.</p>
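<p>One small building block for coping with excessive load is request throttling. The sketch below shows a token-bucket limiter, a widely used general technique. It is an illustration under stated assumptions, not a description of what the census platform or any cloud provider actually runs, and throttling of this kind is not a complete DDoS defence on its own. The class name and parameters are hypothetical.</p>

```python
import time
from typing import Optional


class TokenBucket:
    """Allow short bursts up to `capacity`, then `rate_per_second` on average.

    Hypothetical illustration; names and parameters are assumptions.
    """

    def __init__(self, rate_per_second: float, capacity: float,
                 start: Optional[float] = None) -> None:
        self.rate = rate_per_second
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = time.monotonic() if start is None else start

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True if a request may proceed at time `now` (seconds)."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Five requests arriving in the same instant against a bucket of three.
bucket = TokenBucket(rate_per_second=1.0, capacity=3.0, start=0.0)
burst = [bucket.allow(now=0.0) for _ in range(5)]
print(burst)
```

<p>In this sketch the first three requests in the burst are served and the rest are refused until the bucket refills, which keeps a sudden flood from consuming all of a server's capacity.</p>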
<h2>How will we know if 2021 is a success?</h2>
<p>The 2016 census was Australia’s first “digital-first” census. It seems likely the <a href="https://www.smh.com.au/politics/federal/bureau-of-statistics-looks-to-avoid-the-mistakes-of-censusfail-20190101-p50p33.html">lessons</a> from that bumpy first outing have been learned. </p>
<p>Moreover, top-shelf service providers have been engaged, and sufficient funding secured. With the arrangements currently in place, we can expect tonight’s census to be a <a href="https://www.zdnet.com/article/australian-bureau-of-statistics-on-track-to-avoid-censusfail-2-0-come-august-10/">success</a>. </p>
<p>But there can be no absolute guarantees. We live in a world in which cyber-attacks from unfriendly <a href="https://www.abc.net.au/news/2021-01-11/australians-turning-point-on-cyber-security-cyberattacks-crime/13018884">nation states</a>, organised criminals, <a href="https://en.wikipedia.org/wiki/Hacktivism">hacktivists</a> and garden-variety cyber-crooks are a daily occurrence. </p>
<p>The good news is that Australia’s ability to fend off this malicious disruption is improving every day.</p><img src="https://counter.theconversation.com/content/165806/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>David Tuffley does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>Switching web service providers and providing almost $40 million from the federal budget means the census 2021 website should be safe from crashing at the crucial time this evening.David Tuffley, Senior Lecturer in Applied Ethics & CyberSecurity, Griffith UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/1647842021-07-27T19:55:10Z2021-07-27T19:55:10ZCensus 2021 is almost here — what’s changed since #censusfail? What’s at stake in this pandemic survey?<figure><img src="https://images.theconversation.com/files/413072/original/file-20210726-23-1ftg1zp.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">
</span> <span class="attribution"><span class="source">David Crosling/AAP</span></span></figcaption></figure><p>Australian households will begin receiving instructions on how to fill out the 2021 census from early August. </p>
<p>The Census of Population and Housing is held every five years and counts every person and household in Australia. But this is the <a href="https://www.nla.gov.au/research-guides/statistics/statistics-population-and-census-reports">first time</a> the count will be held during a global pandemic, amid <a href="https://www.abc.net.au/news/2021-07-22/covid-19-lockdown-acts-of-kindness/100307768">lockdowns</a> and the rising health and economic impacts of COVID-19.</p>
<p>Census data are crucial to what we know <a href="https://quickstats.censusdata.abs.gov.au/census_services/getproduct/census/2016/communityprofile/036?opendocument">about Australia</a>: who lives here, and how and where people live. Data from the census inform vital services and infrastructure, including education, healthcare, transport and welfare.</p>
<h2>Census 2021</h2>
<p>August 10 is the official census date, but things will be done a little differently in 2021. This year, Australia’s <a href="https://www.abs.gov.au/statistics/people/population/household-and-family-projections-australia/latest-release#what-if-">10 million households</a> will receive census login information or hard copy forms in the mail from next week.</p>
<p>The Australian Bureau of Statistics is encouraging people to complete the census as soon as they receive their instructions, if they know where they’ll be on August 10. In previous years you had to fill in your form on census night.</p>
<h2>The 2016 ‘fail’</h2>
<p>Australia’s last census was associated with great controversy stemming from the “<a href="https://www.abs.gov.au/ausstats/abs@.nsf/mediareleasesbyReleaseDate/EC8D47BE72A97E7ECA257E9A00131583?OpenDocument">digital-first</a>” strategy (where the majority of Australians would do the census online for the first time) and bureau plans to <a href="https://theconversation.com/census-2016-should-you-be-concerned-about-your-privacy-63206">keep names and addresses</a> for up to four years, to boost anonymous links with other data.</p>
<p>This was accompanied by federal politicians saying they would <a href="https://www.abc.net.au/news/2016-08-09/scott-ludlam-wont-put-name-on-census-form/7703380">refuse</a> to put their names on the census, citing privacy concerns, and a campaign <a href="https://www.news.com.au/finance/economy/australian-economy/drawing-a-dck-on-the-census-doesnt-make-you-cool-it-makes-you-a-dck/news-story/5bf4b008437a07ab152144e1e7c69386">to deface</a> census forms.</p>
<figure class="align-center ">
<img alt="A screen shot of a blocked census form in 2016." src="https://images.theconversation.com/files/413087/original/file-20210726-17-e2nf3o.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/413087/original/file-20210726-17-e2nf3o.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/413087/original/file-20210726-17-e2nf3o.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/413087/original/file-20210726-17-e2nf3o.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/413087/original/file-20210726-17-e2nf3o.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/413087/original/file-20210726-17-e2nf3o.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/413087/original/file-20210726-17-e2nf3o.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">The #censusfail in 2016 was a huge embarrassment for the federal government.</span>
<span class="attribution"><span class="source">Joel Carrett/AAP</span></span>
</figcaption>
</figure>
<p>Then came #censusfail. </p>
<p>Distributed denial of service attacks on census night saw the online questionnaire platform shut down and remain offline for nearly <a href="https://parlinfo.aph.gov.au/parlInfo/download/publications/tabledpapers/a41f4f25-a08e-49a7-9b5f-d2c8af94f5c5/upload_pdf/Review%20of%20the%202016%20eCensus%20-%20final%20report.pdf;fileType=application%2Fpdf#search=%22publications/tabledpapers/a41f4f25-a08e-49a7-9b5f-d2c8af94f5c5%22">two days</a>. </p>
<p>While data quality <a href="https://www.abs.gov.au/websitedbs/d3310114.nsf/home/Independent+Assurance+Panel/%24File/CIAP+Report+on+the+quality+of+2016+Census+data.pdf">was not</a> compromised, it was nevertheless a huge embarrassment for the bureau and the Turnbull government.</p>
<h2>What’s changed in terms of set-up?</h2>
<p>Lessons have since been <a href="https://parlinfo.aph.gov.au/parlInfo/download/publications/tabledpapers/a41f4f25-a08e-49a7-9b5f-d2c8af94f5c5/upload_pdf/Review%20of%20the%202016%20eCensus%20-%20final%20report.pdf;fileType=application%2Fpdf#search=%22publications/tabledpapers/a41f4f25-a08e-49a7-9b5f-d2c8af94f5c5%22">learned</a> and these are seen in preparations for Census 2021.</p>
<p>The new window to complete the census, rather than a one-night burst, will help ease online bottlenecks and external threats. It will also reduce pressure on the many Australians in lockdown, juggling paid work and home schooling.</p>
<figure class="align-center ">
<img alt="Commuters crowd into Town Hall station in Sydney." src="https://images.theconversation.com/files/413093/original/file-20210726-15-4y5mz0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/413093/original/file-20210726-15-4y5mz0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/413093/original/file-20210726-15-4y5mz0.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/413093/original/file-20210726-15-4y5mz0.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/413093/original/file-20210726-15-4y5mz0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/413093/original/file-20210726-15-4y5mz0.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/413093/original/file-20210726-15-4y5mz0.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">The 2021 Census will collect information about more than 25 million Australians.</span>
<span class="attribution"><span class="source">Peter Rae/AAP</span></span>
</figcaption>
</figure>
<p>Neighbourhoods won’t be graced by an army of census workers this time, either. The bureau is expecting <a href="https://www.abs.gov.au/media-centre/media-releases/2021-census-ready-operate-covid-safe-way">the overwhelming majority</a> of people to complete the census online, with reminders sent out by mail.</p>
<p>So the digital-first strategy that caused such a stir in 2016 was an important trial run for the contactless conditions necessary during a pandemic. Some other <a href="https://rtc-cea.cepal.org/sites/default/files/document/files/UNFPA_Census_COVID19_digital.pdf">countries</a> have <a href="https://www.scotlandscensus.gov.uk/news-and-events/">postponed</a> their national census programs (like Scotland), while others have risked <a href="https://www.thejakartapost.com/news/2020/04/03/2020-census-extended-due-to-low-participation-covid-19-woes.html">COVID-19 exposure</a> by going ahead regardless (like Indonesia). But Australia’s preparations will enable a vital undertaking to continue safely.</p>
<h2>What’s changed in terms of the questions?</h2>
<p>According to the bureau, this year will include the “first significant changes to the information collected in the census since 2006”. (<a href="https://www.theguardian.com/business/grogonomics/2016/aug/11/lesson-of-censusfail-continued-funding-cuts-mean-agencies-cant-do-their-job">Funding cuts</a> since 2001 have previously prevented questionnaire refreshes.)</p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/census-2016-reveals-australia-is-becoming-much-more-diverse-but-can-we-trust-the-data-79835">Census 2016 reveals Australia is becoming much more diverse – but can we trust the data?</a>
</strong>
</em>
</p>
<hr>
<p>The 2021 census will include new questions about long-term health conditions and defence force service. Sex beyond the binary of male/female will also be collected for the first time for all. These new additions to the census have been made possible by the removal of the household internet connection question.</p>
<p>Improvements have also been made to better capture language and ancestry of First Nations Australians.</p>
<p>Census questions still have some way to go to better reflect contemporary Australia. But any changes to the census need to be understood by all.</p>
<p>Sexual orientation and <a href="https://www.australianpopulationstudies.org/index.php/aps/article/view/80">gender identity</a>, living in <a href="https://www.australianpopulationstudies.org/index.php/aps/article/view/75">more than one place</a>, and <a href="https://www.australianpopulationstudies.org/index.php/aps/article/view/82">ethnicity</a> are among improvements identified by demographers and social researchers for Census 2026, for example.</p>
<h2>What will we get out of Census 2021?</h2>
<p>The census has the power to say much about a nation and how populations are changing. While there will be no specific questions on COVID-19, the data will provide valuable insights into the impacts of the coronavirus on Australians. With the 2016 data now five years old, more up-to-date information is needed to make plans for the future.</p>
<p>With so many people in Australia in lockdown, the census will gauge the economic and social impacts of COVID-19 in a way no other data undertaking has been able to achieve yet. Individuals, communities and economic activities affected by COVID-19 will be reflected.</p>
<p>Census 2021 is no ordinary population survey – it will lay the foundation for Australia’s post-pandemic future by informing the nation’s social and economic recovery, including measuring the success of the vaccination rollout through improved population data. It’s more important than ever that we get this census right.</p>
<p>Results from Census 2021 will become available from <a href="https://www.abs.gov.au/statistics/research/2021-census-topics-and-data-release-plan">June next year</a>.</p>
<h2>The future of the census</h2>
<p>A number of countries, such as <a href="https://sites.nationalacademies.org/cs/groups/dbassesite/documents/webpage/dbasse_088800.pdf">The Netherlands</a>, have moved away from traditional census taking, opting instead to compile data from routine government records collected through administrative interactions. In Australia, this would be like the government compiling Medicare and Centrelink data in place of your <a href="https://theconversation.com/in-a-world-awash-with-data-is-the-census-still-relevant-70642">census submission</a>. </p>
<p>The Australian Statistician, David Gruen, has foreshadowed <a href="https://www.afr.com/politics/federal/2021-census-could-be-australia-s-last-five-yearly-population-snapshot-20201207-p56l6n">such a possibility</a> for Australia. The <a href="https://www.bbc.com/news/uk-51468919">United Kingdom</a> is also thinking about it. This approach is a concern because it excludes individuals and communities from a vital participatory undertaking, and data quality suffers when people can no longer self-report information.</p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/in-a-world-awash-with-data-is-the-census-still-relevant-70642">In a world awash with data, is the census still relevant?</a>
</strong>
</em>
</p>
<hr>
<p>In its current form, census data is accessible to, and contributed to by, all. Australia’s census data give everyone, from researchers to policymakers to ordinary individuals, the power to hold <a href="https://theconversation.com/why-has-victoria-struggled-more-than-nsw-with-covid-to-a-demographer-theyre-not-that-different-161996">government to account</a>.
It belongs to all of us.</p><img src="https://counter.theconversation.com/content/164784/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Liz Allen worked at the Australian Bureau of Statistics (ABS) between 2006 and 2007. Liz has no ongoing employment or financial links with the ABS. Liz is a user of ABS data for research purposes.</span></em></p>Census 2021 is no ordinary population survey – it will lay the foundation for Australia’s post-pandemic future.Liz Allen, Demographer, ANU Centre for Social Research and Methods, Australian National UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/1064402018-11-11T19:00:45Z2018-11-11T19:00:45ZThe promise and problems of including ‘big data’ in official government statistics<figure><img src="https://images.theconversation.com/files/244495/original/file-20181108-74783-18n6i26.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Official statistics help to shape a population’s sense of itself.</span> <span class="attribution"><a class="source" href="https://www.flickr.com/photos/freedomiiphotography/6620088231/in/photolist-b5ZFAg-9yzYX2-9asg9b-q89uvh-hGjD1o-4cTa1X-81FtLe-c4c9pj-o6ivAa-b47brz-qBVWwq-cmHxzy-HC9ibF-o8Vwrd-p6jXYu-9NF1UU-VkbJGC-9bQKLq-WTat1Y-aCPqgs-YcriLc-YguVCT-VbfGWj-beNPri-6aj9DG-qa9bQU-9X9ozH-6pPJRy-iMxXzU-7uczNL-gRSYzW-e9nx89-6pveSc-cUahNb-792wUY-99hG9D-oHKxtn-29EcDuA-X12yJZ-d5TeL3-UwFrUr-29mmZ5d-3JKT7K-MyjFo-ipyJF6-bArD5Y-beNPxa-Reu1Sg-HXnyPF-4stcKY">Izumo Taisha/Flickr</a>, <a class="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA</a></span></figcaption></figure><p>The Australian Bureau of Statistics (ABS) will soon <a href="http://www.abs.gov.au/census-consult">announce</a> the kinds of information it will collect in the next national census in 2021. If international trends are a guide, “<a href="https://arxiv.org/pdf/1309.5821.pdf">big data</a>” will comprise a growing part of ABS data collection and analysis. </p>
<p>This may promise greater timeliness and efficiency compared to the traditional paper-based census, but using big data to measure populations and economies is not without challenges. </p>
<p>Debates about how democratic governments should count the people they serve are ongoing in Australia, <a href="https://www.npr.org/2018/11/04/661932989/how-the-2020-census-citizenship-question-ended-up-in-court">the US</a> and in <a href="https://www.thehindu.com/opinion/lead/the-tools-for-counting/article24247791.ece">India</a>. The use of digital technologies for state measurement seems likely to intensify these debates as significant questions emerge around the practice.</p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/is-facebook-the-future-of-the-national-census-97018">Is Facebook the future of the national census?</a>
</strong>
</em>
</p>
<hr>
<h2>Public data gathering has high stakes</h2>
<p>For centuries, <a href="https://www.britannica.com/science/census">states have counted</a> and categorised people. Census data and other official statistics are used for government planning and <a href="https://www.aph.gov.au/binaries/library/pubs/bn/sp/schoolsfunding.pdf">budgeting</a>, to determine <a href="https://aec.gov.au/faqs/Redistributions.htm">political districts for elections</a>, and for many other purposes. Official statistics also help to shape a population’s sense of itself. For these reasons, state counting practices have often been controversial. </p>
<p>In Australia, <a href="http://abs.gov.au/ausstats/abs@.nsf/Lookup/2071.0Feature+Article3July+2011">changing census practice</a> has been part of an ongoing debate about ensuring First Nations people are properly represented. Historic undercounting of <a href="https://www.treatyrepublic.net/content/1967-referendum-important-facts-and-interesting-pieces-information">Aboriginal and Torres Strait Islander people</a> was redressed by the abandonment of language in the census that referred to blood quantums – which are now widely accepted as racist – alongside <a href="http://www.abs.gov.au/ausstats/abs@.nsf/cat/4708.0">other factors</a>.</p>
<p>In the US, state counting is likewise a matter of <a href="http://time.com/5217151/census-questions-citizenship-controversy/">intense dispute</a>. California is among those states currently <a href="https://www.brennancenter.org/legal-work/state-california-v-wilbur-l-ross-jr">suing</a> the US Federal Government because of a question about citizenship status the Trump administration has proposed adding to the 2020 Census. California argues fewer non-citizens will complete the census if the question is included. This would lead to a lower population count and reduced federal funding for states with high numbers of non-citizens. </p>
<p>India has also seen heated national debate about the <a href="https://www.dw.com/en/india-debates-the-inclusion-of-caste-in-census/a-5611305">gathering of caste data</a> and <a href="https://www.change.org/p/hon-ble-prime-minister-of-india-stop-regarding-housewives-under-category-of-non-worker-in-the-census-of-india-2021">the categorisation of “housewives” as non-workers</a>.</p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/can-the-census-ask-if-youre-a-citizen-heres-whats-at-stake-in-court-battles-over-the-2020-census-101170">Can the census ask if you're a citizen? Here's what's at stake in court battles over the 2020 census</a>
</strong>
</em>
</p>
<hr>
<h2>Big data use in official statistics is growing</h2>
<p>New issues of this kind are likely to emerge as government statistics offices around the world introduce digital data into their work. </p>
<p>The UN is currently spearheading efforts by member states to explore <a href="https://unstats.un.org/bigdata/">the use of new, digital data sources and technologies for official statistics</a>. The ABS is involved in this endeavour. Since late 2017, for example, the ABS has been <a href="http://www.abs.gov.au/ausstats/abs@.nsf/Latestproducts/6401.0.60.004Main%20Features802017">analysing supermarket scanner data to try to improve CPI (inflation) measurement</a>. </p>
<p>Other possibilities being explored for the use of digital data to improve state measurement include:</p>
<ul>
<li><p>using anonymised mobile phone data – bought from or donated by commercial providers – for <a href="http://tsf2016venice.enit.it/images/articles/Presentations/s1/1.3%20Analysing%20call%20detail%20records%20to%20support%20tourism%20statistics%20in%20Saudi%20Arabia%20-%20An%20Exploratory%20Study.pdf">tourism statistics</a>, to understand <a href="https://www.researchgate.net/publication/233346836_Inferring_patterns_of_internal_migration_from_mobile_phone_call_records_evidence_from_Rwanda">internal movement</a>, <a href="https://www.bbc.com/news/uk-41898318">commuter flows</a>, and <a href="http://publications.jrc.ec.europa.eu/repository/bitstream/JRC96568/lb-na-27361-en-n.pdf">population distribution</a>, and to try to <a href="https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-017-0099-3">estimate characteristics</a> of particular population sectors</p></li>
<li><p>web-scraping (extracting publicly available information from websites) to <a href="https://unstats.un.org/unsd/trade/events/2014/Beijing/presentations/day2/afternoon/4.%20Web%20scraping%20for%20Labour%20Statistics--Emanuele%20Baldacci.pdf">estimate labour force participation</a>, or using Google Trends to try to “nowcast” (get immediately up-to-date information on) <a href="http://ifsd.ca/web/default/files/Presentations/Reports/17012%20-%20Nowcasting%20Unemployment%20Rate%20with%20Google%20Trends%20-%20Final.pdf">unemployment</a></p></li>
<li><p>analysing satellite image and remote sensing data to <a href="http://publications.jrc.ec.europa.eu/repository/bitstream/JRC77375/lbna25643enn.pdf">estimate crop planting and predict harvest yield</a>. </p></li>
</ul>
<h2>The promise and the problems</h2>
<p>The aim of these efforts is to make official statistics more accurate, more affordable to gather, and more attentive to geographically remote or otherwise marginalised communities. While there may be enormous potential to improve official statistics in these ways, big data use for state measurement raises thorny issues. </p>
<p>The first of these is the difficulty of auditing such data sources. All datasets come with blind spots and biases. Given the contentiousness of state counting, and the potentially high stakes of miscounting, it’s important the public maintains an overall sense of – and capacity to query – how, where, and why data is being collected. This may be difficult to ensure when data used for official measures are privately sourced. </p>
<p>While the ABS has the <a href="https://www.legislation.gov.au/Details/C2016C01005">legal right to compel the provision of information</a>, including from data providers, insight into how private companies collect and process data may be hard to obtain, and may not be shareable publicly. </p>
<p>Reliance on commercial data sources could also leave official statisticians dependent on privately owned infrastructure – cell tower infrastructure, for instance. The <a href="https://www.news.com.au/technology/gadgets/government-telstra-and-vodafone-boost-regional-phone-coverage/news-story/a4a949ce1734e1456c77772fd1381080">distribution</a> and maintenance of this infrastructure is driven by commercial interests, potentially working against the needs of responsible public data collection. </p>
<p>Another problem with the use of big data in official statistics is that data gathered are often not fit for the kinds of purposes states are pursuing. Data of this kind are messy and unstructured, and it can be hard to <a href="http://journals.sagepub.com/doi/abs/10.1177/1461444818769236">separate information from noise</a> in their analysis. Because machine-learning methods for unstructured data are never 100% accurate, any inferences drawn must be carefully validated. </p>
<p>Statisticians are well <a href="http://journals.sagepub.com/doi/full/10.1177/2053951714538417">aware of these limitations</a>, but face challenges communicating with policymakers and the <a href="https://digitalblog.ons.gov.uk/2018/03/29/challenges-communicating-statistics-wider-audience/">general public about them</a>. </p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/democracy-is-in-danger-when-the-census-undercounts-vulnerable-populations-93027">Democracy is in danger when the census undercounts vulnerable populations</a>
</strong>
</em>
</p>
<hr>
<h2>Enthusiasm must not outrun public engagement</h2>
<p>There is a risk that because digital data are relatively abundant, those in charge of state measurement practices will make use of that data without due regard to questions of what should, and should not, be measured for particular purposes. </p>
<p>Without knowing when and how they are being counted, the public cannot be part of that discussion. It is incumbent on governments to bridge that gap, and incumbent on all Australians to take an active interest in these practices as they develop.</p>
<p class="fine-print"><em><span>Fleur Johns, Wayne Wobcke and Caroline Compton are engaged in research related to the subject matter of this article that is funded in part through the Australian Research Council's Discovery Projects funding scheme (project DP180100903). The views expressed herein are those of the authors and not those of the Australian Government or the Australian Research Council.</span></em></p><p class="fine-print"><em><span>Caroline Compton is engaged in research related to the subject matter of this article that is funded in part through the Australian Research Council's Discovery Projects funding scheme (project DP180100903). The views expressed herein are those of the author and not those of the Australian Government or the Australian Research Council.
</span></em></p>Digital technologies put an abundance of data at our fingertips, but we must ensure questions of what should, and should not, be measured are answered before we use them in official statistics.Fleur Johns, Professor of Law and Associate Dean (Research), UNSW SydneyCaroline Compton, Postdoctoral research associate, UNSW SydneyWayne Wobcke, Associate Professor, UNSW SydneyLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/801432017-06-28T03:56:10Z2017-06-28T03:56:10ZCensus shows increase in children with disability, but even more are still uncounted<figure><img src="https://images.theconversation.com/files/175947/original/file-20170628-15714-77mlkw.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Some people with disabilities may not require government supports, meaning they wouldn't have been counted as having a disability in the Census. </span> <span class="attribution"><span class="source">from www.shutterstock.com.au</span></span></figcaption></figure><p>The 2016 Census has revealed an increase in the number of children with disability, up nearly 40,000 since 2011. One explanation is that the census now counts disability differently, which is more in line with the way many children and families view disability.</p>
<p>But other children continue to miss out on support because they do not name their needs as “disability”. And services don’t yet have adequate funding for even the revealed number of children, so other children who require assistance are left out. </p>
<p>A census that counts people who identify as having a disability, as well as those who need support, could help resolve these problems.</p>
<h2>Defining disability</h2>
<p>The proportion of children and young people who need support related to disability has risen from 2% to 2.6% – or 38,309 more children than in 2011. The most striking change is among boys with disability aged 5-14 years, who now make up 4.4% of all boys their age. These rates are <a href="http://www.censusdata.abs.gov.au/census_services/getproduct/census/2016/communityprofile/036?opendocument">even higher for Aboriginal and Torres Strait Islander</a> children – 7.4% of boys aged 5-14 years and 4.4% of all children and young people aged 0-19 years. </p>
<p>The <a href="http://www.abs.gov.au/ausstats/abs@.nsf/Lookup/2901.0Chapter702016">census counts disability as</a> “has need for assistance”, <a href="http://www.abs.gov.au/websitedbs/censushome.nsf/4a256353001af3ed4b2562bb00121564/ee5261c88952cf90ca257aa10005f567!OpenDocument">which it defines as</a> “profound or severe core activity limitation”. The definition was introduced in the 2006 Census to be <a href="http://www.who.int/classifications/icf/en/">consistent with international measures</a> and <a href="http://www.abs.gov.au/ausstats/abs@.nsf/mf/4430.0">other national surveys</a>, which focus on counting support needs. Before 2006, disability was not counted at all. The continued increase each census since 2006 is probably due to more Australians identifying with the definition or seeing the benefit of identifying as disabled, now that policies to support disability are changing.</p>
<p>Knowing who the definition covers is important. The census count of “need for assistance” is good to inform government planning about high levels of support some people need to participate equally in our communities. Estimating the number of people likely to need a National Disability Insurance Scheme (NDIS) package is a current priority. This census counted 562,629 people aged under 65 years – over 100,000 more than the <a href="https://www.ndis.gov.au/about-us/what-ndis.html">NDIS planning estimates</a>.</p>
<p>Equally important for children is planning access and support in school, playgrounds and other places where children participate in their families and communities. The higher 2016 Census count shows these plans need to expand.</p>
<h2>Who isn’t counted?</h2>
<p>The census question only counts people with high needs, not all people with disability. Unfortunately, the question is not complemented with an identity question about whether you have a disability. This means people with disability who do not need assistance – for example, some people who are blind – are not counted. The <a href="http://www.who.int/disabilities/world_report/2011/en/">World Health Organisation estimates</a> the larger total would be closer to 15% of all Australians, rather than the 5.1% measured in this census.</p>
<p>This gap means another 10% of Australians are not officially counted, yet they too face barriers to participation, <a href="http://www.tandfonline.com/doi/full/10.1080/15017419.2016.1222303">including access and attitudes</a>. </p>
<p>Disability advocates <a href="http://thestringer.com.au/census-fail-makes-disabled-australians-grin-a-bit-12107#.WVHr1YSGO71">consistently express concern</a> that by not asking Australians directly about their disability or impairment, the census fails to count the population of people with disability accurately – it only captures people who need assistance. </p>
<p>Fixing this gap is important for Australia’s obligations to all Australians under the <a href="http://www.un.org/disabilities/documents/convention/convoptprot-e.pdf">United Nations Convention on the Rights of Persons with Disabilities</a>. The NDIS relies on better access to social and economic life for all people with disability, including people not eligible for NDIS packages. </p>
<p>Not gathering information about this 10% of our population is a missed opportunity. It means we simply don’t know how many people with disability may benefit from, and contribute to, more accessible communities and new social and economic opportunities. For children, this is critical to having an inclusive community as a foundation. </p>
<h2>Views about disability</h2>
<p>Counting disability is complicated because it’s rarely the way children see themselves. Rather, <a href="https://academic.oup.com/cdj/article-abstract/50/4/724/349042/Building-belonging-and-connection-for-children?redirectedFrom=fulltext">they speak about what supports them</a> to feel a sense of belonging in their local school and community and what helps them build real friendships and relationships. They also talk about the barriers that make belonging difficult, like <a href="http://www.tandfonline.com/doi/abs/10.1080/13603116.2012.676081">loneliness, ill-treatment and lack of support</a>.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/175981/original/file-20170628-25846-11hb3jd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/175981/original/file-20170628-25846-11hb3jd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/175981/original/file-20170628-25846-11hb3jd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/175981/original/file-20170628-25846-11hb3jd.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/175981/original/file-20170628-25846-11hb3jd.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/175981/original/file-20170628-25846-11hb3jd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/175981/original/file-20170628-25846-11hb3jd.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/175981/original/file-20170628-25846-11hb3jd.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Children don’t define themselves by their disability, but rather what makes them feel supported and included.</span>
<span class="attribution"><span class="source">from www.shutterstock.com</span></span>
</figcaption>
</figure>
<p>Children and young people with disability are <a href="http://www.sciencedirect.com/science/article/pii/S027795360800275X?via%3Dihub">often positioned as passive recipients</a> needing assistance through family, friends and services. <a href="https://www.routledge.com/Children-Young-People-and-Care/Horton-Pyer/p/book/9781138920880">Research</a> with children and young people themselves, however, shows they want to be recognised for their active contribution to their families and wider networks. Their positive identity is more important to them than their support needs.</p>
<p>One of the interesting changes since the introduction of the NDIS is that families and service providers are now also using the “need for assistance” definition of disability, which is consistent with the inclusive vision from the UN Convention. Their advocacy with this definition means <a href="https://www.sprc.unsw.edu.au/media/SPRCFile/ECI_Review_Final_Report.pdf">support for young children in Australia</a> has expanded already even though the NDIS is still growing.</p>
<p>Children receiving disability support are now more likely to use it while they are with other children in their community, rather than in separate services. Families’ capacity to demand these inclusive services recognises the rights of their children to <a href="https://www.un.org/development/desa/disabilities/convention-on-the-rights-of-persons-with-disabilities/article-7-children-with-disabilities.html">get the support they need to enjoy their childhood</a> and have the same options as their peers in the future. These trends are also consistent with the insurance approach of the NDIS: that assistance now is an investment for later.</p>
<h2>Funding and support</h2>
<p>The increase in the numbers of children and young people with disability may reflect families’ optimism about having their children’s needs met in the new NDIS world. The scheme certainly promises to replace the long waiting lists and capped places of previous systems. The census numbers reinforce the higher number of children in the NDIS than expected, <a href="http://www.pc.gov.au/inquiries/current/ndis-costs/position">which is upsetting NDIS estimates</a>. The NDIS has detailed data about people using the scheme. This will not resolve the question about the total number of people with disabilities though. People receiving NDIS packages are likely to be those already identified in census data as needing support.</p>
<p>Data collection in schools has also recently improved with the introduction of the <a href="http://www.schooldisabilitydatapl.edu.au/data-collection-steps/introduction-to-the-steps">Nationally Consistent Collection of Data</a> for school students with disability. Most children and young people participate in the school system, so these data will inform understanding about adjustments to support students in their education. </p>
<p>Bringing these large data sets together means we can understand the types of supports families need, and where there are service gaps between schools and the NDIS. </p>
<p>Lessons from data need to be discussed alongside the expectations and experiences of children, young people and families to ensure they’re getting the support they need. This will help children enjoy the opportunities of childhood, rather than the current disproportionate but necessary focus on dismantling barriers to belonging.</p>
<p class="fine-print"><em><span>Karen R Fisher receives funding from the Australian Research Council and state and federal governments. </span></em></p><p class="fine-print"><em><span>Sally Robinson receives research funding from the Australian Research Council and state and federal governments. </span></em></p>The census needs to count people who identify as having a disability, as well as those who require government support.Karen R Fisher, Professor, Social Policy Research Centre, UNSW SydneySally Robinson, Senior Research Fellow, Centre for Children and Young People, Southern Cross UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/706422017-01-05T21:06:57Z2017-01-05T21:06:57ZIn a world awash with data, is the census still relevant?<p><em>How we track our economy influences everything from government spending and taxes to home lending and business investment. In our series <a href="https://theconversation.com/au/topics/the-way-we-measure-34466">The Way We Measure</a>, we’re taking a close look at economic indicators to better understand what’s going on.</em></p>
<hr>
<p>The Australian Census came under intense scrutiny in the wake of <a href="https://theconversation.com/au/topics/censusfail-30048">#censusfail</a>. Parliament conducted a <a href="http://parlinfo.aph.gov.au/parlInfo/download/publications/tabledpapers/a41f4f25-a08e-49a7-9b5f-d2c8af94f5c5/upload_pdf/Review%20of%20the%202016%20eCensus%20-%20final%20report.pdf;fileType=application%2Fpdf#search=%22publications/tabledpapers/a41f4">review</a>, the Senate an <a href="http://www.aph.gov.au/Parliamentary_Business/Committees/Senate/Economics/2016Census/Report">inquiry</a>, and <a href="http://www.theaustralian.com.au/opinion/columnists/judith-sloan/census-2016-we-dont-need-it-so-why-persist/news-story/a21d99e9c968136442a141fd1a80ea16">some in the media</a> questioned the entire point. </p>
<p>But cost and <a href="https://theconversation.com/census-2016-should-you-be-concerned-about-your-privacy-63206">privacy concerns</a> aside, population is one of the three <a href="http://www.treasury.gov.au/%7E/media/Treasury/Publications%20and%20Media/Publications/2015/2015%20Intergenerational%20Report/Downloads/PDF/2015_IGR.ashx">pillars</a> of the economy. </p>
<p>Understanding population characteristics is vital to inform us of challenges and opportunities, and is a necessary input in other economic indicators. The high-quality, timely population data found in the census is not gathered through any other means. If anything needs to change, it is the discussion around the census.</p>
<h2>So we know who is where</h2>
<p>The census is unique in that it is a total survey of the population, covering a range of social and economic variables. At present, it is the only way such data is obtained in Australia. </p>
<p>Without the census, we wouldn’t know how many we are, who we are and where we live. This means important planning and policy issues couldn’t be addressed. The location of schools and hospitals, provision of medical facilities, funding for major infrastructure would all be done without an accurate idea of who is where. </p>
<p>In fact, local, state and federal governments rely heavily on data only available in the census. The number of children, working age population, travel to work information, occupations, housing suitability and vulnerable populations is all data only found in the census.</p>
<p>The census also allows for sub-national analyses to be performed, particularly <a href="https://www.legislation.gov.au/Details/C2015C00247">legislated</a> population estimates and projections. These estimates form the basis of economic indicators such as labour force statistics and <a href="https://theconversation.com/its-good-the-government-will-report-gdp-per-capita-but-it-shouldnt-stop-there-69638">gross domestic product per capita</a>.</p>
<p>The estimates and projections also highlight <a href="http://www.theage.com.au/comment/the-age-editorial/victorias-twospeed-economy-slumping-regions-need-urgent-policy-response-20161208-gt71z6.html">inequalities</a> within society, and provide opportunities for policy responses and development at a regional level.</p>
<p>But the purpose of taking a <a href="http://unstats.un.org/unsd/demographic/meetings/egm/NewYork/2014/P&R_Revision3.pdf">census</a> goes beyond informing resource allocation, taxation and electoral representation. </p>
<p>The statistical benchmarks used in surveys and studies, research and analysis, and, most importantly, lower level aggregates and groups of interest can only be informed by census data. Low level aggregates allow identification of need. Identification of areas with high proportions of young people who cannot access employment or education can provide much insight into barriers to economic participation.</p>
<p>Quality information about homelessness, minorities, and Indigenous populations is only truly obtained via a census.</p>
<h2>The data we already collect won’t do</h2>
<p>One of the <a href="http://www.theaustralian.com.au/opinion/columnists/judith-sloan/census-2016-we-dont-need-it-so-why-persist/news-story/a21d99e9c968136442a141fd1a80ea16">arguments against the census</a> is that we can get the same data elsewhere from the multitude of service providers that already come in contact with the public. </p>
<p>The problem is that these data collections are administrative. They’re collected for a reason and with limited scope. </p>
<p>Centrelink data is collected to provide a service. Information we provide to the tax office ensures tax compliance. Medicare doesn’t keep information about overseas nationals and people who have never had their birth registered, which is an issue in remote and Indigenous communities.</p>
<p>Australia’s large immigrant population would become a blind spot if we were to rely on the data the government already collects, as many aren’t eligible for certain government services. The data collected by Centrelink, the tax office and Medicare don’t provide sufficient scope. So far the census is the only data source that fits the bill.</p>
<h2>Some alternatives</h2>
<p>Population registers are a viable alternative to our five-yearly censuses. Finland uses a <a href="http://vrk.fi/en/population-information-system">computerised system</a> to record population data including births, deaths, marriages, migration and so on. The Netherlands, on the other hand, conducts a <a href="https://www.youtube.com/watch?v=SLpDkcyenf0">virtual census</a> by pulling together digital data from a number of different sources. </p>
<p>These registers offer real-time data, but they require ongoing maintenance and verification and often exceed the <a href="http://unstats.un.org/unsd/demographic/meetings/egm/NewYork/2014/P&R_Revision3.pdf">cost</a> of our census. Ironically, they also need to be checked against a census. And Germany’s experience shows population registers are not always <a href="http://www.nytimes.com/2013/06/01/world/europe/census-shows-new-drop-in-germanys-population.html">accurate</a>. </p>
<p>Further, major legislative changes would have to be passed for Australia to be able to pool data like this. The establishment of a national population register would be costly and demand interdepartmental government coordination.</p>
<p>We could also look to the United States’ method of conducting surveys in between a 10-yearly census. This mixed methodology was suggested by the ABS in <a href="http://www.smh.com.au/federal-politics/political-news/abbott-government-considers-axing-the-australian-census-to-save-money-20150218-13ieik">2015</a> to cut costs.</p>
<p>However, limited financial upside, together with lower quality data, makes it a risky alternative for Australia. Plus we shouldn’t think of the census as an unrecoverable cost. The Office for National Statistics in the United Kingdom estimated the costs of its 2011 census were <a href="https://www.unece.org/fileadmin/DAM/stats/documents/ece/ces/ge.41/2015/mtg1/D1_1110_Value_of_the_census__ONS_.pptx">recovered</a> in just over a year.</p>
<h2>The future is data</h2>
<p>So how can we improve our census?</p>
<p>Online census completion will save money, improve data quality and reduce data processing time. However, online collection must be balanced to ensure disadvantaged populations aren’t excluded. The end of the census collector hasn’t arrived just yet.</p>
<p>More importantly, we must define contemporary data needs moving into the future. An informed public conversation about migration, employment, families and our changing population is much needed to gain <a href="http://datafutures.co.nz/our-work-2/talking-to-new-zealanders/social-licence/">social licence</a> to collect and use relevant data. </p>
<p>Whether the methodology of the census continues as is or we introduce an alternative method of data collection, the key going forward is the question of legitimacy. Steps must be taken to justify the need to take a census, and to assuage privacy and security concerns. Without social licence we’ll see the failings of the 2016 census play out over and over again.</p>
<p>Australia’s future relies on strong evidence we can agree on. This isn’t solely the domain of researchers. We all have a stake.</p>
<p class="fine-print"><em><span>Dr Liz Allen worked at the Australian Bureau of Statistics (ABS) between 2006 and 2007. Liz has no ongoing employment or financial links with the ABS. Liz is a user of ABS data for research purposes.</span></em></p>The Australian Census has been taken since 1911. But is it still necessary in today’s world of mass digital data collection?Liz Allen, Postdoctoral Fellow, Centre for Aboriginal Economic Policy Research, Australian National UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/703962016-12-15T04:21:58Z2016-12-15T04:21:58ZServer down: what caused the ATO systems to crash<figure><img src="https://images.theconversation.com/files/150237/original/image-20161215-2478-i7ug7k.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">The ATO crash didn't involve a fire, but it almost looked that bad for a while.</span> <span class="attribution"><span class="source">Shutterstock</span></span></figcaption></figure><p>Many Australian Tax Office IT systems have been unavailable for days after a major fault, apparently caused by a problem with a large-scale storage server.</p>
<p>The ATO’s online systems, including its public website and portals for taxation agents, were down for several days. At the time of writing, the ATO <a href="http://lets-talk.ato.gov.au/ato-systems-update">reports</a> that most services are now operational but may experience slowdowns.</p>
<p>There were also reports that up to <a href="http://www.lifehacker.com.au/2016/12/ato-website-restored-after-two-days-one-petabyte-of-data-lost/">one petabyte of data was affected</a> by the fault. The ATO has reported that <a href="http://lets-talk.ato.gov.au/ato-systems-update">no taxpayer data have been lost</a>, although it is unclear whether any internal data have been lost.</p>
<h2>Outage in a SAN</h2>
<p>According to the ATO and media reports, the system outage was caused by a failure in a <a href="https://www.hpe.com/au/en/storage/3par.html">3PAR StoreServ</a> storage area network (SAN) made by Hewlett Packard Enterprise (HPE).</p>
<p>These devices contain racks full of hard disks and/or solid-state storage devices to store data on a gargantuan scale, and fast network interfaces to provide that data to the various “application servers” that provide the ATO’s online systems.</p>
<p>The two units purchased by the ATO were reportedly capable of storing up to a petabyte – that’s 1,000 terabytes or 1 million gigabytes – of data each. They would have cost hundreds of thousands of dollars.</p>
<p>While these devices are expensive, they allow IT staff to allocate storage efficiently and flexibly to where it is needed, and thus (in theory) can improve reliability.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/150253/original/image-20161215-30552-1q4tfpu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/150253/original/image-20161215-30552-1q4tfpu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/150253/original/image-20161215-30552-1q4tfpu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=434&fit=crop&dpr=1 600w, https://images.theconversation.com/files/150253/original/image-20161215-30552-1q4tfpu.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=434&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/150253/original/image-20161215-30552-1q4tfpu.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=434&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/150253/original/image-20161215-30552-1q4tfpu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=546&fit=crop&dpr=1 754w, https://images.theconversation.com/files/150253/original/image-20161215-30552-1q4tfpu.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=546&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/150253/original/image-20161215-30552-1q4tfpu.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=546&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Even Hewlett Packard Enterprise’s state of the art storage system was vulnerable to data corruption.</span>
<span class="attribution"><span class="source">Hewlett Packard Enterprise</span></span>
</figcaption>
</figure>
<h2>Multiple levels of redundancy, made redundant</h2>
<p>Entrusting so much of the IT operations of a large organisation like the ATO to a single storage server requires a high degree of confidence that it will function reliably. As such, a number of levels of redundancy are incorporated into this kind of storage system.</p>
<p>As a first protection against a failure of a single disk (or solid-state storage device), data are “mirrored” across multiple physical disks. If monitoring systems detect a failure, operations can fall back on the mirrored data. </p>
<p>The faulty disk can be replaced and the full mirror restored, all without interrupting user operations. High-end systems such as these also incorporate redundancy into their controller electronics. </p>
<p>However, if a major hardware failure occurs, such as a power failure that is not covered by a backup power supply, many such systems have a second level of redundancy. The entire contents of the SAN are “mirrored” to a second system, often in another physical location, and systems switch over to the backup automatically.</p>
<p><a href="http://www.itnews.com.au/news/hpe-storage-crash-killed-ato-online-services-444490">According to iTnews</a>, all of this redundancy was made moot by the nature of the problem: corrupted data were being written to the SAN for some reason, and these corrupted data were then mirrored to the backup SAN.</p>
<p>In this situation, all the redundancy within and between the SANs does not help, as the bad data were replicated across the entire system. This is why keeping traditional backup snapshots – copies of data as it previously existed in the system – is so important, regardless of any amount of mirroring.</p>
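<p>The difference between mirroring and snapshots can be sketched in a few lines of Python. This is a toy model only: the <code>MirroredStore</code> class and its method names are hypothetical illustrations, not a description of how the HPE device or the ATO's systems actually work.</p>

```python
import copy


class MirroredStore:
    """Toy model of a storage system with a live mirror and point-in-time snapshots.

    Hypothetical illustration only; real SANs are far more complex.
    """

    def __init__(self):
        self.primary = {}    # the main copy of the data
        self.mirror = {}     # the live replica, updated on every write
        self.snapshots = []  # independent copies of the data at earlier moments

    def write(self, key, value):
        # Mirroring copies every write, whether the value is good or corrupted.
        self.primary[key] = value
        self.mirror[key] = value

    def take_snapshot(self):
        # A snapshot is a separate copy that later writes cannot change.
        self.snapshots.append(copy.deepcopy(self.primary))

    def restore_latest_snapshot(self):
        if not self.snapshots:
            raise RuntimeError("no snapshot available to restore from")
        self.primary = copy.deepcopy(self.snapshots[-1])
        self.mirror = copy.deepcopy(self.snapshots[-1])


if __name__ == "__main__":
    store = MirroredStore()
    store.write("record", "valid")
    store.take_snapshot()
    store.write("record", "corrupted")   # the bad write reaches both copies
    print(store.primary, store.mirror)   # both copies now hold the corrupted value
    store.restore_latest_snapshot()
    print(store.primary, store.mirror)   # both copies are back to the snapshot value
```

<p>In this sketch the mirror protects against losing one copy, but only the snapshot protects against a bad write, because it is taken before the corruption and is never overwritten by it.</p>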
<p>The ATO appears to have comprehensive backups of the stored data; however, restoring all of it and returning the SANs to an operational configuration has had to be done manually. It is not surprising that this has taken several days to complete.</p>
<h2>Assessing the ATO’s response</h2>
<p>While it is tempting to pile on to another <a href="https://theconversation.com/au/topics/censusfail-30048">large-scale government IT failure</a>, a fair assessment should take into consideration the nature of the failure and the ATO’s response.</p>
<p>Firstly, it appears that the ATO heeded one of the key lessons from the Census website meltdown and communicated what was going on to the public effectively. It responded to the failures by providing <a href="https://twitter.com/ato_gov_au">informative updates on social media</a> and more comprehensive information on a <a href="http://lets-talk.ato.gov.au/ato-systems-update">functioning part of its website</a>.</p>
<p>Secondly, it appears that its backup strategy was sufficient to get all systems back up and running without data loss, despite a nearly worst-case failure in its primary storage system.</p>
<p>If its incident response can be criticised, it is on the grounds that services might have been restored much faster had more of the process been automated. However, this appears to be a highly unusual incident. </p>
<p>Restoring one set of application data due to corruption caused by the application itself is a relatively common situation. Restoring many different sets of data because of an apparent bug in the storage server is extremely rare.</p>
<p>Furthermore, while few people ever see them, SANs like this are very common devices in data centres. They provide a generic low-level storage service and are expected to provide it highly reliably. </p>
<p>Indeed, HPE markets its enterprise storage systems with a “<a href="http://www8.hp.com/au/en/products/data-storage/3par-6nines.html">99.9999% uptime guarantee</a>”, which requires that a device is non-operational for no more than about 31.5 seconds per year. </p>
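<p>The arithmetic behind a "six nines" figure is easy to check. The short Python sketch below converts an uptime percentage into a yearly downtime allowance; the function name is made up for illustration, and it assumes a 365-day year and that the guarantee is measured over a full year.</p>

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap, 365-day year


def allowed_downtime_seconds(uptime_percent: float) -> float:
    """Return the maximum downtime per year, in seconds, for a given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    downtime_fraction = 1 - uptime_percent / 100
    return downtime_fraction * SECONDS_PER_YEAR


if __name__ == "__main__":
    for uptime in (99.9, 99.99, 99.9999):
        seconds = allowed_downtime_seconds(uptime)
        print(f"{uptime}% uptime allows about {seconds:.1f} seconds of downtime per year")
```

<p>On these assumptions, 99.9999% uptime leaves a little over half a minute of downtime per year, which puts an outage lasting several days in perspective.</p>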
<p>Over the past few days, the IT staff at the Australian Tax Office have probably had a few sleepless nights. It’s likely that engineers at HPE will have a few more trying to get to the bottom of why their enterprise storage system seems to have failed so comprehensively.</p>
<p class="fine-print"><em><span>Robert Merkel does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>The ATO system crash was unusual, but it was handled as well as could be expected.Robert Merkel, Lecturer in Software Engineering, Monash UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/639972016-08-17T03:16:48Z2016-08-17T03:16:48ZForget the Census undercount, what matters is bias<figure><img src="https://images.theconversation.com/files/134355/original/image-20160816-13035-14psstg.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">If enough people from a particular group don't complete the Census, it can disrupt the data.</span> <span class="attribution"><span class="source">Shutterstock</span></span></figcaption></figure><p>It is fair to say the 2016 Census <a href="https://theconversation.com/abss-night-of-disaster-as-servers-crash-and-millions-fail-to-complete-the-census-63737">hasn’t quite</a> gone to plan. </p>
<p>Before Census night on August 9, there was a significant minority of Australians concerned about how <a href="https://theconversation.com/benefits-of-the-census-retaining-names-and-addresses-should-outweigh-privacy-fears-57223">names and addresses would be used</a>, including a number of high-profile <a href="http://www.abc.net.au/news/2016-08-09/senators-could-be-prosecuted-over-census-revolt-abs-says/7710750">members of parliament</a>. </p>
<p>Then, of course, there was the <a href="https://theconversation.com/root-of-census-failures-say-badly-done-ibm-and-abs-still-down-for-some-63845">night of the Census itself</a> and the now ubiquitous <a href="https://twitter.com/search?q=%23CensusFail&src=tyah">#censusfail</a>. </p>
<p>We won’t know for a while what the impact will be on the quality of the data. There is already <a href="http://www.theaustralian.com.au/national-affairs/security-officials-find-ibm-failings-in-census-collapse/news-story/acb1f2a36e7e715bd2cfdf750b113ea7">speculation</a> that the response rate might be below the expected 98.3%, with some preemptively calling into question the reliability of the data.</p>
<p>The <a href="http://www.skynews.com.au/news/top-stories/2016/08/15/abs-not-considering-backup-census-plan.html">message</a> from the Australian Bureau of Statistics (ABS) and government around response rates is as it should be though. There is still time to fill out the Census (either online or via paper), the data is still crucial, but the longer people leave it the less accurate the data will be.</p>
<p>Speculating on who will respond to the Census, and in what numbers, is a mug’s game. But it is useful to reflect on what the rate of undercount has been in the past, what the undercount might mean for decision making, and what can be done to adjust for it post-Census.</p>
<h2>The ghosts of undercounts past</h2>
<p>The most important thing to keep in mind throughout this period is that no Census has ever been perfect. There were no halcyon days when everyone filled out their Census on the allocated night, every form was filled out completely, honestly and accurately, and it was collected by ABS staff seamlessly and with no fuss. </p>
<p>In 2001, fresh out of university, I remember walking around the chilly Canberra suburbs as a Census collector. People then were confused about the point of the Census and how their data were to be used. </p>
<p>Some people were late, others were reluctant to hand over their form at all. Data from the 2001 Census ended up being crucial for policy debates over the intervening years.</p>
<p>But the response rate was not 100%. </p>
<p>Fast forward a decade and the previous Census, in 2011, also missed a large number of people. While the undercount was low nationally, at 1.7%, a key point is that the undercount is not distributed evenly across the population. </p>
<p>In 2011, those who were more likely to be missed were young males. The ABS estimated that 7.8% of males aged 20-24 years were missed from the Census. Indigenous Australians and certain country-of-birth cohorts, in particular China and India, were also over-represented in the undercount.</p>
<h2>How do we know who is missed?</h2>
<p>An obvious question to ask is: how do we know who is missed from the Census? As an outsider, it can appear that most of the activity for the Census occurs on the night itself. In terms of people filling out the form, that is certainly the case. But the ABS actually spends a lot of its efforts processing and evaluating the results. </p>
<p>A key part of that evaluation is the Post-Enumeration Survey (<a href="http://www.abs.gov.au/websitedbs/censushome.nsf/home/factsheetspes?opendocument&navpos=450">PES</a>). Undertaken by trained interviewers, <a href="http://www.abs.gov.au/AUSSTATS/abs@.nsf/ProductsbyCatalogue/9E2E16CFF2CF31A6CA2570A50083C371?OpenDocument">the PES is</a>: </p>
<blockquote>
<p>[…] run shortly after each Census, to provide an independent measure of Census coverage. The PES determines how many people should have been counted in the Census, how many were missed, and how many were counted more than once. It also provides information on the characteristics of those in the population who have been missed or overcounted. </p>
</blockquote>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/134356/original/image-20160816-13028-iie5ye.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/134356/original/image-20160816-13028-iie5ye.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/134356/original/image-20160816-13028-iie5ye.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=338&fit=crop&dpr=1 600w, https://images.theconversation.com/files/134356/original/image-20160816-13028-iie5ye.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=338&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/134356/original/image-20160816-13028-iie5ye.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=338&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/134356/original/image-20160816-13028-iie5ye.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=424&fit=crop&dpr=1 754w, https://images.theconversation.com/files/134356/original/image-20160816-13028-iie5ye.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=424&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/134356/original/image-20160816-13028-iie5ye.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=424&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption"></span>
<span class="attribution"><span class="source">ABS</span></span>
</figcaption>
</figure>
<h2>The implications of the undercount</h2>
<p>Why are we worried about people not filling out their Census form? Clearly if hardly anyone filled out the form, we’d be in a lot of trouble. But what about if only 75% of people did, or 90% or 95%? </p>
<p>There is no magic percentage above which the Census is useful and below which we should chuck it out and start again. What really matters are the <em>biases</em>. </p>
<p>In some ways, it would be better if the ABS randomly lost a large number of Census forms than if a much smaller, but non-random, proportion of the population decided not to fill out the form or, worse, intentionally gave incorrect information. </p>
<p>We can adjust for undercount, but bias is a bit harder. This is because the Census is not just used to count people; it is also used to measure their distribution. </p>
<p>If people from low socioeconomic backgrounds are missed, it appears that we are richer than we actually are. If kids are missed from the Census, then we are less likely to invest in the schools and day care centres we need. </p>
<p>If people who are highly mobile don’t fill out their form, we are more likely to think that Australia’s population is spatially stable. If Indigenous Australians are missed, it makes it harder to assess the effectiveness of our policies and target the resources Indigenous Australians need.</p>
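<p>The difference between random and non-random undercount can be illustrated with a small worked example. Every figure below is hypothetical and chosen only to make the arithmetic easy to follow: a population of 100 people in two income groups, with forms going missing in two different ways.</p>

```python
def mean_income(groups):
    """groups: list of (people_counted, income_per_person) pairs."""
    total_people = sum(count for count, _ in groups)
    total_income = sum(count * income for count, income in groups)
    return total_income / total_people


# Hypothetical population: 40 lower-income and 60 higher-income people.
true_population = [(40, 40_000), (60, 90_000)]

# Random loss: a quarter of all forms vanish, from both groups alike.
random_loss = [(30, 40_000), (45, 90_000)]

# Non-random loss: only four forms are missing overall,
# but every one of them belongs to the lower-income group.
biased_loss = [(36, 40_000), (60, 90_000)]

print(mean_income(true_population))  # 70000.0
print(mean_income(random_loss))      # 70000.0
print(mean_income(biased_loss))      # 71250.0
```

<p>In this toy case, losing 25% of forms at random leaves the average untouched, while losing just 4% from one group makes the population look richer than it is.</p>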
<p>If we care about these things, we should continue to encourage people to fill out their Census, using whatever mode they can. </p>
<p>It would be naive to suggest that response rates won’t be affected by the negative publicity and the difficulties some people had. But prematurely predicting response rates is not helpful. We as a society still need people to participate in order to plan, and to hold government to account.</p><img src="https://counter.theconversation.com/content/63997/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Nicholas Biddle was employed by the Australian Bureau of Statistics from 2001 to 2007. He no longer receives any funding from the ABS, but uses Census data for research and analysis</span></em></p>If the response rate to the 2016 Census is lower than expected, it could compromise our ability to draw meaningful information from the data.Nicholas Biddle, Fellow, Australian National UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/637552016-08-10T05:35:34Z2016-08-10T05:35:34ZDid the Census really suffer a denial-of-service ‘attack’?<figure><img src="https://images.theconversation.com/files/133630/original/image-20160810-11006-7mecti.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">What really caused the Census servers to crash?</span> <span class="attribution"><span class="source">Shutterstock</span></span></figcaption></figure><p>Last night, the Australian Bureau of Statistics (<a href="http://www.abs.gov.au/">ABS</a>) closed the <a href="https://stream10.census.abs.gov.au/eCensusWeb/welcome.jsp#top2">2016 Census</a> website. No explanation was given at the time, except for a message on the page saying “the system is very busy at the moment”.</p>
<p>This morning, the ABS’s head statistician, David Kalisch, <a href="http://www.abc.net.au/news/2016-08-10/australian-bureau-of-statistics-says-census-website-hacked/7712216">announced</a> that the site had been brought offline by four distributed denial-of-service (DDoS) attacks. </p>
<p>The minister responsible, Michael McCormack, later qualified these statements by stating the site was not “attacked”, per se. While this is a semantic quibble, it is accurate in the sense that a DDoS “attack” in itself is not an attempt to gain access or subvert information. </p>
<p>The prime minister’s cyber security advisor, Alastair MacGibbon, added that a number of <a href="http://www.smh.com.au/federal-politics/political-news/malcolm-turnbull-defends-handling-of-census-as-privacy-commissioner-investigates-20160810-gqp45u.html">technical issues</a> compounded the effects of the attack, including the <a href="http://www.abc.net.au/news/2016-08-10/census-night-how-the-shambles-unfolded/7712964">failure of the ABS’s geoblocking system</a> at around 7.30pm, which allowed the DDoS traffic to impact the ABS servers, hosted by IBM.</p>
<p>However, it has also been pointed out that the ABS may simply have been <a href="https://theconversation.com/census-website-cracks-after-malicious-attack-by-hackers-63734">unprepared</a> for the volume of traffic it received on census night.</p>
<p>So how plausible is the claim that the census was brought down by a DDoS attack?</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/133626/original/image-20160810-18037-108qewk.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/133626/original/image-20160810-18037-108qewk.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/133626/original/image-20160810-18037-108qewk.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=125&fit=crop&dpr=1 600w, https://images.theconversation.com/files/133626/original/image-20160810-18037-108qewk.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=125&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/133626/original/image-20160810-18037-108qewk.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=125&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/133626/original/image-20160810-18037-108qewk.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=157&fit=crop&dpr=1 754w, https://images.theconversation.com/files/133626/original/image-20160810-18037-108qewk.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=157&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/133626/original/image-20160810-18037-108qewk.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=157&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">This is all the information users were given on Census night.</span>
<span class="attribution"><span class="source">ABS</span></span>
</figcaption>
</figure>
<h2>Attacking availability</h2>
<p>Confidentiality, integrity and availability are the basic principles of information security. </p>
<p>Cyber attacks are commonly mounted against each of these principles, with the ABS claiming that its server availability was the target last night.</p>
<p>A conventional attack against availability is denial-of-service (DoS). A DoS attack occurs when a system (such as a website) is flooded with carefully crafted requests such that requests from legitimate users cannot be serviced, thus causing the “denial” of service. </p>
<p>A DDoS, or distributed DoS, occurs when many systems are used to perform a DoS attack on a target. This makes it harder to counter, as the server operator cannot simply block a single system on the internet that is sending all of the spurious requests. Thus a DDoS is like many ants bringing down an antelope by working together. </p>
<p>The systems that are used to carry out the attack might be home computers connected to the internet that are being used without the knowledge or consent of their owners. </p>
<p>This can happen when a user clicks on a link contained in an unsolicited email that appears to be from a genuine party that the user trusts. Such email can be very sophisticated and appear realistic, so it is easy to be tricked. </p>
<p>The link then downloads software that allows a third party to initiate a DoS attack remotely, using the unfortunate user’s computer. When the third party has enough computers under their control (known as “zombies”), they can launch a DDoS attack from afar. </p>
<p>DoS attacks were once solely the realm of experienced hackers with detailed knowledge of the inner workings of the connected computer systems. Recently, the resources needed to perform a DoS attack have been made readily available on the internet, so people with little knowledge of the technicalities can perform an attack. Such attacks are now available anonymously as a “service”, much as many businesses use cloud services for computing power or data storage. </p>
<p>Therefore, this capability is available to a range of potential attackers, from lone-wolf disgruntled individuals, to activists, to interest groups and even nation states.</p>
<p>Websites are attacked every day. However, cyber security professionals already use a range of techniques to prevent or minimise such attacks.</p>
<p>One such technique is geoblocking, which prevents traffic from overseas from reaching the server. And it was the geoblocking system that apparently failed last night, allowing the DDoS to hit home.</p>
<h2>Was it a DDoS?</h2>
<p>The census servers were not actually hosted by the ABS but by <a href="http://www.itnews.com.au/news/ibm-wins-96m-to-host-ecensus-in-2016-397613">IBM</a>, a company with extensive experience of running server networks.</p>
<p>The ABS also spent around A$470,000 load-testing its census servers in anticipation of census night. It claimed to have tested the system to <a href="http://www.news.com.au/finance/census-australia-2016-will-first-digital-census-repeat-the-click-frenzy-crash/news-story/e8c098b8e09706452583f8fae163f7f2#itm=newscomau%7Cfinance%7Cright-now-in-%7C1%7CCensus%20Australia%202016%3A%20Online%20system%20won%E2%80%99t%20crash%2C%20ABS%20says%7Cstory%7CThe%20answer%20to%20your%20Census%20panic&itmt=1470704784819">150%</a> of the expected load, saying that it could handle <a href="https://twitter.com/ABSCensus/status/755588601656725505">1 million form submissions per hour</a> – twice what the ABS expected it would need.</p>
<p>However, that might have underestimated the kind of load the servers should have expected. </p>
<p>Consider that there were <a href="http://www.abs.gov.au/ausstats/abs@.nsf/mf/8153.0/">12.9 million internet subscribers</a> in Australia at the end of 2015 (according to ABS figures, no less). </p>
<p>If each of these represents a household (a reasonable assumption, given that 99.3% of internet connections are broadband) and 2 million of these households accessed the census system during the day, this leaves a potential 10.9 million households attempting to reach the census servers in the evening.</p>
<p>If only half of those households actually attempted to fill out their census form last night, that still would have exceeded the ABS’s anticipated submission rate.</p>
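<p>The back-of-the-envelope estimate above can be written out explicitly. The subscriber, daytime and capacity figures come from the text; the four-hour evening window is an assumption made purely for illustration, as are the variable names.</p>

```python
INTERNET_SUBSCRIBERS = 12_900_000     # ABS figure, treated as one per household
DAYTIME_HOUSEHOLDS = 2_000_000        # households assumed to have submitted earlier
STATED_CAPACITY_PER_HOUR = 1_000_000  # capacity claimed by the ABS
EXPECTED_PER_HOUR = 500_000           # half the claimed capacity

EVENING_HOURS = 4  # assumption: submissions spread evenly over four hours

remaining_households = INTERNET_SUBSCRIBERS - DAYTIME_HOUSEHOLDS
evening_attempts = remaining_households // 2  # "only half" of those households
submissions_per_hour = evening_attempts / EVENING_HOURS

print(remaining_households)   # 10900000
print(evening_attempts)       # 5450000
print(submissions_per_hour)   # 1362500.0
print(submissions_per_hour / EXPECTED_PER_HOUR)         # 2.725
print(submissions_per_hour / STATED_CAPACITY_PER_HOUR)  # 1.3625
```

<p>Under these assumptions the evening load would be more than two and a half times the anticipated rate, and above the stated capacity as well. A shorter window would make the shortfall worse.</p>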
<p>There is also the issue of how it conducted its load-testing, and whether it worked around average numbers per hour or considered <a href="https://theconversation.com/drowning-by-averages-did-the-abs-miscalculate-the-census-load-63752">peaks in activity</a>.</p>
<p>While the ABS may have attempted to anticipate the traffic on census night, there are indications that it didn’t consider all of the possible bottlenecks. Security journalist Patrick Gray also <a href="https://mobile.twitter.com/riskybusiness/status/763189895292555264">quotes a security professional’s analysis</a> of some of these bottlenecks.</p>
<p>There is also no evidence – besides the claims of the ABS and Minister McCormack – that the census servers suffered a DDoS. One website that tracks DDoS attacks globally showed <a href="http://www.digitalattackmap.com/#anim=1&color=0&country=ALL&list=0&time=17022&view=map">no unusual activity</a> in Australia around the time of the census, although such websites are not 100% accurate.</p>
<p>So while it’s possible that the census servers did suffer a DDoS attack, the evidence that it actually happened is inconclusive. </p>
<p>However, if the servers were already struggling under the load caused by Australians filling out their census forms, then even a weak DDoS could have been sufficient to tip it over the edge.</p>
<p>This leaves us with three possible scenarios: </p>
<p>1) a DDoS attack caused the problem;</p>
<p>2) too many users overloaded the system; or</p>
<p>3) a combination of both.</p>
<p>Perhaps we should apply <a href="https://en.wikipedia.org/wiki/Occam%27s_razor">Occam’s razor</a> and look for the simplest explanation. This would suggest that if the Census servers could plausibly have failed under the weight of their task alone, then that is a more likely explanation than a deliberate DDoS attack.</p><img src="https://counter.theconversation.com/content/63755/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Mike Johnstone received funding from the European Union under the Framework Programme 7 grant scheme. </span></em></p>The evidence the Census servers suffered a DDoS attack is weak. A simpler explanation is that they buckled under load of Australians filling out their Census forms as asked.Mike Johnstone, Security Researcher, Senior Lecturer in Software Engineering, Edith Cowan UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/637522016-08-10T02:39:54Z2016-08-10T02:39:54ZDrowning by averages: did the ABS miscalculate the Census load?<figure><img src="https://images.theconversation.com/files/133600/original/image-20160810-9267-866pk4.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">If you only consider average depth, you could drown at the deepest point.</span> <span class="attribution"><span class="source">Shutterstock</span></span></figcaption></figure><p>There’s an old parable used in introductory statistics classes to illustrate how an average can be misleading when maximum values are of interest. The parable is of a person who drowns while walking across a river. </p>
<p>The person can’t swim but is not concerned because the average depth of the river is only 20cm. The problem is the <em>average</em> depth of the river is not useful information here; what is needed is information about the <em>maximum</em> depth so that they don’t end up over their head. </p>
<p>The river might well be only 20cm deep <em>on average</em> but several metres deep in the middle. As with river crossings, so too with network loads.</p>
<p>While the precise reason for the meltdown of the Australian Bureau of Statistics (<a href="http://www.abs.gov.au/">ABS</a>) online census system last night <a href="https://theconversation.com/census-website-cracks-after-malicious-attack-by-hackers-63734">remains unclear</a>, there is a lesson to be learned about load testing. </p>
<p>Prior to the census date of Tuesday, August 9, the ABS announced that there was no danger of the system being unable to handle the load on census night. Why? Because it had tested the system. </p>
<p>Or, rather, the ABS paid a <a href="http://eftm.com.au/2016/08/census-2016-the-10-million-online-census-what-went-wrong-30681">considerable sum of money</a> to an external party to test the system. Load testing is performed to some given specifications and here we find what could be a serious problem in the ABS testing procedure.</p>
<h2>Averages</h2>
<p>In order to reassure the public, who were growing nervous about the new online census, the ABS made the <a href="https://twitter.com/ABSCensus/status/755588601656725505">following statement</a>:</p>
<blockquote>
<p>The online Census form can handle 1,000,000 form submissions every hour. That’s twice the capacity we expect to need.</p>
</blockquote>
<p>From this statement, it seems the ABS load-tested for 1 million submissions per hour, while expecting 0.5 million per hour. But there are between 9 and 10 million households in Australia, and the ABS was expecting around 15 million census submissions in total, with <a href="http://www.news.com.au/finance/census-australia-2016-will-first-digital-census-repeat-the-click-frenzy-crash/news-story/e8c098b8e09706452583f8fae163f7f2">65% submitted online</a>. </p>
<p>Of course, not all these submissions would come on August 9, but most would. Moreover, the vast majority of these submissions would be expected to come in the peak-traffic time of early evening (between around 6pm and 10pm AEST).</p>
<p>The ABS’s expected load of 0.5 million submissions per hour only makes sense as an average load across a large part of the day. For example, if there were 0.5 million submissions per hour evenly spread across 12 hours on August 9, that would give us 6 million submissions for this period. </p>
<p>But it is clear that load would not be spread evenly. And, to stress the obvious, it is the peak load that we’re interested in. Any reasonable estimate of the peak load for the early evening period is in the vicinity of several million per hour. </p>
<p>Worse still, there is no reason to expect the load to be evenly spread within this period. It is not beyond the realms of plausibility that 3 or 4 million people would be trying to log on to the system at, say, precisely 7.10pm. </p>
<p>Of course, all of this is consistent with an average load of 0.5 million submissions per hour for August 9. But from what the ABS has said, it is not clear that it tested for such peaks.</p>
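<p>A short sketch makes the gap between average and peak concrete. The hourly profile below is entirely hypothetical: it is simply one way of spreading 6 million submissions over 12 hours so that most arrive in the early evening. Only the 1 million per hour capacity figure comes from the ABS statement.</p>

```python
TESTED_CAPACITY = 1_000_000  # submissions per hour, from the ABS statement

# Hypothetical submissions for each of 12 consecutive hours on census day,
# ending in the evening. The shape is an assumption for illustration only.
hourly_submissions = [
    100_000, 100_000, 100_000, 150_000, 150_000, 200_000,
    300_000, 500_000, 900_000, 2_000_000, 1_000_000, 500_000,
]

total = sum(hourly_submissions)
average = total / len(hourly_submissions)
peak = max(hourly_submissions)
hours_over_capacity = sum(1 for load in hourly_submissions if load > TESTED_CAPACITY)

print(total)                # 6000000
print(average)              # 500000.0
print(peak)                 # 2000000
print(hours_over_capacity)  # 1
```

<p>The average here is exactly 0.5 million per hour, comfortably under the tested capacity, yet the busiest hour carries twice what the system was said to handle.</p>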
<h2>ABS up to its neck</h2>
<p>So we should be careful not to take averages too seriously. As any statistician knows, an average is one (very crude) way of summarising data. </p>
<p>Other summaries include information about the most frequent data (mode), the middle of the data (median) and the spread of the data (variance). </p>
<p>To take the average too seriously in some settings, such as in the river-crossing parable and calculating network loads, is tantamount to confusing the <em>average</em> with the <em>peak</em> (i.e. to take the river to be uniformly 20cm deep or the census submission rate to be uniformly 0.5 million per hour).</p>
<p>It might seem uncharitable to suggest that such an elementary statistical mistake lies behind the ABS website problems last night – especially when talking about an organisation filled with statisticians. </p>
<p>The ABS’s story this morning is that it deliberately shut down the system to protect it from a number of distributed denial-of-service (DDoS) attacks. This is like the river crossing being hit by a flash flood at the crucial time. </p>
<p>But there is good reason to suspect that even without such DDoS attacks, the system was in serious danger of being overloaded. This means even a small rise in the water level, as it were, could have been enough to cause a catastrophic failure.</p>
<p>Our intrepid river crosser may in fact have been drowned by an unexpected flash flood. But given their failure to recognise the limitations of averages as statistical summaries, they were in trouble the moment they dipped their toe in the water.</p><img src="https://counter.theconversation.com/content/63752/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Mark Colyvan receives funding from the Australian Research Council.</span></em></p>Even without a DDoS attack, the 2016 Census may have failed due to the ABS making a rudimentary statistical error.Mark Colyvan, Professor of Philosophy, University of SydneyLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/637342016-08-09T23:00:57Z2016-08-09T23:00:57ZCensus website cracks after ‘malicious’ attack by hackers<figure><img src="https://images.theconversation.com/files/133588/original/image-20160809-9203-qsfbil.png?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">This is the screen that greeted many Australians on Census night, 9 August 2016.</span> <span class="attribution"><span class="source">ABS</span></span></figcaption></figure><p>Many Australians were unable to complete the Census on August 9 due to the <a href="http://census.abs.gov.au">Census website</a> failing.</p>
<p>The Australian Bureau of Statistics (<a href="http://www.abs.gov.au/">ABS</a>) chief statistician has <a href="http://www.theage.com.au/national/census-website-attacked-by-hackers-abs-claims-20160809-gqouum.html">blamed a deliberate</a> “denial of service attack” for the failure. </p>
<blockquote>
<p>The first three [attacks] caused minor disruption, but more than two million forms were successfully submitted and safely stored.</p>
<p>After the fourth attack, which took place just after 7.30pm, the ABS took the precaution of closing down the system to ensure the integrity of the data.</p>
</blockquote>
<p>Like many government information systems, the Census site was <a href="http://www.itnews.com.au/news/ibm-wins-96m-to-host-ecensus-in-2016-397613">outsourced</a> to an external contractor: IBM. As well as writing the software required for the website, IBM was responsible for providing the computers that hosted it. </p>
<p>All of this is routine for IT projects, both government and commercial. And while reasonably large, the legitimate traffic generated by the Census is dwarfed by the traffic on websites like Google, Facebook and even the nonprofit Wikipedia.</p>
<h2>Denial-of-service attacks</h2>
<p><a href="https://www.us-cert.gov/ncas/tips/ST04-015">Denial-of-service attacks</a> are deliberate attempts to render a computing service unavailable.</p>
<p>Such an attack can be performed in many ways, including interfering with physical infrastructure. However, the most common denial-of-service technique used against publicly available websites is to overwhelm them with huge numbers of requests, overloading the servers and crowding out legitimate users.</p>
<p>Typically, the requests come from “<a href="https://theconversation.com/zombie-computers-cyber-security-phishing-what-you-need-to-know-1671">botnets</a>”, which are large groups of computers – often home PCs or other poorly-defended devices – that have been taken over by hackers and are then misused for “distributed” denial-of-service attacks (DDoS attacks). DDoS attacks have been used by activist hackers, cybercriminals and even state-sponsored hackers.</p>
<p>While the controversy surrounding the <a href="https://theconversation.com/censusfail-the-abs-hasnt-convinced-the-public-their-privacy-is-protected-63702">privacy implications</a> of the 2016 Census may not have been anticipated by the ABS, a denial-of-service attack against the Census infrastructure was always possible and should have been anticipated – especially a DDoS launched by privacy activists.</p>
<p>There are a number of ways in which the dangers of a DDoS can be mitigated. It is unknown at this point what measures the ABS and its contractors took to prepare for the possibility.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/133589/original/image-20160809-18053-b9c3jk.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/133589/original/image-20160809-18053-b9c3jk.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/133589/original/image-20160809-18053-b9c3jk.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=125&fit=crop&dpr=1 600w, https://images.theconversation.com/files/133589/original/image-20160809-18053-b9c3jk.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=125&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/133589/original/image-20160809-18053-b9c3jk.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=125&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/133589/original/image-20160809-18053-b9c3jk.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=157&fit=crop&dpr=1 754w, https://images.theconversation.com/files/133589/original/image-20160809-18053-b9c3jk.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=157&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/133589/original/image-20160809-18053-b9c3jk.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=157&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption"></span>
<span class="attribution"><span class="source">ABS</span></span>
</figcaption>
</figure>
<h2>Poor capacity planning?</h2>
<p>From the perspective of the computers straining under the load, a DDoS attack is indistinguishable from a larger-than-expected number of users attempting to access the system at once.</p>
<p>The public statements of the ABS before Census night cast some doubt on whether the system was adequate to cope with even legitimate demand.</p>
<p>The head of the ABS’s Census program, Chris Libreri, had earlier <a href="http://www.abc.net.au/news/2016-08-09/abs-website-inaccessible-on-census-night/7711652">claimed</a> that its systems had been tested to cope with the load of actual Census submissions:</p>
<blockquote>
<p>We have load tested it at 150% of the number of people we think are going to be on it on Tuesday for eight hours straight and it didn’t look like flinching.</p>
</blockquote>
<p>The ABS stated that its website was designed to handle <a href="http://www.smh.com.au/business/consumer-affairs/census-2016-chaos-for-australians-ahead-of-august-9-20160802-gqizw5.html">1,000,000 form submissions per hour</a>. However, around 18 million Australians live in the eastern states, which equates to about 7 million households.</p>
<p>If even 50% of those households attempted to submit their census during the evening hours from 7pm to 9pm, that would equate to 1.75 million form submissions per hour, 75% more than the reported capacity of the site.</p>
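<p>The estimate is sensitive to the share of households assumed to submit during the two-hour window. The sketch below repeats the calculation for three shares. The 7 million households, the 7pm to 9pm window and the 1 million per hour capacity come from the text; the 25% and 75% scenarios and the function name are added here purely for comparison.</p>

```python
EASTERN_HOUSEHOLDS = 7_000_000
WINDOW_HOURS = 2                # 7pm to 9pm
CAPACITY_PER_HOUR = 1_000_000   # reported capacity of the site


def hourly_rate(share_of_households: float) -> float:
    """Submissions per hour if this share of households submits within the window."""
    return EASTERN_HOUSEHOLDS * share_of_households / WINDOW_HOURS


# Map each assumed share to the resulting hourly submission rate.
scenarios = {share: hourly_rate(share) for share in (0.25, 0.5, 0.75)}

for share, rate in scenarios.items():
    excess_percent = (rate / CAPACITY_PER_HOUR - 1) * 100
    print(f"{share:.0%} of households: {rate:,.0f} per hour ({excess_percent:+.1f}% vs capacity)")
```

<p>Only the lowest scenario stays under the reported capacity, and then only if submissions arrive perfectly evenly across the two hours.</p>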
<p>Furthermore, it’s unlikely that traffic would be uniform within that time period. “Spikes” in traffic – perhaps after popular television shows ended – could potentially have overloaded the infrastructure even further.</p>
<p>It seems almost incredible that the team responsible for the contracting would collectively make such an error in their capacity estimates. </p>
<p>Regardless of the details of the attack, and whether other aspects of planning were inadequate, the Census failure will go down as another example of a failed “Big Bang deployment”. </p>
<p>A Big Bang occurs when an IT system is deployed on a large scale, all at once, and is required to work first time. The US <a href="http://www.nbcnews.com/storyline/obamacare-deadline/obamacare-website-fails-deadline-arrives-n67666">healthcare.gov website</a>, the <a href="http://www.smh.com.au/it-pro/government-it/worst-failure-of-public-administration-in-this-nation-payroll-system-20130806-hv1cw.html">Queensland Health payroll system</a> that failed so spectacularly in 2010, and even <a href="http://thenewdaily.com.au/sport/rio-olympics-2016/2016/08/08/rio-olympics-2016-channel-7/">Channel 7’s Olympics app</a> are examples of such all-at-once rollouts running into difficulty.</p>
<p>The lessons for proponents of online voting should be clear.</p><img src="https://counter.theconversation.com/content/63734/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Robert has donated to and volunteered for the Australian Greens.</span></em></p>Despite assuring Australians its systems were load tested and secure, the Census site went offline at a crucial time. Could the ABS have avoided such an embarrassing failure?Robert Merkel, Lecturer in Software Engineering, Monash UniversityLicensed as Creative Commons – attribution, no derivatives.