<p>Data – La Conversation</p>
<h1>Technology to protect South Africa’s oceans: experts find that a data-driven monitoring system is paying off</h1>
<p>Published: 2024-03-11T13:07:21Z</p>
<figure><img src="https://images.theconversation.com/files/577893/original/file-20240226-24-qjmkpc.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">A fishing boat launching into South African waters at dawn.</span> <span class="attribution"><span class="source">Justin Klusener Photos</span></span></figcaption></figure><p>Nine years ago South Africa put in place an innovative information management system designed to monitor and protect its seas. The country is surrounded by the Atlantic and Indian oceans on its southern, eastern and western borders. </p>
<p>The oceans are an <a href="https://www.tandfonline.com/doi/abs/10.1080/19480881.2015.1066555">important source of income and employment</a>. The ocean economy <a href="https://www.dffe.gov.za/sites/default/files/docs/publications/oceans-economy-summary-progress-report-June2019.pdf">contributed about R110 billion</a> (around US$5.7 billion) to South Africa’s GDP in 2010. A 2019 government report <a href="https://www.dffe.gov.za/sites/default/files/docs/publications/oceans-economy-summary-progress-report-June2019.pdf">projected</a> that, by 2033, this would rise to R177 billion (US$9.2 billion), as well as creating just over one million jobs. The main sectors in ocean industries are maritime transport, fisheries and aquaculture, mineral resource exploitation and tourism. The potential for economic growth is also reflected in the country’s <a href="https://www.gov.za/sites/default/files/gcis_document/201706/saoceaneconomya.pdf">Operation Phakisa Oceans Economy plan</a>.</p>
<p>But, while the sheer extent of its maritime domain presents many opportunities, it also comes with governance challenges. It’s hard to monitor and plan for ocean-related economic development and conservation.</p>
<p>That’s where the National Oceans and Coastal Information Management System (<a href="https://ocims.environment.gov.za/About.html">OCIMS</a>) comes in. It was conceptualised within the country’s Department of Forestry, Fisheries and the Environment in 2012 and officially launched in 2015 in partnership with the Council for Scientific and Industrial Research (CSIR). </p>
<p>While the system is tailored to South Africa’s national priorities, it was inspired by other mature ocean information systems around the world, such as those in <a href="https://imos.org.au/">Australia</a> and the <a href="https://coastwatch.noaa.gov/cwn/index.html">US</a>.</p>
<p>The system brings ocean observations made by various national agencies into one platform. The major users are also partners who contribute to the system by sharing data and expertise.</p>
<p>For example, data capture apps on the system are used to share measurements made on aquaculture farms and to inform users of the potential risk of <a href="https://oceanservice.noaa.gov/facts/redtide.html">red tides</a> (a common name for harmful algal blooms). Boat-based whale watching operators contribute their marine species sightings data towards biodiversity assessments. All this data can be analysed by scientists and their findings used to advise on policy options or compliance and enforcement actions.</p>
<p>In a <a href="https://doi.org/10.1016/j.jenvman.2024.120255">recent paper</a> we looked at how the system emerged and why it’s been important for the protection of the country’s oceans. We found that it was providing value for money: it helped mitigate environmental or security risks, resulting in significant cost savings for the public and private sectors. It also promoted dialogue across government departments, non-profit organisations and the private sector. This facilitates a coordinated approach to ocean governance.</p>
<p>The approach taken to establish the system could benefit other countries looking to build their own ocean and coastal system knowledge platforms.</p>
<h2>Data-driven</h2>
<p>As the COVID pandemic demonstrated, informed decisions cannot occur without access to data. Historical and operational data provides situational awareness, informs policy and supports long-term planning and management.</p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/how-african-countries-can-harness-the-huge-potential-of-their-oceans-77889">How African countries can harness the huge potential of their oceans</a>
</strong>
</em>
</p>
<hr>
<p>To this end, the Department of Forestry, Fisheries and the Environment, working with the <a href="https://www.saeon.ac.za/">South African Environmental Observations Network</a>, created the <a href="https://data.ocean.gov.za/">Marine Information Management System</a>. It’s an essential component of OCIMS. It preserves long-term data and makes that data discoverable and widely available. It is internationally accredited and follows international standards and best practices. </p>
<p>The system also makes data more accessible by providing <a href="https://ica-abs.copernicus.org/articles/6/275/2023/ica-abs-6-275-2023.pdf">user-specific data capture applications</a>, complemented by data visualisation platforms such as webmaps and dashboards. </p>
<h2>Supporting decisions</h2>
<p>Another of the system’s aims is to provide tools for supporting decisions. Such tools can be used for coordination and response (for example, monitoring <a href="https://ica-abs.copernicus.org/articles/6/275/2023/ica-abs-6-275-2023.pdf">avian influenza</a>). They can also be used in compliance and enforcement initiatives, such as tracking vessels.</p>
<p>The Fisheries and Aquaculture tool, for instance, supports both the public and private sectors by providing warnings on potentially harmful algal blooms, a phenomenon that can threaten aquaculture farms or affect fish and lobster populations. It detects algal blooms through satellite observations; this satellite data is complemented by information from those in the field, combining to create an active, interactive decision-making tool.</p>
<p>Then there’s the Integrated Vessel Tracking tool. It monitors vessels’ movements and is used daily by the institutions mandated to enforce security at sea, such as intelligence services and the navy, to detect or intercept illegal activities at sea. <a href="https://issafrica.org/research/books-and-other-publications/south-africas-maritime-domain-awareness-a-capability-baseline-assessment">Researchers say</a> the tool has worked to prevent illegal fishing and marine pollution. It’s also been instrumental in the interception of drug-loaded vessels.</p>
<h2>Collaboration</h2>
<p>All of these successes have been made possible by secure, sustained funding from the South African government. That has instilled a sense of security in collaborators and partners; they provide invaluable co-funding, expertise and data, saving money and building resilience into the system.</p>
<p>Some of the system’s tools have been shared with other countries in the <a href="https://marcosio.org/">southern African</a> and Indian Ocean regions. </p>
<p>As the project’s visibility increases, new opportunities for collaborations are emerging. Government departments, non-profit organisations and the private sector are coming forward with offers to share data. Academic scientists are also including the system in their research proposals.</p>
<p>One of the main lessons emerging from our research, which may be of interest to other countries wanting to launch similar initiatives, is that it’s crucial to involve a system’s major users in development from the start. Formalised stakeholder interactions ensure that the system directly responds to major user needs. That makes it immediately relevant and useful.</p>
<p class="fine-print"><em><span>Marjolaine Krug works for the South African Department of Forestry, Fisheries and the Environment, Oceans and Coast Branch.
OCIMS is funded through the Operation Phakisa Marine Protection Services and Ocean Governance workstream and is a partnership between the Department of Forestry, Fisheries and the Environment, the Department of Science and Innovation, the Council for Scientific and Industrial Research, the South African Environmental Observation Network and the South African Weather Service. </span></em></p><p class="fine-print"><em><span>Ashley Naidoo was the Chief Director for the Oceans and Coasts Science Programs at the Department of Forestry, Fisheries and the Environment until January 2024.</span></em></p><p class="fine-print"><em><span>Lauren Williams works for the Department of Forestry, Fisheries and the Environment (South Africa). She has been involved in the development of the Oceans and Coastal Information Management System (OCIMS) since its inception. </span></em></p>
<p>South Africa’s ocean information management system is helping to mitigate security and environmental risks.</p>
<p>Marjolaine Krug, Senior Scientific Advisor, University of Cape Town</p>
<p>Licensed as Creative Commons – attribution, no derivatives.</p>
<hr>
<h1>The ‘average’ revolutionized scientific research, but overreliance on it has led to discrimination and injury</h1>
<p>Published: 2024-03-01T13:40:34Z</p>
<figure><img src="https://images.theconversation.com/files/578352/original/file-20240227-22-rs4i9u.jpg?ixlib=rb-1.1.0&rect=0%2C0%2C5591%2C3722&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">The average can tell you a lot about a dataset, but not everything. </span> <span class="attribution"><a class="source" href="https://www.gettyimages.com/photos/bell-curve?assettype=image&alloweduse=availableforalluses&agreements=pa%3A174132&family=creative&phrase=bell%20curve&sort=best">marekuliasz/iStock via Getty Images Plus</a></span></figcaption></figure><p>When analyzing a set of data, one of the first steps many people take is to compute an average. 
You might compare your height against the average height of people where you live, or brag about your favorite baseball player’s batting average. But while the average can help you study a dataset, it has important limitations. </p>
<p>Uses of the average that ignore these limitations have led to serious issues, such as <a href="https://www.routledge.com/The-Disability-Studies-Reader/Davis/p/book/9781138930230">discrimination</a>, <a href="https://www.gao.gov/products/gao-23-105595">injury</a> and even life-threatening accidents. </p>
<p>For example, the U.S. Air Force used to design its planes for “the average man,” but abandoned the practice when pilots <a href="https://www.youtube.com/watch?v=4eBmyttcfU4&pp=ygURdG9kZCByb3NlIGF2ZXJhZ2U%3D">couldn’t control their aircraft</a>. The average has many uses, but it doesn’t tell you anything about the variability in a dataset.</p>
<p>I am a <a href="https://scholar.google.com/citations?user=zEYYuIcAAAAJ&hl=en">discipline-specific education researcher</a>, meaning I study how people learn, with a focus on engineering. My research includes study of how engineers use averages in their work.</p>
<h2>Using the average to summarize data</h2>
<p>The average has been around for a long time, with its use documented as early as the ninth or eighth century BCE. In an early instance, the Greek poet Homer <a href="https://www.penguinrandomhouse.com/books/292278/the-history-of-the-peloponnesian-war-by-thucydides-translated-by-rex-warner-introduction-and-notes-by-m-i-finley/">estimated the number of soldiers</a> on ships by taking an average.</p>
<p>Early astronomers wanted to predict future locations of stars. But to make these predictions, they first needed accurate measurements of the stars’ current positions. Multiple astronomers would take position measurements independently, but they often arrived at different values. Since a star has just one true position, these discrepancies were a problem.</p>
<p>Galileo in 1632 was the <a href="https://doi.org/10.1080/0025570X.2006.11953386">first to push for a systematic approach</a> to address these measurement differences. His analysis was the beginning of <a href="https://press.princeton.edu/books/paperback/9780691208428/the-rise-of-statistical-thinking-1820-1900">error theory</a>. Error theory helps scientists reduce uncertainty in their measurements.</p>
<h2>Error theory and the average</h2>
<p>Under error theory, researchers interpret a set of measurements as falling around a true value that is corrupted by error. In astronomy, a star has a true location, but early astronomers may have had unsteady hands, blurry telescope images and bad weather – all sources of error.</p>
<p>To deal with error, researchers often assume that measurements are unbiased. In statistics, this means they evenly distribute around a central value. Unbiased measurements still have error, but they can be combined to better estimate the true value.</p>
<p>Say three scientists have each taken three measurements. Viewed separately, their measurements may seem random, but when unbiased measurements are put together, they evenly distribute around a middle value: the average.</p>
<p>When measurements are unbiased, the average will tend to sit in the middle of all measurements. In fact, we can show mathematically that <a href="https://doi.org/10.1080/0025570X.2006.11953386">the average is closest</a> to all the measurements taken together: it minimizes the sum of squared distances to them. For this reason, the average is an excellent tool for dealing with measurement errors.</p>
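<p>The claim that the average sits closest to a set of measurements can be checked numerically. The sketch below is only an illustration: the nine readings are hypothetical values standing in for three observers taking three measurements each, and "closest" is taken to mean the smallest sum of squared distances.</p>

```python
from statistics import mean


def sum_squared_error(estimate, measurements):
    """Total squared distance from one candidate estimate to every measurement."""
    return sum((m - estimate) ** 2 for m in measurements)


# Hypothetical star-position readings (arbitrary units): three observers,
# three measurements each. These values are made up for illustration.
measurements = [10.2, 9.7, 10.4, 9.9, 10.1, 9.8, 10.3, 10.0, 9.6]

average = mean(measurements)

# Try a grid of candidate estimates from 9.00 to 11.00 in steps of 0.01
# and keep the one with the smallest total squared distance.
candidates = [9.0 + 0.01 * i for i in range(201)]
best = min(candidates, key=lambda c: sum_squared_error(c, measurements))

print(f"average        = {average:.2f}")
print(f"best candidate = {best:.2f}")
```

<p>On this made-up data, the candidate with the smallest squared error lands on the average, which is what the mathematical result predicts.</p>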
<h2>Statistical thinking</h2>
<p>Error theory was, in its time, considered revolutionary. Other scientists admired the precision of astronomy and sought to bring the same approach to their disciplines. The 19th century scientist Adolphe Quetelet applied ideas from error theory to study humans and <a href="https://press.princeton.edu/books/paperback/9780691208428/the-rise-of-statistical-thinking-1820-1900">introduced the idea</a> of taking averages of human heights and weights.</p>
<p>The average helps make comparisons across groups. For instance, taking averages from a dataset of male and female heights can show that the males in the dataset are taller – on average – than the females. However, the average does not tell us everything. In the same dataset, we could likely find individual females who are taller than individual males.</p>
<p>So, you can’t consider only the average. You should also consider the spread of values by thinking statistically. <a href="https://doi.org/10.1111/j.1751-5823.1999.tb00442.x">Statistical thinking</a> is defined as thinking carefully about variation – or the tendency of measured values to be different.</p>
<p>Different astronomers taking measurements of the same star and recording different positions is one example of variation. The astronomers had to think carefully about where their variation came from. Since a star has one true position, they could safely assume their variation was due to error.</p>
<p>Taking the average of measurements makes sense when variation comes from sources of error. But researchers have to be careful when interpreting the average when there is real variation. For instance, in the height example, individual females can be taller than individual males, even if males are taller on average. Focusing on the average alone <a href="https://doi.org/10.1080/26939169.2024.2308119">neglects variation</a>, which has caused serious issues.</p>
<p>Quetelet did not just take the practice of computing averages from error theory. He also took the assumption of a single true value. He elevated an ideal of “the average man” and suggested that <a href="https://press.princeton.edu/books/paperback/9780691208428/the-rise-of-statistical-thinking-1820-1900">human variability was fundamentally error</a> – that is, not ideal. To Quetelet, there’s something wrong with you if you’re not exactly average height.</p>
<p>Researchers who study <a href="https://www.routledge.com/The-Disability-Studies-Reader/Davis/p/book/9781138930230">social norms</a> note that Quetelet’s ideas about “the average man” contributed to the modern meaning of the word “normal” – normal height, as well as normal behavior.</p>
<p>These ideas have been used by some, such as <a href="https://theconversation.com/francis-galton-pioneered-scientific-advances-in-many-fields-but-also-founded-the-racist-pseudoscience-of-eugenics-144465">early statisticians</a>, to divide populations in two: people who are in some way superior and those who are inferior. </p>
<p>For instance, the <a href="https://www.genome.gov/about-genomics/fact-sheets/Eugenics-and-Scientific-Racism">eugenics movement</a> – a despicable effort to prevent “inferior” people from having children – <a href="https://www.routledge.com/The-Disability-Studies-Reader/Davis/p/book/9781138930230">traces its thinking</a> to these ideas about “normal” people.</p>
<p>While Quetelet’s idea of variation as error <a href="https://doi.org/10.1080/15017410600608491">supports practices of discrimination</a>, Quetelet-like uses of the average also have direct connections to modern engineering failures.</p>
<h2>Failures of the average</h2>
<p>In the 1950s, the U.S. Air Force designed its aircraft for “the average man.” It assumed that a plane designed for an average height, average arm length and the average along several other key dimensions <a href="https://www.youtube.com/watch?v=4eBmyttcfU4&pp=ygURdG9kZCByb3NlIGF2ZXJhZ2U%3D">would work for most pilots</a>.</p>
<p>This decision contributed to as many as <a href="http://www.toddrose.com/endofaverage">17 pilots crashing in a single day</a>. While “the average man” could operate the aircraft perfectly, real variation got in the way. A shorter pilot would have trouble seeing, while a pilot with longer arms and legs would have to squish themselves to fit. </p>
<p>While the Air Force assumed most of its pilots would be close to average along all key dimensions, it found that out of 4,063 pilots, <a href="https://books.google.com/books/about/The_Average_Man.html?id=NxmdHAAACAAJ">zero were average</a>.</p>
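<p>A rough back-of-the-envelope calculation suggests why a fully average pilot is so unlikely. The sketch below rests on assumptions that are not stated in this article: that 10 body dimensions are checked, that a pilot counts as "about average" on one dimension with probability 0.30, and that the dimensions are independent (real body dimensions are correlated, so this is only an approximation). Only the pilot count comes from the text above.</p>

```python
# Assumed for illustration only (not stated in the article):
NUM_DIMENSIONS = 10   # hypothetical number of body dimensions checked
P_MIDDLE = 0.30       # hypothetical chance of being "about average" on one dimension

# From the article:
NUM_PILOTS = 4063

# Simplifying assumption: dimensions are independent of each other,
# so the per-dimension probabilities multiply.
p_average_on_all = P_MIDDLE ** NUM_DIMENSIONS
expected_average_pilots = NUM_PILOTS * p_average_on_all

print(f"P(average on all dimensions)  = {p_average_on_all:.2e}")
print(f"Expected fully average pilots = {expected_average_pilots:.3f}")
```

<p>Under these assumptions the expected number of fully average pilots is far below one, so finding none among thousands is unsurprising.</p>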
<p>The Air Force solved the problem by designing for variation – it designed adjustable seats to account for the real variation among pilots.</p>
<p>While adjustable seats might seem obvious now, this “average man” thinking still causes problems today. In the U.S., women have <a href="https://doi.org/10.2105/AJPH.2011.300275">about 50% higher odds of severe injury</a> than men in automobile accidents.</p>
<p>The Government Accountability Office blames this disparity on crash-test practices, where female passengers are crudely represented using a <a href="https://www.gao.gov/products/gao-23-105595">scaled version of a male dummy</a>, much like the Air Force’s “average man.” The first female crash-test dummy <a href="https://www.npr.org/2022/11/01/1133375223/the-first-female-crash-test-dummy-has-only-now-arrived">was introduced in 2022</a> and has yet to be adopted in the U.S.</p>
<p>The average is useful, but it has limitations. For estimating true values or making comparisons across groups, the average is powerful. However, for individuals who exhibit real variability, the average simply doesn’t mean that much.</p>
<p class="fine-print"><em><span>Zachary del Rosario receives funding from the National Science Foundation, and has worked with Citrine Informatics and Toyota Research Institute.</span></em></p>
<p>The average might come in handy for certain data analyses, but is any one person really ‘average’?</p>
<p>Zachary del Rosario, Assistant Professor of Engineering, Olin College of Engineering</p>
<p>Licensed as Creative Commons – attribution, no derivatives.</p>
<hr>
<h1>Where does lightning strike? New maps pinpoint 36.8 million yearly ground strike points in unprecedented detail</h1>
<p>Published: 2024-02-27T19:06:16Z</p>
<figure><img src="https://images.theconversation.com/files/578115/original/file-20240226-30-8qy4my.jpg?ixlib=rb-1.1.0&rect=0%2C18%2C5559%2C3511&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Lightning strikes near St. George, Utah.</span> <span class="attribution"><a class="source" href="https://www.gettyimages.com/detail/photo/curtain-of-lightning-over-city-royalty-free-image/186538582">jerbarber/iStock/Getty Images Plus</a></span></figcaption></figure><p>It’s been a warm day, maybe even a little humid, and the tall clouds in the distance remind you of cauliflower. You hear a sharp crack, like the sound of a batter hitting a home run, or a low rumble reminiscent of a truck driving down the highway. A distant thunderstorm, alive with lightning, is making itself known.</p>
<p>Lightning flashes in thunderstorms <a href="https://indd.adobe.com/view/ddf9619e-36e0-46b4-981d-3458b2532b98">at least 60 times per second</a> somewhere around the planet, sometimes even <a href="https://www.vaisala.com/en/blog/2023-03/revising-record-record-lightning-north-pole">near the North Pole</a>. </p>
<p>Each giant spark of electricity travels through the atmosphere at 200,000 miles per hour. It is hotter than the surface of the sun and delivers thousands of times more electricity than the power outlet that charges your smartphone. That’s why lightning is so dangerous.</p>
<p><a href="https://doi.org/10.1175/WCAS-D-15-0032.1">Lightning kills or injures about 250,000 people</a> around the world every year, most frequently in developing countries, where many people work outside without lightning-safe shelters nearby. In the United States, <a href="http://lightningsafetycouncil.org/LSC-LightningFatalities.html">an average of 28 people were killed by lightning every year between 2006 and 2023</a>. Each year, insurance pays about <a href="https://www.iii.org/fact-statistic/facts-statistics-lightning">US$1 billion</a> in claims for lightning damage, and around <a href="https://www.nifc.gov/fire-information/statistics/lightning-caused">4 million acres of land</a> burn in lightning-caused wildfires.</p>
<p><iframe id="4FALI" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/4FALI/2/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<p>Yet, estimates of U.S. lightning strikes have varied widely, from about <a href="https://www.noaa.gov/news/new-lightning-tool-tells-striking-story">25 million a year</a>, a number meteorologists have cited since the 1990s, to <a href="https://www.cdc.gov/disasters/lightning/victimdata.html">40 million a year</a>, reported by the Centers for Disease Control and Prevention. That complicates lightning safety and protection efforts.</p>
<p>I’m a meteorologist whose research focuses on <a href="https://experts.news.wisc.edu/experts/chris-vagasky">understanding lightning behavior</a>. In a new study, my colleagues and I used six years of data from a national lightning detection network that we believe has become precise enough to offer a more accurate <a href="https://doi.org/10.1175/BAMS-D-22-0241.1">picture of lightning strikes across the U.S.</a> That knowledge is essential for improving forecasts and damage prevention.</p>
<h2>How much lightning strikes the US</h2>
<p>To get a clearer picture of how often lightning strikes, it helps to define what a lightning strike is. </p>
<p>Imagine looking out a window at a thunderstorm with cloud-to-ground lightning nearby. The lightning appears to flicker. </p>
<p>A lightning flash is all the cloud-to-ground lightning that occurs within 1 second and a 6-mile radius. Each flicker is a lightning stroke. Each stroke can hit one or more ground strike points, and there can be <a href="https://twitter.com/BBuffingtonNews/status/1126701479232823296?s=20">multiple strokes in the same channel</a>.</p>
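<p>The flash definition above can be turned into a simple grouping rule. The sketch below is a minimal greedy version, not the method used by any real detection network: the stroke records, the function names and the choice to compare each stroke against the first stroke of a flash are all assumptions made for illustration.</p>

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8
MAX_SECONDS = 1.0   # time window from the definition above
MAX_MILES = 6.0     # radius from the definition above


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def group_into_flashes(strokes):
    """Greedily group (time_s, lat, lon) strokes into flashes.

    A stroke joins the first flash whose opening stroke is within
    MAX_SECONDS and MAX_MILES of it; otherwise it starts a new flash.
    """
    flashes = []
    for stroke in sorted(strokes):  # sorted by time, the first tuple field
        t, lat, lon = stroke
        for flash in flashes:
            t0, lat0, lon0 = flash[0]
            close_in_time = t - t0 <= MAX_SECONDS
            close_in_space = distance_miles(lat0, lon0, lat, lon) <= MAX_MILES
            if close_in_time and close_in_space:
                flash.append(stroke)
                break
        else:
            flashes.append([stroke])
    return flashes


# Hypothetical strokes: (seconds since start, latitude, longitude)
strokes = [
    (0.00, 37.10, -113.58),
    (0.15, 37.11, -113.57),   # close in time and space: same flash
    (0.40, 37.10, -113.59),   # close in time and space: same flash
    (0.50, 38.00, -112.00),   # too far away: new flash
    (5.00, 37.10, -113.58),   # too late: new flash
]

flashes = group_into_flashes(strokes)
print(len(flashes), [len(f) for f in flashes])
```

<p>With these made-up records, the first three strokes form one flash, while the distant stroke and the late stroke each start their own.</p>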
<p>Lightning is a large electrical discharge trying to dissipate the electricity in a cloud, so if there is a lot of electricity built up, there can be a lot of lightning to get rid of it all.</p>
<p>Over six years of data from the <a href="https://doi.org/10.1175/JTECH-D-19-0215.1">National Lightning Detection Network</a>, we found that <a href="https://doi.org/10.1175/BAMS-D-22-0241.1">the U.S. averages</a> 23.4 million flashes, 55.5 million strokes and 36.8 million ground strike points each year. </p>
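<p>The three yearly averages above also imply how many strokes and ground strike points a typical flash produces. The short calculation below uses only the figures quoted in the text; the variable names are just for illustration, and the ratios are simple divisions of the national averages rather than values reported by the study.</p>

```python
# Yearly U.S. averages quoted in the text.
flashes = 23.4e6
strokes = 55.5e6
ground_strike_points = 36.8e6

# Simple ratios of the national averages.
strokes_per_flash = strokes / flashes
strike_points_per_flash = ground_strike_points / flashes
strokes_per_strike_point = strokes / ground_strike_points

print(f"strokes per flash:              {strokes_per_flash:.2f}")
print(f"ground strike points per flash: {strike_points_per_flash:.2f}")
print(f"strokes per ground strike point: {strokes_per_strike_point:.2f}")
```

<p>In other words, an average flash has a little over two strokes and hits the ground in roughly one and a half places.</p>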
<h2>Where lightning strikes most often</h2>
<p>The basic ingredients for thunderstorms are warm and moist air near the ground with cooler, drier air above it and a way to lift the warm moist air. Anywhere those ingredients are present, lightning can occur. </p>
<p>This happens most frequently near the Gulf Coast, where the sea breeze helps trigger thunderstorms most days in the summer. Florida in particular is a hot spot for cloud-to-ground lightning strikes. The Miami-Fort Lauderdale area alone had over 120,000 lightning strokes in 2023.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/578331/original/file-20240227-22-uf79br.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="A map shows the most activity in the Gulf Coast states, lessening moving north and westward." src="https://images.theconversation.com/files/578331/original/file-20240227-22-uf79br.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/578331/original/file-20240227-22-uf79br.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=393&fit=crop&dpr=1 600w, https://images.theconversation.com/files/578331/original/file-20240227-22-uf79br.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=393&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/578331/original/file-20240227-22-uf79br.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=393&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/578331/original/file-20240227-22-uf79br.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=494&fit=crop&dpr=1 754w, https://images.theconversation.com/files/578331/original/file-20240227-22-uf79br.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=494&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/578331/original/file-20240227-22-uf79br.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=494&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Frequency of lightning ground strikes per year, averaged over six years, shows the most activity along the Gulf Coast.</span>
<span class="attribution"><a class="source" href="https://doi.org/10.1175/BAMS-D-22-0241.1">Vagasky et al., 2024</a></span>
</figcaption>
</figure>
<p>The Central and Southern U.S. aren’t quite as lightning prone, but they tend to have more thunderstorms and lightning strikes than the North and West of the country, though lightning in the West can be especially destructive <a href="https://www.nifc.gov/fire-information/statistics/lightning-caused">when it sparks wildfires</a>.</p>
<p>The cool waters of the Pacific Ocean, meanwhile, tend to mean few thunderstorms along the West Coast.</p>
<h2>Counting lightning strikes</h2>
<p>To be able to count how much lightning is hitting the ground and where it is doing so, you have to be able to detect it. Luckily, cloud-to-ground lightning is fairly easy to detect – in fact, you may have done it.</p>
<p>When lightning flashes, it acts like a giant radio antenna that sends electromagnetic waves – radio waves – around the world at the speed of light. If you have an AM radio station on during a thunderstorm, you may hear a lot of static.</p>
<p>The <a href="https://doi.org/10.1175/JTECH-D-19-0215.1">National Lightning Detection Network</a> uses strategically placed antennas to listen for these radio waves produced by lightning. It’s now able to locate at least 97% of the cloud-to-ground lightning that occurs across the U.S.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/578369/original/file-20240227-26-hdvklo.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="A map shows the most activity in the Gulf Coast states, lessening over the Great Plains while still high in the mountains." src="https://images.theconversation.com/files/578369/original/file-20240227-26-hdvklo.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/578369/original/file-20240227-26-hdvklo.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=393&fit=crop&dpr=1 600w, https://images.theconversation.com/files/578369/original/file-20240227-26-hdvklo.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=393&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/578369/original/file-20240227-26-hdvklo.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=393&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/578369/original/file-20240227-26-hdvklo.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=494&fit=crop&dpr=1 754w, https://images.theconversation.com/files/578369/original/file-20240227-26-hdvklo.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=494&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/578369/original/file-20240227-26-hdvklo.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=494&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">The average number of cloud-to-ground lightning strike points per flash across the United States between 2017 and 2022.</span>
<span class="attribution"><a class="source" href="https://doi.org/10.1175/BAMS-D-22-0241.1">Vagasky et al., 2024</a></span>
</figcaption>
</figure>
<p>The number of lightning strikes varies year to year depending on the prevailing weather patterns during the spring and summer months, when lightning is most common. There isn’t enough accurate U.S. data yet to say whether there is a trend toward more or less lightning. However, changes in lightning frequency and location can be an indicator of climate change affecting storms and precipitation, which is why the World Meteorological Organization designated lightning as an “<a href="https://gcos.wmo.int/en/essential-climate-variables/about">essential climate variable</a>.”</p>
<h2>Better data can boost safety</h2>
<p>Meteorologists and emergency management teams can use this new data and our analysis to better understand how lightning typically affects their regions. That can help them better forecast risks and prepare the public for thunderstorm hazards. Engineers are also using these results to create better <a href="https://webstore.iec.ch/preview/info_iec62305-3%7Bed2.0%7Den.pdf">lightning protection standards</a> to keep people and property safe.</p>
<p>Lightning strikes are still unpredictable. So, to stay safe, remember: When thunder roars, go indoors.</p>
<p class="fine-print"><em><span>Chris Vagasky previously worked for Vaisala, owner-operator of the National Lightning Detection Network.</span></em></p>
<p>A new study shows how often lightning strikes and how it behaves, often hitting the ground with multiple strikes from the same flash.</p>
<p>Chris Vagasky, Meteorologist, University of Wisconsin-Madison</p>
<p>Licensed as Creative Commons – attribution, no derivatives.</p>
<hr>
<h1>How governments handle data matters for inclusion</h1>
<p>Published: 2024-02-23T13:48:31Z</p>
<figure><img src="https://images.theconversation.com/files/576859/original/file-20240220-30-3ger1q.jpg?ixlib=rb-1.1.0&rect=1785%2C0%2C3779%2C3704&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Do you feel included in how government handles and uses data?</span> <span class="attribution"><a class="source" href="https://newsroom.ap.org/detail/Biden/1577ded6699c49ea835bbf2ee5fbb3a7/photo">AP Photo/Patrick Semansky</a></span></figcaption></figure><p>Governments increasingly rely on large amounts of data to provide services ranging from <a href="https://doi.org/10.1177/1461444820902682">mobility</a> and <a href="https://www.epa.gov/outdoor-air-quality-data/air-data-basic-information">air quality</a> to <a href="https://doi.org/10.1080/01442872.2020.1724928">child welfare</a> and <a href="https://doi.org/10.1177/1473225419883706">policing programs</a>. While governments have always relied on data, their increasing use of algorithms and <a href="https://www.oecd.org/gov/innovative-government/working-paper-hello-world-artificial-intelligence-and-its-use-in-the-public-sector.htm">artificial intelligence</a> has fundamentally changed the way they use data for public services.</p>
<p>These technologies have the potential to improve the effectiveness and efficiency of public services. But if data is not handled thoughtfully, it can lead to inequitable outcomes for different communities because data gathered by governments can <a href="https://doi.org/10.1002/9781119815075.ch46">mirror existing inequalities</a>. To minimize this effect, governments can make inclusion an element of their data practices. </p>
<p>To better understand how data practices affect inclusion, we – scholars of <a href="https://scholar.google.com/citations?hl=en&user=sRReVx0AAAAJ&view_op=list_works&sortby=pubdate">public affairs</a>, <a href="https://scholar.google.com/citations?hl=vi&user=d1PUVQgAAAAJ&view_op=list_works&sortby=pubdate">policy</a> and <a href="https://scholar.google.com/citations?hl=en&user=Uhk-JAcAAAAJ&view_op=list_works&sortby=pubdate">administration</a> – break down <a href="https://doi.org/10.1111/puar.13585">government data practices</a> into four activities: data collection, storage, analysis and use. </p>
<h2>Collection</h2>
<p>Governments collect data about all manner of subjects via surveys, registrations, social media and in <a href="https://www.trafficengland.com/">real time</a> via mobile devices such as sensors, cellphones and body cameras. These datasets provide opportunities to shape social inclusion and <a href="https://www.census.gov/about/what/data-equity.html">equity</a>. For example, open data can be used as a spotlight to <a href="https://doi.org/10.1111/cag.12608">expose</a> <a href="https://www.nytimes.com/interactive/2023/02/12/upshot/child-maternal-mortality-rich-poor.html">health disparities</a> or inequalities in <a href="https://www.nytimes.com/interactive/2023/11/06/business/economy/commuting-change-covid.html">commuting</a>. </p>
<p>At the same time, we found that poor-quality data can worsen inequalities. Data that is incomplete, outdated or inaccurate can result in the underrepresentation of vulnerable groups because they may not have access to the technology used to collect the data. Also, government data collection might lead to <a href="https://www.latimes.com/california/story/2022-08-17/lapd-adopts-new-rules-for-obtaining-using-t">oversurveillance</a> of vulnerable communities. Consequently, some people may <a href="https://doi.org/10.1177/0003122417725865">choose to avoid</a> contributing data to government institutions.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/576861/original/file-20240220-28-vgmlvt.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="A city map with numerous small red, orange and yellow squares" src="https://images.theconversation.com/files/576861/original/file-20240220-28-vgmlvt.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/576861/original/file-20240220-28-vgmlvt.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=498&fit=crop&dpr=1 600w, https://images.theconversation.com/files/576861/original/file-20240220-28-vgmlvt.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=498&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/576861/original/file-20240220-28-vgmlvt.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=498&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/576861/original/file-20240220-28-vgmlvt.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=625&fit=crop&dpr=1 754w, https://images.theconversation.com/files/576861/original/file-20240220-28-vgmlvt.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=625&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/576861/original/file-20240220-28-vgmlvt.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=625&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Predictive policing is an example of government use of data that researchers have found can be biased and inaccurate.</span>
<span class="attribution"><a class="source" href="https://commons.wikimedia.org/wiki/File:Criminaliteits_Anticipatie_Systeem.png">Arnout de Vries/Wikimedia</a></span>
</figcaption>
</figure>
<p>To foster inclusive practices, government practitioners could work with citizens to develop inclusive data collection protocols.</p>
<h2>Storage</h2>
<p>Data storage refers to where and how data is stored by the government, such as in databases or cloud data storage services. We found that government decisions about access to stored data and data ownership might lead to <a href="https://doi.org/10.1111/puar.13615">administrative exclusion</a>, meaning unintentionally restricting citizen access to benefits and services. For example, administrative registration errors in applications for services and the difficulty citizens experience when they attempt to correct errors in stored data can lead to differences in how governments treat them and even a loss of public services. </p>
<p>We also found that personal data might be stored with cloud vendors in data warehouses <a href="https://doi.org/10.1177/2053951720912775">outside the influence of the government organizations</a> that initially created and collected the data. While governments are typically required to follow rigorous data collection practices, data storage companies do not necessarily need to comply with the same standards. </p>
<p>To overcome this problem, governments can set transparency and accountability requirements for data storage that foster inclusion.</p>
<h2>Analysis</h2>
<p>One important way governments analyze data to extract information is by using algorithms. For example, <a href="https://doi.org/10.1177/1354856520933838">predictive policing</a> uses algorithms to predict where crime will occur.</p>
<p>A key question is who is conducting the analysis. Those who might be providing data, such as citizens or civil society organizations, are less likely to analyze the data. Citizens may not have the <a href="https://doi.org/10.1080/23251042.2016.1220849">skills, expertise or the tools</a> to do so. Often, external experts conduct the analysis, and they might be unaware of the historical context, culture and local conditions of the data. In that way, data may also construct and reinforce inequalities.</p>
<p>To foster inclusion, governments could diversify and increase the training of the teams who perform the analyses and write the algorithms so that they can interpret data within its larger historical and political context.</p>
<h2>Using the data</h2>
<p>Finally, governments are using the results of data analysis to inform public service provision. For example, data-driven visualizations, such as maps, might be used to <a href="https://nij.ojp.gov/topics/articles/crime-mapping-crime-forecasting-evolution-place-based-policing">make decisions about where to direct police officers</a>. However, this might also lead to <a href="https://doi.org/10.1177/0003122417725865">disproportionate surveillance</a> of different groups.</p>
<p>Another issue is “<a href="https://doi.org/10.1080/17579961.2021.1898299">function creep</a>.” Data might be collected for one purpose but is often eventually used for other purposes or by other government agencies, possibly leading to misuse of data and the reproduction of inequalities.</p>
<p><a href="https://doi.org/10.1002/asi.24639">Digital literacy programs</a> for both government professionals and the public can facilitate a better understanding of how data is visualized and used.</p>
<h2>Building inclusion into the process</h2>
<p>It is important to highlight that these activities – collection, storage, analysis and use – are linked. Inequalities in the early stages may eventually lead to inequitable outcomes in the form of policies, decisions and services. </p>
<p>Additionally, we found a conundrum: On the one hand, the invisibility of vulnerable groups in data collection can result in inequalities. Therefore, different groups should be included in the activities of the data process. On the other hand, this can also be problematic because digital footprints can lead to oversurveillance of the same groups.</p>
<p>Reconciling these conflicting concerns requires an <a href="https://dialnet.unirioja.es/servlet/articulo?codigo=7400442">ethical reflection</a>: pausing before embracing data and reflecting on its purpose, limitations and long-term implications for inclusion. </p>
<p>The four activities are a repeated rather than linear process in which governments, citizens and third parties embrace <a href="https://doi.org/10.1111/puar.13585">inclusive data strategies</a>. This means looking at what was created, including diverse voices and understanding the analysis, results and consequences of decisions. And it means consistently changing aspects of the process that do not foster inclusion.</p><img src="https://counter.theconversation.com/content/219557/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Suzanne J. Piotrowski has received funding from the National Science Foundation and the Open Government Partnership.</span></em></p><p class="fine-print"><em><span>Gregory Porumbescu has received external funding from the National Science Foundation and the New Jersey Office of the Secretary of Higher Education.</span></em></p><p class="fine-print"><em><span>Erna Ruijer does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>Governments can exclude certain groups of people in policies and services not only by the type of data they collect but also how they collect, store, analyze and use the data.Suzanne J. Piotrowski, Professor of Public Affairs and Administration, Rutgers University - NewarkErna Ruijer, Assistant Professor of Governance, Utrecht UniversityGregory Porumbescu, Associate Professor of Public Affairs and Administration, Rutgers University - NewarkLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2172462024-02-15T21:27:41Z2024-02-15T21:27:41ZTo protect user privacy online, governments need to reconsider their use of opt-in policies<figure><img src="https://images.theconversation.com/files/570987/original/file-20240123-27-22n2mg.jpg?ixlib=rb-1.1.0&rect=654%2C333%2C6928%2C4964&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Almost every website — both for-profit and not-for-profit — commodifies user data.</span> <span class="attribution"><span class="source">(Shutterstock)</span></span></figcaption></figure><p>Internet users — almost all of us — are growing used to seeing requests for consent to gather our information: “Do you accept cookies from this website?” Most of us just click “yes” and continue browsing, rather than bothering with convoluted settings and choices we don’t quite 
understand. </p>
<p>Consumers are not too happy <a href="https://www.wired.com/story/what-do-cookie-preferences-pop-ups-mean/">with these requests</a> and some even <a href="https://www.wired.com/story/avoid-cookie-popups-gdpr/">look for ways to avoid them</a>. These pop-ups are in response to recent data protection and privacy regulations, such as the European Union’s <a href="https://gdpr.eu/">General Data Protection Regulation</a> and <a href="https://oag.ca.gov/privacy/ccpa">California’s Consumer Privacy Act</a>. </p>
<p>Other jurisdictions are looking to implement their own sets of regulations, including Canada, which is in the process of reviewing and modernizing the <a href="https://www.justice.gc.ca/eng/csj-sjc/pa-lprp/modern.html">Privacy Act</a>.</p>
<p>Such regulations are intended to limit the collection of data on users and users’ exposure to third parties, but our analysis suggests these regulations may not be as effective as intended. Our research has found they actually increase the use of third parties that access user data and decrease competition to the detriment of consumers.</p>
<h2>Commodification of user data</h2>
<p>Almost every website — both for-profit and not-for-profit — commodifies user data. Within the first three seconds of opening a web page, <a href="https://doi.org/10.1287/isre.2022.1178">over 80 third parties on average have accessed your information</a>.</p>
<p>Third-party use of user data can be helpful: it is an easy way for companies to earn money, and it can readily connect consumers to the resources they are looking for.</p>
<p>But third parties can also pose serious privacy threats to consumers, which is why privacy legislation is needed. Privacy threats can result in financial harm to users and society at large. For example, discrimination can be based on any detectable characteristic, including psychographic profiles, age, race, gender, religious affiliation and others. </p>
<p>Society at large can be harmed by coordinated attempts to manipulate voters, as was the case with the <a href="https://www.nytimes.com/2018/04/04/us/politics/cambridge-analytica-scandal-fallout.html">Cambridge Analytica scandal</a>.</p>
<p>Moreover, the strategic reaction of websites to regulation is often overlooked. Regulation sets off a cat-and-mouse game: websites’ responses are not a matter of simple compliance. </p>
<p>If a regulation says a website has to do X, then a website will react to that limitation and do Y while also doing X. Strategic reactions are not necessarily to avoid compliance, but rather to maximize profit in response to new regulatory requirements.</p>
<h2>The impact of privacy policies</h2>
<p>Our research group, which includes Ram Gopal from the University of Warwick, Niam Yaraghi from the University of Miami, and Hooman Hidaji, Sule Kutlu and Ray Patterson from the University of Calgary, has spent years studying website privacy and revenue management.</p>
<p>Previously, <a href="https://doi.org/10.25300/MISQ/2018/13839">we analyzed</a> <a href="https://doi.org/10.1093/jamiaopen/ooab100">the privacy implications</a> <a href="https://doi.org/10.1287/ijoc.2022.1266">of website monetization strategies</a> and the <a href="https://doi.org/10.1145/3382188">prediction of website trustworthiness</a> by <a href="https://doi.org/10.1016/j.dss.2021.113698">observing their third-party usage</a>. Recently, our focus has shifted to studying the impact of data regulation on consumers and websites to understand the impact of new privacy policies.</p>
<p>In our recent study, published in <a href="https://doi.org/10.1287/isre.2022.1178"><em>Information Systems Research</em></a>, we studied the effects of government intervention to protect consumer privacy online. We collected data on third-party utilization by the 100,000 most popular websites globally when California’s Consumer Privacy Act (CCPA) went into effect.</p>
<p>Comparing jurisdictions with and without opt-in policies, we found that implementing opt-in policies had an unintended effect on the use of third parties: the number of third parties increased significantly for websites accessed from California after CCPA went into effect.</p>
<p>We also found that, in markets where some users had relatively low privacy concerns, opt-in laws had the unintended consequence of increasing the number of third parties, thereby increasing the privacy exposure of users. </p>
<h2>Learning from past mistakes</h2>
<p>Our findings have important implications for policymakers involved in data protection and privacy regulation. In Canada, where privacy regulation is not yet finalized, there is an opportunity to learn from the mistakes of other regulators. </p>
<p>As our research has found, opt-in policies are counterproductive in addressing third-party data-sharing concerns and can harm competition. Instead, we recommend a mix of more precisely targeted policies rather than the currently preferred one-size-fits-all approach. </p>
<p>More precisely targeted mechanisms, such as limited consent requirements and subsidizing websites in particular sectors or industries, motivate competing websites to improve their third-party data sharing. Website subsidization acts like a precise tool, allowing policymakers to impact specific target markets. </p>
<p>Opt-in policies, on the other hand, are more comparable to a sledgehammer that uniformly affects all market segments. Rather than globally implementing legislation, we advocate for a combination of policies and local subsidies that are better suited to an industry’s specific needs.</p><img src="https://counter.theconversation.com/content/217246/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Raymond A. Patterson received financial support from the Social Sciences and Humanities Research Council of Canada and the Haskayne School of Business at the University of Calgary. </span></em></p><p class="fine-print"><em><span>Hooman Hidaji receives funding from the Social Sciences and Humanities Research Council of Canada.</span></em></p><p class="fine-print"><em><span>Niam Yaraghi is a non-resident senior fellow at the Brookings Institution.</span></em></p><p class="fine-print"><em><span>Ram Gopal receives funding from The Gillmore Centre for Financial Technology at the Warwick Business School. </span></em></p><p class="fine-print"><em><span>Sule Nur Kutlu does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>New research shows that opt-in policies may not be as effective as intended when it comes to data protection and privacy regulations.Raymond A. 
Patterson, Professor, Area Chair, Business Technology Management, Haskayne School of Business, University of CalgaryHooman Hidaji, Assistant Professor of Business Technology Management, University of CalgaryNiam Yaraghi, Assistant Professor of Business Technology, Miami Herbert Business School, University of MiamiRam Gopal, Professor of Information Systems Management, Warwick Business School, University of WarwickSule Nur Kutlu, Assistant Professor, Haskayne School of Business, University of CalgaryLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2228112024-02-13T17:20:38Z2024-02-13T17:20:38ZArtificial intelligence needs to be trained on culturally diverse datasets to avoid bias<figure><img src="https://images.theconversation.com/files/574857/original/file-20240212-30-3cdpyu.jpg?ixlib=rb-1.1.0&rect=0%2C45%2C3840%2C2109&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">There is a growing need to address diversity in the datasets used to train artificial intelligence.</span> <span class="attribution"><span class="source">(Shutterstock)</span></span></figcaption></figure><p>Large language models (LLMs) are deep learning artificial intelligence programs, like OpenAI’s ChatGPT. The capabilities of LLMs have developed into quite a wide range, from <a href="https://www.techradar.com/news/i-had-chatgpt-write-my-college-essay-and-now-im-ready-to-go-back-to-school-and-do-nothing">writing fluent essays</a>, through coding to creative writing. <a href="https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01/">Millions of people worldwide use LLMs</a>, and it would not be an exaggeration to say these technologies are transforming work, education and society.</p>
<p>LLMs are trained by reading massive amounts of texts and learning to recognize and mimic patterns in the data. This allows them to generate coherent and human-like text on virtually any topic. </p>
<p>Because the internet is still predominantly English — <a href="https://www.statista.com/statistics/262946/most-common-languages-on-the-internet/">59 per cent of all websites were in English as of January 2023</a> — LLMs are primarily trained on English text. In addition, the vast majority of the English text online comes from users based in the United States, home to <a href="https://www.census.gov/library/publications/2022/acs/acs-50.html">300 million English speakers</a>. </p>
<p>Learning about the world from English texts written by U.S.-based web users, LLMs speak <a href="https://www.pbs.org/speak/seatosea/standardamerican/">Standard American English</a> and have a narrow western, North American, or even U.S.-centric, lens.</p>
<h2>Model bias</h2>
<p>In 2023, ChatGPT, upon learning about a couple dining in a restaurant in Madrid and tipping four per cent, <a href="https://chat.openai.com/share/2969f35f-8ee2-4bc0-a8a7-c44a7078037e">suggested they were frugal, on a tight budget or didn’t like the service</a>. By default, ChatGPT followed the North American standard of a 15 to 25 per cent tip, <a href="https://www.tripsavvy.com/should-you-tip-in-spain-1644349">ignoring the Spanish norm not to tip</a>. </p>
<p>As of early 2024, ChatGPT correctly cites cultural differences when prompted to judge the appropriateness of a tip. It’s unclear if this capability emerged from training a newer version of the model on more data — after all, the web is full of tipping guides in English — or whether OpenAI patched this particular behaviour.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/574868/original/file-20240212-29-mz6yzd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="a screen showing text about ChatGPT Optimizing Language Models for Dialogue" src="https://images.theconversation.com/files/574868/original/file-20240212-29-mz6yzd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/574868/original/file-20240212-29-mz6yzd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/574868/original/file-20240212-29-mz6yzd.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/574868/original/file-20240212-29-mz6yzd.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/574868/original/file-20240212-29-mz6yzd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/574868/original/file-20240212-29-mz6yzd.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/574868/original/file-20240212-29-mz6yzd.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Using data from English-language websites, which are predominantly U.S.-based, informs how LLMs respond to prompts.</span>
<span class="attribution"><span class="source">(Unsplash/Jonathen Kemper)</span></span>
</figcaption>
</figure>
<p>Still, other examples remain that uncover ChatGPT’s implicit cultural assumptions. For example, prompted with a story about guests showing up for dinner at 8:30 p.m., it suggested <a href="https://chat.openai.com/share/3c8db9c7-7c37-4d45-80b2-a891c46fc4fd">reasons that the guests were late</a>, although the time of the invitation was not mentioned. Again, ChatGPT likely assumed they were invited for a standard North American 6 p.m. dinner.</p>
<p>In May 2023, researchers from the University of Copenhagen <a href="https://doi.org/10.18653/v1/2023.c3nlp-1.7">quantified this effect</a> by prompting LLMs with the <a href="https://www.hofstede-insights.com/country-comparison-tool">Hofstede Culture Survey</a>, which measures human values in different countries. Shortly after, researchers from <a href="https://llmglobalvalues.anthropic.com/">AI start-up company Anthropic</a> used the <a href="https://www.worldvaluessurvey.org/wvs.jsp">World Values Survey</a> to do the same. Both works concluded that LLMs exhibit strong alignment with American culture. </p>
<p>A similar phenomenon appears when asking <a href="https://openai.com/dall-e-3">DALL-E 3</a>, an image generation model trained on pairs of images and their captions, to generate an image of a breakfast. This model, which was trained mainly on images from Western countries, generated images of pancakes, bacon and eggs. </p>
<h2>Impacts of bias</h2>
<p>Culture plays a significant role in shaping our communication styles and worldviews. Just as <a href="https://erinmeyer.com/books/the-culture-map/">cross-cultural human interactions can lead to miscommunications</a>, users from diverse cultures who interact with conversational AI tools may feel misunderstood and find the tools less useful. </p>
<p>To be better understood by AI tools, users may adapt their communication styles in a manner similar to how people learned to “Americanize” their foreign accents in order to operate <a href="https://www.washingtonpost.com/graphics/2018/business/alexa-does-not-understand-your-accent/">personal assistants like Siri and Alexa</a>. </p>
<p>As more people rely on LLMs for editing writing, they are likely to <a href="https://theconversation.com/chatgpt-threatens-language-diversity-more-needs-to-be-done-to-protect-our-differences-in-the-age-of-ai-198878">unify how we write</a>. Over time, LLMs run the risk of erasing cultural differences.</p>
<h2>Decision-making and AI</h2>
<p>AI is already in use as the backbone of various applications that make decisions affecting people’s lives, such as <a href="https://www.reuters.com/legal/tutoring-firm-settles-us-agencys-first-bias-lawsuit-involving-ai-software-2023-08-10/">resume filtering</a>, <a href="https://www.open-communities.org/post/press-release-open-communities-reaches-accord-in-case-addressing-artificial-intelligence-communicat">rental applications</a> and <a href="https://www.theguardian.com/technology/2023/oct/23/uk-officials-use-ai-to-decide-on-issues-from-benefits-to-marriage-licences">social benefits applications</a>. </p>
<p>For years, <a href="https://www.penguinrandomhouse.com/books/241363/weapons-of-math-destruction-by-cathy-oneil/">AI researchers have been warning</a> that these models learn not only “good” statistical associations — such as considering experience as a desired property for a job candidate — but also “bad” statistical associations, such as considering <a href="https://www.reuters.com/article/idUSKCN1MK0AG/">women as less qualified for tech positions</a>. </p>
<p>As LLMs are increasingly used for automating such processes, one can imagine that the North American bias learned by these models can result in discrimination against people from diverse cultures. Lack of cultural awareness may lead to AI perpetuating stereotypes and reinforcing societal inequalities. </p>
<h2>LLMs for languages other than English</h2>
<p>Developing LLMs for languages other than English is an <a href="https://txt.cohere.com/aya-multilingual/">important effort</a>, and many such models exist. However, there are several reasons why this should be done in parallel to improving LLMs’ cultural awareness and sensitivity. </p>
<p>First, there is a huge population of English speakers outside of North America who are not represented by English LLMs. The same argument holds for other languages. A French language model would be representative of the culture in France more than the culture in other Francophone regions. </p>
<p>Training LLMs for regional dialects — which <a href="https://doi.org/10.1016/j.jue.2012.05.007">may capture finer-grained cultural differences</a> — is not a feasible solution either. The quality of LLMs is based on the amount of data available, and as such, their quality would be worse for dialects with little online data. </p>
<p>Second, many users whose native language is not English still choose to use English LLMs. Significant breakthroughs in language technologies tend to <a href="https://doi.org/10.18653/v1/2022.emnlp-main.351">start with English before they are applied to other languages</a>. Even then, many languages — such as Welsh, Swahili and Bengali — don’t have enough text online to train high quality models. </p>
<p>Due to either a lack of availability of LLMs in their native languages, or superior quality of the English LLMs, users from diverse countries and backgrounds may prefer to use English LLMs. </p>
<h2>Ways forward</h2>
<p>Our research group at the University of British Columbia is working on enhancing LLMs with culturally diverse knowledge. Together with graduate student <a href="https://meharbhatia.github.io/">Mehar Bhatia</a>, we <a href="https://doi.org/10.18653/v1/2023.emnlp-main.496">trained an AI model</a> on a <a href="https://doi.org/10.1145/3543507.3583535">collection of facts about traditions and concepts in diverse cultures</a>. </p>
<p>Before reading these facts, the AI suggested that a person eating a dutch baby (a type of German pancake) is “disgusting and mean,” and would feel guilty. After training, it said the person feels “full and satisfied.”</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/574866/original/file-20240212-21-lmr4xk.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="a pancake covered in berries" src="https://images.theconversation.com/files/574866/original/file-20240212-21-lmr4xk.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/574866/original/file-20240212-21-lmr4xk.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/574866/original/file-20240212-21-lmr4xk.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/574866/original/file-20240212-21-lmr4xk.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/574866/original/file-20240212-21-lmr4xk.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/574866/original/file-20240212-21-lmr4xk.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/574866/original/file-20240212-21-lmr4xk.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Teaching an AI that a dutch baby was a dish changed its response to learning that someone had consumed one.</span>
<span class="attribution"><span class="source">(Shutterstock)</span></span>
</figcaption>
</figure>
<p>We are currently collecting a large-scale image-captioning dataset with images from 60 cultures, which will help models learn, for instance, about types of breakfasts other than bacon and eggs. Our future research will go beyond teaching models about the existence of culturally diverse concepts to better understand how people interpret the world through the lens of their cultures.</p>
<p>With AI tools becoming increasingly ubiquitous in society, it is imperative that they go beyond the dominating western and North American perspectives. Businesses and organizations throughout many sectors of the economy are adopting AI to automate manual processes and make better evidence-informed decisions using data. Making such tools more inclusive is crucial for the diverse population of Canada.</p><img src="https://counter.theconversation.com/content/222811/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Vered Shwartz does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>The use of large language models like ChatGPT is growing globally. These technologies are trained on datasets that recreate biases — as their use increases, their datasets must become more diverse.Vered Shwartz, Assistant Professor, Computer science, University of British ColumbiaLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2221432024-02-12T23:47:06Z2024-02-12T23:47:06ZWhich day of the week gets the most people to vote? We analysed thousands of international elections to find out<p>In the aftermath of elections, one of the issues usually discussed in the media is the number of people who turned out to vote. This is known as “participation” or “voter turnout”. </p>
<p>Several factors, such as the weather, can affect turnout. For example, the Republican primaries in Iowa on January 15 were held in very cold temperatures (subzero wind chills and a blizzard). Commentators have identified the cold as a factor that <a href="https://www.nytimes.com/2024/01/15/us/politics/iowa-caucus-turnout-cold.html">negatively influenced</a> turnout, as many Republican voters decided to stay at home, even though Iowa is (almost) always cold in January. </p>
<p>The Republican primaries were held not only on a cold day, but on a working Monday. Yes, a Monday. This may not sound all that strange to the US public, who are used to voting on Tuesdays in their general elections, but it might to Australians, who are used to voting on Saturdays. Australia is one of only a few countries that vote on Saturdays, along with Cyprus, Malta, Iceland, Latvia, Slovakia, Taiwan and New Zealand. </p>
<p>But does it matter when we vote? Does it affect voter turnout? Do we know if more people vote during the weekend than, say, on a Tuesday? We analysed data from thousands of elections across the globe to find out.</p>
<h2>What’s the most popular day to hold an election?</h2>
<p>We looked around the world to see when people vote. We collected turnout data for 3,217 national elections between 1945 and 2020 in 190 countries. We then collated the data and created an <a href="https://gdturnout.com/">original dataset</a> on turnout.</p>
<p>The first thing we can assess is which day of the week most global elections are held.</p>
<p><iframe id="0YM9T" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/0YM9T/1/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<p>The graph shows that, in general, voting takes place on weekends (more than 60% of elections), with Sunday being the preferred day. The day on which the fewest elections are held is Friday.</p>
<p>We could also examine how many countries choose a given day of the week to hold their elections. The graph below shows that 94 countries chose a Sunday for polling day, while just eight went with a Friday.</p>
<p><iframe id="CXAje" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/CXAje/1/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<p>Interestingly, this preference for Sunday elections is not evident in countries with a significant Protestant Anglo cultural influence, in which public activities other than going to church tended to be restricted on Sundays. For example, in Australia, everything used to be closed on Sundays: bars, cinemas, shops, and there were no sporting events (the restrictions were gradually lifted from the 1980s). </p>
<p><iframe id="wuEkH" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/wuEkH/1/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<h2>How does that affect voter turnout?</h2>
<p>So is there any relationship between the day on which you vote and participation? </p>
<p>The studies currently available show varying results. For example, a 2004 <a href="https://www.cambridge.org/core/books/voter-turnout-and-the-dynamics-of-electoral-competition-in-established-democracies-since-1945/7171DEFC791953CCF4071B5614764F94">study</a> that considered 29 countries found that when the election was held on a Sunday, participation was higher. However, when the analysis was expanded to 63 countries, the day of the election did not seem to affect participation.</p>
<p><iframe id="56dk5" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/56dk5/3/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<p>As the graph above shows, the median voter turnout is around 70% for every day of the week. </p>
<p>For example, the average participation on Sundays was 71.6% while on Fridays it was 70%. </p>
<p>Therefore, it does not appear that the day on which the election is held is related to the level of participation. </p>
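<p>As a rough illustration of the comparison described above, the sketch below groups turnout figures by weekday and takes the median of each group. The records are invented toy values, not figures from our dataset, and the function name is hypothetical.</p>

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy records (weekday, turnout %). These are NOT taken from
# the authors' dataset; they only illustrate the grouping step.
elections = [
    ("Sunday", 74.0), ("Sunday", 69.0), ("Sunday", 71.0),
    ("Friday", 68.0), ("Friday", 72.0),
    ("Tuesday", 66.0), ("Tuesday", 70.0), ("Tuesday", 75.0),
]


def median_turnout_by_day(records):
    """Group turnout figures by weekday and return the median for each day."""
    by_day = defaultdict(list)
    for day, turnout in records:
        by_day[day].append(turnout)
    return {day: median(values) for day, values in by_day.items()}


medians = median_turnout_by_day(elections)
for day, value in sorted(medians.items()):
    print(f"{day}: {value:.1f}%")
```

<p>With real data, similar medians across days would be consistent with the finding that polling day and participation are not obviously related, although a simple grouping like this cannot control for the other factors at play.</p>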
<p>This answer is simplified, of course. We are mixing democracies and authoritarian countries, places where there is mandatory voting and places where there is not, presidential and parliamentary systems, and countries that hold elections with either one or two rounds, among many other factors. </p>
<h2>Why does this matter?</h2>
<p>When to vote (and whether to vote or not) is an issue that matters. Participation is unequal and is used strategically, especially in countries where voting is not compulsory. In some countries, wealthier voters tend to show <a href="https://www.journals.uchicago.edu/doi/full/10.1086/701961">higher participation rates</a> than poorer voters. This is a pattern that has been <a href="https://onlinelibrary.wiley.com/doi/full/10.1111/ajps.12134">identified</a> in the United States and Europe but not necessarily in other countries such as India or Indonesia. </p>
<p>Participation is strategically used by political parties promoting (or disincentivising) voting in different ways and to differing extents. There are blatant examples of parties strategically managing voting around the world. In Kenya, polling booths in some areas have <a href="https://journals.sagepub.com/doi/full/10.1177/0010414020938083">more staff than others</a>, skewing how many people are able to cast a vote before closing time. In the US, strict voter ID laws have acted to <a href="https://www.journals.uchicago.edu/doi/full/10.1086/688343">suppress the votes</a> of some racial and ethnic groups.</p>
<p>Some instances are more insidious. In 2008, Spanish campaign director Elorriaga Pisarik, in referring to undecided socialist voters, <a href="https://cadenaser.com/ser/2008/02/29/espana/1204246224_850215.html">declared</a> “if we can generate enough doubts about the economy, immigration and nationalist issues, maybe they – the socialist voters – will stay at home”. </p>
<p>Participation also has an intrinsic value. Imagine two scenarios: one in which the candidate wins the election with 51% support, in an election that had a 90% turnout. Then imagine another election where the candidate wins by the same margin but in an election with a 30% turnout. Although both victories are valid, we tend to attribute greater legitimacy to the one that has brought more people to the polls. </p>
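<p>The arithmetic behind these two scenarios can be sketched in a few lines. This assumes that “51% support” means 51% of votes cast and that turnout is measured as a share of eligible voters; the function name is our own.</p>

```python
def share_of_electorate(vote_share, turnout):
    """Fraction of all eligible voters who actively backed the winner.

    Assumes vote_share is the winner's share of votes cast and turnout is
    the share of eligible voters who cast a vote.
    """
    return vote_share * turnout


high_turnout = share_of_electorate(0.51, 0.90)
low_turnout = share_of_electorate(0.51, 0.30)

print(f"90% turnout: {high_turnout:.1%} of eligible voters")
print(f"30% turnout: {low_turnout:.1%} of eligible voters")
```

<p>Under these assumptions, the winner in the first scenario is backed by about 46% of eligible voters, and in the second by about 15%.</p>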
<p>In a year when more than half the world’s population <a href="https://theconversation.com/more-than-4-billion-people-are-eligible-to-vote-in-an-election-in-2024-is-this-democracys-biggest-test-220837">will vote</a> in a national election, it’s worth including data in the global discussion.</p><img src="https://counter.theconversation.com/content/222143/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Ferran Martinez i Coma receives funding from Australian Research Council DP190101978. </span></em></p><p class="fine-print"><em><span>Diego Leiva does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>Voter turnout, or the number of people who turn up to vote in an election, is key to upholding democratic values. Does it matter on which day a country goes to the polls?Ferran Martinez i Coma, Senior Lecturer in Political Science, Griffith UniversityDiego Leiva, Postdoctoral Research Fellow, Griffith UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2209332024-02-01T17:17:06Z2024-02-01T17:17:06ZWe are living in a ‘digital dark age’ – here’s how to protect your photos, videos and other data<figure><img src="https://images.theconversation.com/files/569645/original/file-20240116-25-6pkv4l.jpg?ixlib=rb-1.1.0&rect=215%2C146%2C5535%2C3681&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">If your computer crashed, would you be able to access your data?</span> <span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/frustrated-worried-young-woman-looks-laptop-1047662398">Nebojsa Tatomirov/Shutterstock</a></span></figcaption></figure><p>If you have grown up with social media, chances are you have taken more photos in the last couple of decades than you will ever remember. When mobile phones suddenly became cameras too, social media turned into a community photo album, with memories kept online forever and ever. Or so we thought. </p>
<p>In 2019, <a href="https://mashable.com/article/myspace-data-loss#:%7E:text=leading%20social%20network.-,Millions%20of%20songs%2C%20photos%2C%20and%20videos%20that%20were%20uploaded%20to,with%20no%20chance%20of%20recovery.">MySpace lost 12 years’ worth of music and photos</a>, affecting over 14 million artists and 50 million tracks. If Instagram or the entire internet suddenly disappeared, would you be able to access your precious memories? </p>
<p>We are living in a <a href="https://www.giantfreakinrobot.com/cltr/digital-dark-age.html">“digital dark age”</a>, a term popularised by information and communication specialist Terry Kuny. Back in 1997, Kuny <a href="https://archive.ifla.org/IV/ifla63/63kuny1.pdf">warned</a> we were “moving into an era where much of what we know today, much of what is coded and written electronically, will be lost forever”. </p>
<p>He argued that, like monks from the Middle Ages who preserved books (and therefore, knowledge), we must preserve digital objects of today. Otherwise, future generations will be left with gaps in knowledge about our present-day lives.</p>
<hr>
<figure class="align-right ">
<img alt="Quarter life, a series by The Conversation" src="https://images.theconversation.com/files/451343/original/file-20220310-13-1bj6csd.png?ixlib=rb-1.1.0&q=45&auto=format&w=237&fit=clip" srcset="https://images.theconversation.com/files/451343/original/file-20220310-13-1bj6csd.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=600&fit=crop&dpr=1 600w, https://images.theconversation.com/files/451343/original/file-20220310-13-1bj6csd.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=600&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/451343/original/file-20220310-13-1bj6csd.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=600&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/451343/original/file-20220310-13-1bj6csd.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=754&fit=crop&dpr=1 754w, https://images.theconversation.com/files/451343/original/file-20220310-13-1bj6csd.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=754&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/451343/original/file-20220310-13-1bj6csd.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=754&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption"></span>
</figcaption>
</figure>
<p><em><strong><a href="https://theconversation.com/uk/topics/quarter-life-117947?utm_source=TCUK&utm_medium=linkback&utm_campaign=UK+YP2022&utm_content=InArticleTop">This article is part of Quarter Life</a></strong>, a series about issues affecting those of us in our twenties and thirties. From the challenges of beginning a career and taking care of our mental health, to the excitement of starting a family, adopting a pet or just making friends as an adult. The articles in this series explore the questions and bring answers as we navigate this turbulent period of life.</em></p>
<p><em>You may be interested in:</em></p>
<p><em><a href="https://theconversation.com/instapoetry-is-successful-and-theres-nothing-wrong-with-that-222012?utm_source=TCUK&utm_medium=linkback&utm_campaign=UK+YP2022&utm_content=InArticleTop">Instapoetry is successful and there’s nothing wrong with that</a></em></p>
<p><em><a href="https://theconversation.com/tiktoks-pomegranate-obsession-the-trendy-fruit-was-also-big-during-the-renaissance-to-talk-about-female-fertility-221440?utm_source=TCUK&utm_medium=linkback&utm_campaign=UK+YP2022&utm_content=InArticleTop">TikTok’s pomegranate obsession: the trendy fruit was also big during the Renaissance to talk about female fertility</a></em></p>
<p><em><a href="https://theconversation.com/is-someone-using-your-pictures-to-catfish-your-rights-when-it-comes-to-fake-profiles-and-social-media-stalking-214418?utm_source=TCUK&utm_medium=linkback&utm_campaign=UK+YP2022&utm_content=InArticleTop">Is someone using your pictures to catfish? Your rights when it comes to fake profiles and social media stalking</a></em></p>
<hr>
<p>People often say the “internet is forever”, but digital artefacts like photos and videos are actually unstable and non-permanent. You’ve likely encountered <a href="https://www.cjr.org/analysis/linkrot-content-drift-new-york-times.php">“linkrot”</a>, when a URL to an important source leads to a now-deleted webpage. Hardware becomes obsolete, degraded and upgraded over time. <a href="https://www.pcmag.com/encyclopedia/term/bit-rot">Bit-rot</a> (also called data or file rot, or data degradation) means we may have no physical means to access our past data. </p>
<p>Many people already find it hard to use technology and software that has reached its <a href="https://endoflife.software/what-is-eol">“end of life”</a>. With the lack of backwards compatibility (when updated technology or software cannot support older versions), how will future generations access old data stored in obsolete formats?</p>
<p>We are also seeing issues emerge related to ownership of data, particularly when controlled by private corporations. Families have faced <a href="https://www.theguardian.com/media/2022/dec/05/uk-families-call-for-easier-access-to-deceased-childrens-social-media-history">legal difficulties</a> accessing the social media accounts of deceased loved ones. Similarly, if Spotify or Netflix shut down tomorrow, you wouldn’t own any of the songs or films you stream on a daily basis.</p>
<h2>A digital life</h2>
<p>For a number of reasons, you may not even notice that we are in the middle of a new digital dark age.</p>
<p>From <a href="https://home.google.com/intl/en_uk/welcome/">Google smart homes</a> to <a href="https://commission.europa.eu/strategy-and-policy/coronavirus-response/travel-during-coronavirus-pandemic/contact-tracing-and-warning-apps-during-covid-19_en">contact-tracing technology</a>, life is increasingly <a href="https://www.penguin.co.uk/books/183571/to-save-everything-click-here-by-morozov-evgeny/9780241957707">digital</a>. Without an app, internet or social media account, it is difficult to verify your identity and gain access to data – even your own. Many people don’t even consider non-digital means of recording, proving and living their existence.</p>
<p>With <a href="https://help.instagram.com/1729008150678239">Instagram stories</a> disappearing after 24 hours, and <a href="https://help.snapchat.com/hc/en-gb/articles/7012334940948-When-does-Snapchat-delete-Snaps-and-Chats-#:%7E:text=Snapchat%20servers%20are%20designed%20to,bounce%20Snaps%20in%20your%20conversation.">Snapchat</a> and WhatsApp’s <a href="https://faq.whatsapp.com/673193694148537">vanishing messages</a> features, you are probably used to data disappearing instantly.</p>
<p>With the growing need for environmental sustainability, turning to digital formats seems like the responsible solution to <a href="https://www.greenbiz.com/article/more-zoom-less-climate-gloom-moving-events-online-can-drastically-cut-carbon">reducing our carbon footprint</a> – though have you thought about the <a href="https://www.nhm.ac.uk/discover/what-is-ewaste-and-what-can-we-do-about-it.html#:%7E:text=E%2Dwaste%20(electronic%20waste),air%20conditioners%20to%20children's%20toys.">e-waste</a> you produce?</p>
<p>Even with data protection laws now giving people the <a href="https://gdpr-info.eu/art-17-gdpr/">right to have personal data erased</a>, many may <em>not</em> want their data to be preserved forever. <a href="https://www.experian.com/blogs/ask-experian/can-facial-recognition-protect-you-from-fraud/">Identity theft</a> can occur with social media content that reveals <a href="https://www.trendmicro.com/vinfo/au/security/news/internet-of-things/leaked-today-exploited-for-life-how-social-media-biometric-patterns-affect-your-future">biometric or other personal data</a>. And that’s not to mention cyberstalking, cyberbullying, the distribution of “revenge porn” and online grooming.</p>
<p>But despite all these very understandable concerns, there are still good reasons to think seriously about how you preserve the digital artefacts and data that are most important to you.</p>
<figure class="align-center ">
<img alt="A young man smiles while browsing a selection of vinyl records in a shop" src="https://images.theconversation.com/files/571181/original/file-20240124-29-eyipve.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/571181/original/file-20240124-29-eyipve.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/571181/original/file-20240124-29-eyipve.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/571181/original/file-20240124-29-eyipve.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/571181/original/file-20240124-29-eyipve.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/571181/original/file-20240124-29-eyipve.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/571181/original/file-20240124-29-eyipve.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">If Spotify crashed tomorrow, how would you listen to your favourite albums?</span>
<span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/happy-ethnic-man-choosing-vinyl-record-2323959907">Guillem de Balanzo/Shutterstock</a></span>
</figcaption>
</figure>
<h2>Protecting and preserving your old data</h2>
<p>If you misplaced your phone, could you remember important phone numbers, or navigate streets when lost? If the answer is no, you may want to think more carefully about data preservation. </p>
<p>This is something we should all think about, and not just leave it to digital archivists and preservationists. When organised efforts are made to preserve data, <a href="https://theconversation.com/it-will-take-critical-thorough-scrutiny-to-truly-decolonise-knowledge-78477">who decides what should be preserved</a> can become a political issue as much as a technological one. </p>
<p>When it comes to your own digital memories, there are services you can use and steps you can take to preserve data from being lost to history:</p>
<ul>
<li><p>Keep multiple copies (and formats) of important data across <a href="https://www.theguardian.com/technology/askjack/2018/apr/12/how-can-i-store-my-digital-photos-for-ever-external-hard-drive">different devices</a>: SD cards, USB thumb drives, DVD/Blu-ray discs, external hard drives and NAS (network attached storage) boxes. This has to be coupled with ensuring you regularly migrate important data to the newest device or format (remember, <a href="https://www.techadvisor.com/article/741570/bit-rot-how-to-avoid-the-slow-death-of-hard-drives-and-ssds.html">avoid bit-rot</a>).</p></li>
<li><p>Try (re)discovering <a href="https://saxdavid.com/therevengeofanalog">analogue trends</a> – board games alongside video games, vinyl records over streaming music, or celebrate the <a href="https://www.yahoo.com/lifestyle/polaroid-camera-made-comeback-yours-103607235.html">resurgence of Polaroid cameras</a>. Many services are available to convert digital photos into printed photos, albums and physical artwork.</p></li>
<li><p>Embrace the ethos of the <a href="https://www.ccdc.cam.ac.uk/solutions/about-the-csd/fair-data-principles/">FAIR principles</a> – findable, accessible, interoperable and reusable – so that you and others can easily locate and access any important data you wish to preserve.</p></li>
<li><p>Finally, if you come across a rotten link or other missing data, you can explore data preservation initiatives like the Long Now Foundation’s publicly accessible <a href="https://rosettaproject.org/?ref=longnow.org">Rosetta Project</a> or the <a href="https://archive.org/">Internet Archive</a>, a non-profit library of free digital books, movies, software, music and websites.</p></li>
</ul><img src="https://counter.theconversation.com/content/220933/count.gif" alt="The Conversation" width="1" height="1" />
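<p>For readers comfortable with a little scripting, one way to check that multiple copies of a file still match is to compare checksums. The sketch below is a minimal example using Python's standard <code>hashlib</code> module; the function names are our own, and it only detects that a copy has changed, not which copy is the good one.</p>

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatched_copies(original, copies):
    """Return the paths of copies whose contents differ from the original file."""
    expected = sha256_of(original)
    return [str(copy) for copy in copies if sha256_of(copy) != expected]
```

<p>You might run <code>find_mismatched_copies</code> on a photo and its backups (the paths are yours to supply) each time you migrate data to a new device; an empty list means every copy is byte-for-byte identical to the original.</p>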
<p class="fine-print"><em><span>Esperanza Miyake does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>Future generations may not be able to access the digital artefacts we create today.Esperanza Miyake, Chancellor's Fellow - Journalism, Media and Communication, University of Strathclyde Licensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2215072024-01-30T16:52:38Z2024-01-30T16:52:38ZAI companies are merging or collaborating to even out the gap in access to vital datasets<figure><img src="https://images.theconversation.com/files/572123/original/file-20240130-25-xtsqyu.jpg?ixlib=rb-1.1.0&rect=17%2C5%2C3817%2C2149&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">
</span> <span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/futuristic-concept-data-center-chief-technology-2212675527">Gorodenkoff / Shutterstock</a></span></figcaption></figure><p>Some recent mergers, acquisitions and investments in the business world have highlighted the strategic value of data to companies. These businesses are not just buying assets or market share – they are also acquiring or investing in large, complementary datasets. This process is known in the business world as horizontal integration.</p>
<p>This integration can drive innovation and provide competitive advantages. It can also open up new revenue streams. Some examples include Microsoft’s acquisitions of <a href="https://news.microsoft.com/announcement/microsoft-buys-linkedin/">LinkedIn</a> and GitHub as well as Amazon’s acquisitions of Whole Foods and the Washington Post. Then there has been Discovery Communications’ merger with Warner Brothers, IBM’s investment in <a href="https://www.theverge.com/2024/1/25/24050445/google-cloud-hugging-face-ai-developer-access">Hugging Face</a> and <a href="https://www.forbes.com/sites/qai/2023/10/31/google-invests-in-anthropic-for-2-billion-as-ai-race-heats-up/?sh=35cb756e664e">Google’s investment in Anthropic</a>.</p>
<p>As the last two examples illustrate, data is extremely important for AI companies. It’s vital for <a href="https://www.forbes.com/sites/forbestechcouncil/2022/06/27/training-data-the-overlooked-problem-of-modern-ai/?sh=37d47ee8218b">“training”, or improving, AI systems</a>. Training AI systems on large, new, varied data sets allows companies to <a href="https://hai.stanford.edu/news/data-centric-ai-ai-models-are-only-good-their-data-pipeline">develop more advanced, more powerful AI systems</a>.</p>
<p>But against the background of this scramble, there is also a growing consensus that some form of regulation is needed to address the ethical, safety and fairness concerns associated with AI.</p>
<p>Regulating AI, however, presents a unique set of challenges. This is mainly because it is built on intangible elements such as software and algorithms. These elements can be easily modified, replicated and distributed across borders with few physical traces. This helps them evade traditional regulatory mechanisms that rely on controlling physical goods or specific locations. </p>
<p>Yet a promising approach to regulating AI is one that would focus on controlling access to the very data that is the lifeblood of AI development. Since data is behind the rise of <a href="https://www.indeed.com/career-advice/career-development/horizontal-integration">horizontal integration</a> as well as fuelling the growth and sophistication of AI systems, its concentration in the hands of a few entities can lead to monopolistic dominance. In short, it gives too much power to too few companies.</p>
<h2>Antitrust model</h2>
<p>To mitigate this, <a href="https://www.europarl.europa.eu/news/en/headlines/society/20230601STO93804/eu-ai-act-first-regulation-on-artificial-intelligence">regulatory frameworks</a> could be designed that resemble existing <a href="https://competition-policy.ec.europa.eu/antitrust-and-cartels_en#:%7E:text=Antitrust%20rules%20prohibit%20agreements%20between,and%20the%20abuse%20of%20dominance.">antitrust laws</a> – but focused around data aggregation. They would help ensure a diverse and competitive landscape in the access to data. By preventing any single company from amassing an overwhelming data advantage, these regulations would aim to foster a more balanced field. Innovation must be allowed to thrive without being stifled by monopolistic control.</p>
<p>To properly achieve this outcome, we suggest that regulators need to look at limiting horizontal integration. As AI technologies continue to evolve and the demand for diverse and extensive datasets grows, companies will increasingly be motivated to pursue horizontal integration. </p>
<p>This trend towards integration not only consolidates data assets but also potentially reduces competition, as fewer companies come to control larger shares of valuable data. Therefore, regulatory scrutiny of such mergers and acquisitions becomes essential to ensure a competitive landscape where data does not become excessively concentrated in a few hands.</p>
<p>It’s important to note that the trend towards horizontal integration is already moderated to some extent by regulatory and ethical considerations, particularly around data privacy and existing antitrust laws. These considerations play a critical role in shaping the extent and nature of integration. </p>
<figure class="align-center ">
<img alt="AI representation." src="https://images.theconversation.com/files/572200/original/file-20240130-23-vb6lpu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/572200/original/file-20240130-23-vb6lpu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/572200/original/file-20240130-23-vb6lpu.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/572200/original/file-20240130-23-vb6lpu.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/572200/original/file-20240130-23-vb6lpu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/572200/original/file-20240130-23-vb6lpu.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/572200/original/file-20240130-23-vb6lpu.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Powerful AI systems rely on high quality</span>
<span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/man-using-website-software-technology-ai-2300796931">Deemerwha studio / Shutterstock</a></span>
</figcaption>
</figure>
<h2>The benefits of more data</h2>
<p>When organisations integrate horizontally, they access a more comprehensive pool of data, filling gaps present in individual datasets. This amalgamation not only improves the reliability and accuracy of data but also broadens the perspective, offering deeper insights that are crucial for making informed decisions. </p>
<p>For instance, in merging customer demographic data with purchase history, companies can gain a more nuanced understanding of consumer behaviour. This is invaluable in today’s customer-centric market landscape.</p>
<p>Horizontal integration for AI also helps modern companies improve their operational efficiency. Companies with similar markets or customer bases can optimise their processes based on richer, more comprehensive data insights. </p>
<p>This leads to improved efficiency in data collection and analysis. This is because making use of existing complementary datasets is more efficient and cost-effective than generating new data from scratch. Companies that successfully use combined datasets can better understand and predict customer needs and market trends. This advantage is especially important in industries where innovation and adaptability are key to survival and growth.</p>
<h2>A balancing act</h2>
<p>Despite the benefits for companies, the potential harm to market competition and consumer welfare from data consolidation necessitates a response. Centralising extensive datasets under dominant entities can potentially marginalise smaller competitors and stifle market diversity.</p>
<p>It also poses privacy concerns and amplifies the risk of market manipulation, diminishing consumer choice and impeding innovation. The potential benefits of data consolidation for customers include enhanced product offerings and personalised services. It is crucial that regulatory frameworks adopt a “rule of reason” approach. They would diligently scrutinise these activities under merger laws or abuse of dominance laws. This ensures a balanced market ecosystem, mitigates potential harm and safeguards competition and consumer interests.</p>
<p>In conclusion, the argument for horizontal integration in the age of AI is compelling. The synthesis of complementary datasets through such integration offers enhanced data quality and improved AI and machine learning capabilities. It also provides operational efficiencies and strategic market advantages. </p>
<p>But we must take a balanced approach, weighing the benefits of integration against the ethical implications and regulatory compliance. The future of business in the AI era will likely be characterised by a continued trend towards strategic integration, shaping the way companies operate and compete. </p>
<p>If left unchecked, horizontal integration will concentrate the power of data in the hands of a few. This will raise safety concerns and is likely to inhibit competition. But regulation based around antitrust principles – where an organisation steps in to prevent companies from behaving in ways that exclude competitors – could help prevent this.</p><img src="https://counter.theconversation.com/content/221507/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>The authors do not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and have disclosed no relevant affiliations beyond their academic appointment.</span></em></p>There are risks in huge datasets sitting in the hands of just a few companies.Karl Schmedders, Professor of Finance, International Institute for Management Development (IMD)José Parra-Moyano, Professor, International Institute for Management Development (IMD)Michael Wade, Professor of Innovation and Strategy, Cisco Chair in Digital Business Transformation, International Institute for Management Development (IMD)Licensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2193562024-01-25T16:01:30Z2024-01-25T16:01:30ZSpreadsheet errors can have disastrous consequences – yet we keep making the same mistakes<figure><img src="https://images.theconversation.com/files/570338/original/file-20240119-21-5frvd3.jpg?ixlib=rb-1.1.0&rect=75%2C0%2C8386%2C5573&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Making mistakes with spreadsheets can not only cause us personal frustration but can also lead to some very serious consequences. </span> <span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/sad-tired-medical-coding-bill-spreadsheets-2197496803">Andrey_Popov/Shutterstock</a></span></figcaption></figure><p>Spreadsheet blunders aren’t just frustrating personal inconveniences. They can have serious consequences. And in the last few years alone, there have been a myriad of spreadsheet horror stories. </p>
<p>In August 2023, the Police Service of Northern Ireland <a href="https://www.bbc.co.uk/news/uk-northern-ireland-66445452">apologised</a> for a data leak of “monumental proportions” when a spreadsheet that contained statistics on the number of officers it had and their rank was shared online in response to a freedom of information request. </p>
<p>There was a second overlooked tab on the spreadsheet that contained the personal details of 10,000 serving police officers. </p>
<p>A <a href="https://anro.wm.hee.nhs.uk/Portals/3/Anaesthetics%20Recruitment%20-%20Significant%20Incident%20Report%20-%20Dec%2021.pdf?ver=hqDrm_-syzeLmBcfbigWJA%3D%3D">series of spreadsheet errors</a> disrupted the recruitment of trainee anaesthetists in Wales in late 2021. The Anaesthetic National Recruitment Office (ANRO), the body responsible for their selection and recruitment, told all the candidates for positions in Wales they were “unappointable”, despite some of them achieving the highest interview scores.</p>
<p>The blame fell on the process of consolidating interview data. Spreadsheets from different areas lacked standardisation in formatting, naming conventions and overall structure. To make matters worse, data was manually copied and pasted between various spreadsheets, a time-consuming and error-prone process.</p>
<p>ANRO only discovered the blunder when rejected applicants questioned their dismissal letters. The fact that not a single candidate seemed acceptable for Welsh positions should have been a red flag. No testing or validation was apparently applied to the crucial spreadsheet, a simple step that could have prevented this critical error.</p>
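<p>The kind of check that might have raised this red flag can be very small. The sketch below is only an illustration: the column names (<code>region</code>, <code>score</code>, <code>outcome</code>) and the rows are invented and are not taken from the ANRO report. It flags any region in which every candidate received the same outcome, so that a person can look at the data before letters go out.</p>

```python
from collections import defaultdict


def regions_with_uniform_outcome(rows):
    """Return the regions in which every candidate received the same outcome."""
    outcomes = defaultdict(set)
    for row in rows:
        outcomes[row["region"]].add(row["outcome"])
    # A region with exactly one distinct outcome deserves a manual look.
    return sorted(region for region, seen in outcomes.items() if len(seen) == 1)


# Invented illustration data, not the real recruitment records.
rows = [
    {"region": "Wales", "score": 92, "outcome": "unappointable"},
    {"region": "Wales", "score": 71, "outcome": "unappointable"},
    {"region": "London", "score": 88, "outcome": "appointable"},
    {"region": "London", "score": 40, "outcome": "unappointable"},
]

suspicious = regions_with_uniform_outcome(rows)
print(suspicious)  # ['Wales']
```

<p>A flagged region is not proof of an error, only a prompt to check the underlying spreadsheet by hand.</p>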
<p>In 2021, Crypto.com, an online provider of cryptocurrency, <a href="https://www.theguardian.com/australia-news/2023/sep/24/a-crypto-firm-sent-a-disability-worker-10m-by-mistake-months-later-she-was-arrested-at-an-australian-airport">accidentally transferred</a> US$10.5 million (£8.3 million) instead of US$100 into the account of an Australian customer due to an incorrect number being entered on a spreadsheet. </p>
<p>The clerk who processed the refund for the Australian customer had wrongly entered her bank account number in the refund field in a spreadsheet. It was seven months before the mistake was spotted. The recipient attempted to flee to Malaysia but was stopped at an Australian airport carrying a large amount of cash.</p>
<p>In 2022, Íslandsbanki, a state-owned Icelandic bank, sold a portion of shares that were badly undervalued due to a <a href="https://www.bloomberg.com/news/articles/2022-11-14/bungled-excel-sheet-hurts-profits-from-islandsbanki-sale">spreadsheet error</a>. When consolidating assets from different spreadsheets, the spreadsheet data was not “cleaned” and formatted properly. The bank’s shares were subsequently undervalued by as much as £16 million. </p>
<h2>The dark matter of corporate IT</h2>
<p>These cases are just a fraction of the spreadsheet errors that organisations regularly make. </p>
<p>Spreadsheets represent unknown risks in the form of errors, privacy violations, trade secrets and compliance violations. Yet they are also critical for the way many organisations make their decisions. For this reason, they have been <a href="https://www.igi-global.com/article/end-user-computing/81295">described</a> by experts as the “dark matter” of corporate IT. </p>
<p>Industry <a href="https://www.igi-global.com/article/know-spreadsheet-errors/55750">studies</a> show that 90% of spreadsheets containing more than 150 rows have at least one major mistake. </p>
<p>This is understandable because spreadsheet errors are easy to make but difficult to spot. My <a href="https://aisel.aisnet.org/cais/vol25/iss1/34/">own research</a> has shown that inspecting a spreadsheet’s code is the most effective way of debugging it, but this approach still catches only between 60% and 80% of all errors. </p>
<figure class="align-center ">
<img alt="A close up of Microsoft Excel spreadsheet." src="https://images.theconversation.com/files/570385/original/file-20240119-15-gebegy.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/570385/original/file-20240119-15-gebegy.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=401&fit=crop&dpr=1 600w, https://images.theconversation.com/files/570385/original/file-20240119-15-gebegy.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=401&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/570385/original/file-20240119-15-gebegy.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=401&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/570385/original/file-20240119-15-gebegy.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/570385/original/file-20240119-15-gebegy.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/570385/original/file-20240119-15-gebegy.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">As many as 9 out of 10 spreadsheets are estimated to contain errors.</span>
<span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/new-york-usa-august-18-2017-699112366">PixieMe/Shutterstock</a></span>
</figcaption>
</figure>
<p>Spreadsheets’ appeal doesn’t just exist in the financial world. They are indispensable in <a href="https://eusprig.org/wp-content/uploads/1801.10231.pdf">engineering</a>, <a href="https://ijcis.net/index.php/ijcis/article/view/79">data science</a> and even in <a href="https://ntrs.nasa.gov/citations/20150008644">sending robots</a> to Mars. The key to their success is their flexibility. </p>
<p>Spreadsheet software is constantly evolving, with new features adding to its appeal. For instance, you can now automate many tasks in Excel (the most popular spreadsheet software) using Python scripting.</p>
<p>But given all of the aforementioned problems, isn’t it time for Excel and other spreadsheet software to be sidelined in favour of something more reliable? </p>
<h2>Human error</h2>
<p>The underlying cause of these spreadsheet problems is not the software but human error. </p>
<p>The issue is that most users don’t see the need to plan or test their work. Most users <a href="https://www.igi-global.com/article/errors-operational-spreadsheets/4145">describe</a> their first step in creating a new spreadsheet as merely jumping straight in and entering numbers or code directly. </p>
<p>Many of us don’t think spreadsheets warrant serious attention. This means we become <a href="https://eusprig.org/wp-content/uploads/0804.0941.pdf">complacent</a> and assume there is no need to test, validate or verify our work.</p>
<p><a href="https://www.igi-global.com/gateway/article/3762">Research</a> on “cognitive load”, the amount of mental effort required for a task, shows that building complex spreadsheets demands as much concentration as a GP making a diagnosis. This intense mental strain makes mistakes more likely. But GPs study their profession for many years before becoming qualified while most spreadsheet users are <a href="https://eusprig.org/wp-content/uploads/0803.1862.pdf">self-taught</a>. </p>
<p>To break the cycle of repeated spreadsheet errors, there are several things organisations can do. First, introducing standardisation would help to minimise confusion and mistakes. For example, this would mean consistent formatting, naming conventions and data structures across spreadsheets.</p>
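<p>As a rough illustration of what enforcing a naming convention could look like before spreadsheets are consolidated, the sketch below compares each sheet’s column headers with an agreed list. The sheet names, the headers and the helper functions <code>normalise</code> and <code>mismatched_sheets</code> are all hypothetical, and treating “Candidate ID” and “candidate_id” as the same column is a design assumption rather than a rule from any standard.</p>

```python
def normalise(header):
    """Lower-case a column name and join its words with underscores."""
    return "_".join(header.strip().lower().replace("_", " ").split())


def mismatched_sheets(expected, sheets):
    """Map each deviating sheet to (missing columns, unexpected columns)."""
    expected_set = {normalise(h) for h in expected}
    problems = {}
    for name, headers in sheets.items():
        found = {normalise(h) for h in headers}
        if found != expected_set:
            problems[name] = (
                sorted(expected_set - found),
                sorted(found - expected_set),
            )
    return problems


# Hypothetical agreed columns and two invented regional sheets.
expected = ["candidate_id", "interview_score", "region"]
sheets = {
    "north": ["Candidate ID", "Interview Score", "Region"],
    "south": ["candidate_id", "score", "region"],
}

problems = mismatched_sheets(expected, sheets)
print(problems)  # {'south': (['interview_score'], ['score'])}
```

<p>A check like this could run before any data is copied between files, so that a mislabelled column is caught while it is still cheap to fix.</p>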
<p>Second, improving training is crucial. Equipping users with the knowledge and skills to build robust and accurate spreadsheets could help them identify and avoid pitfalls. </p>
<p>Finally, fostering a culture of critical thinking towards spreadsheets is vital. This would mean encouraging users to continually question calculations, validate their data sources and double-check their work.</p><img src="https://counter.theconversation.com/content/219356/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Simon Thorne is affiliated with The European Spreadsheets Risks Interest Group</span></em></p>Spreadsheet-related errors can have serious consequences in the private and public sector. But what can we do to overcome them?Simon Thorne, Senior Lecturer in Computing and Information Systems, Cardiff Metropolitan UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2182012024-01-19T13:41:44Z2024-01-19T13:41:44ZI’m an artist using scientific data as an artistic medium − here’s how I make meaning<figure><img src="https://images.theconversation.com/files/569152/original/file-20240112-27-8u7iv7.jpeg?ixlib=rb-1.1.0&rect=2%2C0%2C1393%2C932&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Sarah Nance at the Bonneville Salt Flats, Utah, 2019.</span> <span class="attribution"><span class="source">Courtesy of Sarah Nance</span></span></figcaption></figure><p>As an <a href="https://www.binghamton.edu/art/profile.html?id=snance">artist working across media</a>, I’ve used everything from thread to my voice to poetically translate and express information. Recently, I’ve been working with another medium – geologic datasets. </p>
<p>While scientists use data visualization to show the results of a dataset in interesting and informative ways, my goal as an artist is a little different. In the studio, I treat geologic data as another material, using it to guide my interactions with Mylar film, knitting patterns or opera. Data, in my work, functions expressively and abstractly. </p>
<p>Two of my projects in particular, “points of rupture” and “tidal arias,” exemplify this way of working. In these pieces, my goal is to offer new ways for people to personally relate to the immense scale of geologic time.</p>
<h2>Points of rupture</h2>
<p>An early project in which I treated data as a medium was my letterpress print series “<a href="https://www.sarahnance.com/shroud/alaska">points of rupture</a>.” In this series, I encoded data from <a href="https://www.britannica.com/science/cryoseism">cryoseismic, or ice quake</a>, events to create knitting patterns. </p>
<p>Working with ice quake data was a continuation of my research into what I call “archived landscapes.” These are places that have had multiple distinct geologic identities over time, like <a href="https://www.nps.gov/gumo/learn/nature/coralreefs.htm">mountains that were once sea reefs</a>.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/569121/original/file-20240112-17-umjli0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="silver knitting symbols on black background" src="https://images.theconversation.com/files/569121/original/file-20240112-17-umjli0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/569121/original/file-20240112-17-umjli0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=600&fit=crop&dpr=1 600w, https://images.theconversation.com/files/569121/original/file-20240112-17-umjli0.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=600&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/569121/original/file-20240112-17-umjli0.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=600&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/569121/original/file-20240112-17-umjli0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=754&fit=crop&dpr=1 754w, https://images.theconversation.com/files/569121/original/file-20240112-17-umjli0.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=754&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/569121/original/file-20240112-17-umjli0.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=754&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">‘points of rupture (alaska glacial event 1999),’ 2020. Letterpress print of knitting pattern coded using cryoseismic data. Edition of 15. 18 x 18 in.</span>
<span class="attribution"><span class="source">Sarah Nance</span></span>
</figcaption>
</figure>
<p>Because knit textiles are made up of many individual stitches, I can use them to encode discrete data points. In a knitting pattern, or chart, each kind of stitch is represented by a specific symbol. I used the open-source program <a href="https://stitch-maps.com">Stitch Maps</a> to write the patterns for this project, translating the peaks and valleys of seismographs into individual stitch symbols. </p>
<p>Knitting charts typically display these symbols in a grid. Instead, Stitch Maps allows them to fall as they would when knitted, so the chart mimics the shape of the final textile. </p>
<p>I was drawn to the expressive possibilities of this feature and how the software allowed me to experiment. I was able to write patterns that worked only in theory and not as physical, handmade structures. This gave me more freedom to design patterns that fully expressed the datasets without having to ensure their viability as textiles.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/568495/original/file-20240109-29-ojgmd6.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="graphite drawing of mitten knitting chart on gallery wall" src="https://images.theconversation.com/files/568495/original/file-20240109-29-ojgmd6.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/568495/original/file-20240109-29-ojgmd6.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/568495/original/file-20240109-29-ojgmd6.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/568495/original/file-20240109-29-ojgmd6.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/568495/original/file-20240109-29-ojgmd6.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/568495/original/file-20240109-29-ojgmd6.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/568495/original/file-20240109-29-ojgmd6.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">‘and when you change the landscape, is it with bare hands or with gloves? (lichen, woodwork, grate),’ 2023. Graphite drawing of selbu mitten knitting chart. 99 x 67 linear inches as installed.</span>
<span class="attribution"><span class="source">Sarah Nance</span></span>
</figcaption>
</figure>
<p><a href="https://nsidc.org/learn/parts-cryosphere/glaciers">Glaciers form</a> incrementally as new snowfall compacts previous layers of snow, crystallizing them into ice. A knitted fabric similarly accumulates in layers, as rows of interlocking loops. Each structure appears stable but could easily be dissolved.</p>
<p>Ice quakes occur in glaciers as a result of <a href="https://www.britannica.com/science/cryoseism">calving events or pooling meltwater</a>. Like melting glaciers, knitting is always in danger of coming apart – but instead of melting, by snagging and unraveling into formlessness. These structural similarities between glaciers and knitting are reflected in the “points of rupture” prints, where disruptive ice quakes translate into unknittable patterns. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/569080/original/file-20240112-19-758bfo.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="silver knitting symbols on black background" src="https://images.theconversation.com/files/569080/original/file-20240112-19-758bfo.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/569080/original/file-20240112-19-758bfo.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=600&fit=crop&dpr=1 600w, https://images.theconversation.com/files/569080/original/file-20240112-19-758bfo.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=600&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/569080/original/file-20240112-19-758bfo.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=600&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/569080/original/file-20240112-19-758bfo.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=754&fit=crop&dpr=1 754w, https://images.theconversation.com/files/569080/original/file-20240112-19-758bfo.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=754&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/569080/original/file-20240112-19-758bfo.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=754&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">‘points of rupture (glacier de la plaine morte icequake 2016),’ 2020. Letterpress print of knitting pattern coded using cryoseismic data. Edition of 15. 18 x 18 in.</span>
<span class="attribution"><span class="source">Sarah Nance</span></span>
</figcaption>
</figure>
<h2>The loop</h2>
<p>Repeated, interlocking loops are the base units that compose the structure of a knitted textile. The loop also forms the seed of an in-progress work I pursued during an artist residency with the <a href="https://lunarscience.nasa.gov/sserviteams">NASA</a> <a href="https://www.geodes.umd.edu">GEODES</a> research group. I joined their research team in Flagstaff, Arizona, in August 2023. I assisted in gathering data from sites within the San Francisco volcanic field, while also conducting my own fieldwork: photography, drawing, note-taking and walking.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/568498/original/file-20240109-21-we196t.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="A digital map showing a crater, with a green circle indicating the path walked, around the lip of the crater." src="https://images.theconversation.com/files/568498/original/file-20240109-21-we196t.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/568498/original/file-20240109-21-we196t.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=629&fit=crop&dpr=1 600w, https://images.theconversation.com/files/568498/original/file-20240109-21-we196t.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=629&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/568498/original/file-20240109-21-we196t.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=629&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/568498/original/file-20240109-21-we196t.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=790&fit=crop&dpr=1 754w, https://images.theconversation.com/files/568498/original/file-20240109-21-we196t.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=790&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/568498/original/file-20240109-21-we196t.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=790&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Sarah Nance’s walk at S P Crater in Arizona, as recorded in AllTrails.</span>
<span class="attribution"><span class="source">Screenshot of All Trails map</span></span>
</figcaption>
</figure>
<p>One of my walks was a trek around a particularly prominent geologic loop – the rim of the S P cinder cone volcano. This is the second crater walk I’ve completed, the first being a tracing of the subsurface rim of the <a href="https://insider.si.edu/2013/03/iowa-meteorite-crater-confirmed/">Decorah impact structure</a> in Iowa. </p>
<p>I see my paths through these landscapes as stand-ins for yarn. Over time, by taking walks that trace craters, or geologic loops, I will perform a textile. The performance of something as familiar as a textile offers me a new way to think about something that is much more difficult to comprehend – <a href="https://www.britannica.com/science/geologic-time">geologic time</a>. </p>
<hr>
<figure class="align-right zoomable">
<a href="https://images.theconversation.com/files/567788/original/file-20240103-23-yg479z.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="A square box with the words 'Art & Science Collide' and a drawing of a lightbulb with its wire filament in the shape of a brain, surrounded by a circle." src="https://images.theconversation.com/files/567788/original/file-20240103-23-yg479z.png?ixlib=rb-1.1.0&q=45&auto=format&w=237&fit=clip" srcset="https://images.theconversation.com/files/567788/original/file-20240103-23-yg479z.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=600&fit=crop&dpr=1 600w, https://images.theconversation.com/files/567788/original/file-20240103-23-yg479z.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=600&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/567788/original/file-20240103-23-yg479z.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=600&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/567788/original/file-20240103-23-yg479z.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=754&fit=crop&dpr=1 754w, https://images.theconversation.com/files/567788/original/file-20240103-23-yg479z.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=754&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/567788/original/file-20240103-23-yg479z.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=754&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Art & Science Collide series.</span>
<span class="attribution"><span class="source">source</span></span>
</figcaption>
</figure>
<p><em><strong><a href="https://theconversation.com/us/topics/art-in-science-series-2024-149583">This article is part of Art & Science Collide</a></strong>, a series examining the intersections between art and science.</em></p>
<p><em>You may be interested in:</em></p>
<p><a href="https://theconversation.com/literature-inspired-my-medical-career-why-the-humanities-are-needed-in-health-care-217357">Literature inspired my medical career: Why the humanities are needed in health care</a></p>
<p><a href="https://theconversation.com/i-wrote-a-play-for-children-about-integrating-the-arts-into-stem-fields-heres-what-i-learned-about-encouraging-creative-interdisciplinary-thinking-218001">I wrote a play for children about integrating the arts into STEM fields – here’s what I learned about interdisciplinary thinking</a> </p>
<p><a href="https://theconversation.com/art-and-science-entwined-this-course-explores-the-long-interrelated-history-of-two-ways-of-seeing-the-world-210250">Art and science entwined: This course explores the long, interrelated history of two ways of seeing the world </a></p>
<hr>
<h2>Performance and tides</h2>
<p>Performance has been a useful tool in my work, as it can help people understand and relate to geologic processes.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/569102/original/file-20240112-21-spkjsd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="artist's hands holding small chunk of glacial ice" src="https://images.theconversation.com/files/569102/original/file-20240112-21-spkjsd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/569102/original/file-20240112-21-spkjsd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=399&fit=crop&dpr=1 600w, https://images.theconversation.com/files/569102/original/file-20240112-21-spkjsd.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=399&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/569102/original/file-20240112-21-spkjsd.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=399&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/569102/original/file-20240112-21-spkjsd.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=502&fit=crop&dpr=1 754w, https://images.theconversation.com/files/569102/original/file-20240112-21-spkjsd.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=502&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/569102/original/file-20240112-21-spkjsd.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=502&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">‘transference,’ 2017. Atlantic sea ice, body heat. Documentation of site-responsive performance on the East Coast Trail, Newfoundland, Canada. Project supported in part by La Soupée, Galerie Diagonale, Montréal, Québec.</span>
<span class="attribution"><span class="source">Sarah Nance</span></span>
</figcaption>
</figure>
<p>The field of geology emerges from a <a href="https://www.upress.umn.edu/book-division/books/a-billion-black-anthropocenes-or-none">long history</a> of extraction and <a href="https://www.dukeupress.edu/geontologies">colonialist ventures</a>. In this context, land is valued for its economic importance – as raw material to be extracted or territory to be claimed. In my performances, I aim to interact with geology as its own active entity, rather than as a consumable resource. </p>
<p>In recent years, I have composed and performed two arias from tidal data. </p>
<p>The first, “<a href="https://www.sarahnance.com/marseille">marseille tidal gauge aria</a>,” sourced 130 years of sea level data collected from a tidal gauge in the Bay of Marseille, France. I converted each yearly average sea level into an individual note within my vocal range. This resulted in a composition that expresses the rising sea levels of the bay as increasingly higher pitches in the aria. </p>
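<p>For readers curious how a conversion of this kind might work in principle, here is a minimal sketch. It is not the method used for the aria: the sea level values are invented, the function name <code>sea_levels_to_midi</code> is hypothetical, and the vocal range (MIDI notes 57 to 81, roughly A3 to A5) is an assumption. Each yearly average is placed on a straight line between the lowest and highest values, so a higher sea level becomes a higher pitch.</p>

```python
def sea_levels_to_midi(levels_mm, low_note=57, high_note=81):
    """Linearly map yearly averages onto MIDI note numbers in [low_note, high_note]."""
    lo, hi = min(levels_mm), max(levels_mm)
    if hi == lo:
        # A flat record has no contour, so every year gets the lowest note.
        return [low_note] * len(levels_mm)
    span = high_note - low_note
    return [low_note + round((v - lo) / (hi - lo) * span) for v in levels_mm]


# Invented yearly averages in millimetres, not the Marseille record.
levels = [6900, 6920, 6950, 7000, 7100]

notes = sea_levels_to_midi(levels)
print(notes)  # [57, 59, 63, 69, 81]
```

<p>A linear mapping keeps the shape of the data audible, although a composer might well prefer to snap the notes to a particular scale or adjust them by ear.</p>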
<p>Its lyrics come from a somber poem in Rasu-Yong Tugen’s book “<a href="https://gnomebooks.wordpress.com/2014/02/10/songs-from-the-black-moon/">Songs From the Black Moon</a>.” Each note of the aria communicates not just the measured sea level but also my emotive response to this dataset. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/569098/original/file-20240112-23-ffk4lg.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="Black flexi disc with gold text and image" src="https://images.theconversation.com/files/569098/original/file-20240112-23-ffk4lg.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/569098/original/file-20240112-23-ffk4lg.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/569098/original/file-20240112-23-ffk4lg.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/569098/original/file-20240112-23-ffk4lg.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/569098/original/file-20240112-23-ffk4lg.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/569098/original/file-20240112-23-ffk4lg.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/569098/original/file-20240112-23-ffk4lg.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">‘tidal arias,’ 2022. Limited edition flexi disc with vocal performances ‘marseille tidal gauge aria’ and ‘skagway tidal aria.’</span>
<span class="attribution"><span class="source">Sarah Nance</span></span>
</figcaption>
</figure>
<p>Last fall, “marseille tidal gauge aria” was transmitted <a href="https://www.swpc.noaa.gov/phenomena/ionosphere">to the ionosphere</a>, the boundary between Earth’s atmosphere and outer space. This was done as part of artist Amanda Dawn Christie’s project “<a href="https://ghostsintheairglow.space/transmission/august-2023">Ghosts in the Air Glow</a>,” using the <a href="https://haarp.gi.alaska.edu">High-frequency Active Auroral Research Program</a>’s ionospheric research instrument, which is an array of 180 antennas transmitting high-frequency radio waves. </p>
<p>The aria’s transmission reflected off the ionosphere, back to Earth and to shortwave radio listeners around the world.</p>
<p>For the second of these vocal pieces, “skagway tidal aria,” I used predictive as well as recorded tidal data from Skagway, Alaska. With this data, I composed an aria for <a href="https://t2051mcc.com">The 2051 Munich Climate Conference</a>, where speakers presented from the perspective of a climate-altered world 30 years in the future. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/569106/original/file-20240112-25-4mocnl.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="vocal music score" src="https://images.theconversation.com/files/569106/original/file-20240112-25-4mocnl.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/569106/original/file-20240112-25-4mocnl.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=388&fit=crop&dpr=1 600w, https://images.theconversation.com/files/569106/original/file-20240112-25-4mocnl.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=388&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/569106/original/file-20240112-25-4mocnl.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=388&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/569106/original/file-20240112-25-4mocnl.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=488&fit=crop&dpr=1 754w, https://images.theconversation.com/files/569106/original/file-20240112-25-4mocnl.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=488&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/569106/original/file-20240112-25-4mocnl.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=488&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Score for ‘skagway tidal aria,’ 2021. Recorded and speculative tidal data from Skagway, Alaska (1945-2081), sonified as a vocal composition. Text from ‘Songs From the Black Moon’ by Rasu-Yong Tugen.</span>
<span class="attribution"><span class="source">Sarah Nance</span></span>
</figcaption>
</figure>
<p>I was drawn to this particular dataset because the falling tide levels in Skagway appear to contradict the <a href="https://theconversation.com/what-drives-sea-level-rise-us-report-warns-of-1-foot-rise-within-three-decades-and-more-frequent-flooding-177211">global trend of rising sea levels</a>. However, this is a temporary effect caused by melting glaciers releasing pressure on the land, allowing it to rise faster than water levels. The effect will flatten over the next half-century, and Skagway’s tides will start to rise again.</p>
<p>Over the next few months, I’ll be working with geophysical datasets gathered during the NASA GEODES field expedition to write new arias. I want these pieces to continue blurring the separation between the human and the geologic, inviting listeners to think more deeply about their own relationships with the lands they use and occupy.</p><img src="https://counter.theconversation.com/content/218201/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>The author's projects with GEODES and Ghosts in the Air Glow were supported with funding from these organizations.</span></em></p>Sarah Nance uses geologic data and a variety of artistic media to help people think about their place in the landscapes they use and occupy.Sarah Nance, Assistant Professor of Integrated Practice in Art and Design, Binghamton University, State University of New YorkLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2145762024-01-07T19:04:19Z2024-01-07T19:04:19ZHere’s why you should (almost) never use a pie chart for your data<figure><img src="https://images.theconversation.com/files/558554/original/file-20231109-25-j7ehuz.jpg?ixlib=rb-1.1.0&rect=810%2C436%2C4761%2C3377&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">
</span> <span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/lemon-pie-flat-lay-on-blue-1663719415">YesPhotographers/Shutterstock</a></span></figcaption></figure><p>Our lives are becoming increasingly data driven. Our phones monitor our time and internet usage and online surveys discern our opinions and likes. These data harvests are used for telling us how well we’ve slept or what we might like to buy. </p>
<p>Numbers are becoming more important for everyday life, yet people’s numerical skills are falling behind. For example, the percentage of Year 12 schoolchildren in Australia taking higher and intermediate mathematics <a href="https://amsi.org.au/?publications=year-12-participation-in-calculus-based-mathematics-subjects-takes-a-dive-2">has been declining for decades</a>. </p>
<p>To help the average person understand big data and numbers, we often use visual summaries, such as pie charts. But while non-numerate folk will avoid numbers, most numerate folk will avoid pie charts. Here’s why.</p>
<h2>What is a pie chart?</h2>
<p>A pie chart is a circular diagram that represents numerical percentages. The circle is divided into slices, with the size of each slice proportional to the category it represents. It is so named because it resembles a sliced pie and can be “served” in many different ways. </p>
<p>An example pie chart below shows Australia’s two-party preferred vote before the last election, with Labor on 55% and the Coalition on 45%. The two near semi-circles show the relatively tight race – this is a useful example of a pie chart. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/560670/original/file-20231121-23-sgp640.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="55% for labor, 45% for coalition on a red and blue pie chart" src="https://images.theconversation.com/files/560670/original/file-20231121-23-sgp640.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/560670/original/file-20231121-23-sgp640.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=360&fit=crop&dpr=1 600w, https://images.theconversation.com/files/560670/original/file-20231121-23-sgp640.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=360&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/560670/original/file-20231121-23-sgp640.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=360&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/560670/original/file-20231121-23-sgp640.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=452&fit=crop&dpr=1 754w, https://images.theconversation.com/files/560670/original/file-20231121-23-sgp640.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=452&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/560670/original/file-20231121-23-sgp640.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=452&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">A simple pie chart showing the percentages for the two major Australian parties in an opinion poll.</span>
<span class="attribution"><span class="source">Victor Oguoma</span></span>
</figcaption>
</figure>
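<p>To make “proportional” concrete, here is a minimal Python sketch that converts each category’s share into the angle of its slice. It is an illustration only, not taken from any charting tool: the function name <code>slice_angles</code> is an assumption, and the 55/45 figures are the poll numbers from the chart above.</p>

```python
def slice_angles(shares):
    """Map each category to the angle (in degrees) of its pie slice."""
    total = sum(shares.values())
    # A full circle is 360 degrees, split in proportion to each share.
    return {name: 360.0 * value / total for name, value in shares.items()}


# Two-party preferred poll figures quoted in the article.
poll = {"Labor": 55, "Coalition": 45}
angles = slice_angles(poll)

for party, angle in angles.items():
    print(f"{party}: {angle:.1f} degrees")
```

<p>A ten-point gap in the poll becomes a 36-degree difference in angle, which is easy to see with two slices but much harder to judge once there are many.</p>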
<h2>What’s wrong with pie charts?</h2>
<p>Once we have more than two categories, pie charts can easily misrepresent percentages and become hard to read.</p>
<p>The three charts below are a good example – it is very hard to work out which of the five areas is the largest. The pie chart’s circularity means the areas lack a common reference point. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/556782/original/file-20231031-27-3dz8ta.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/556782/original/file-20231031-27-3dz8ta.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/556782/original/file-20231031-27-3dz8ta.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=208&fit=crop&dpr=1 600w, https://images.theconversation.com/files/556782/original/file-20231031-27-3dz8ta.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=208&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/556782/original/file-20231031-27-3dz8ta.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=208&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/556782/original/file-20231031-27-3dz8ta.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=261&fit=crop&dpr=1 754w, https://images.theconversation.com/files/556782/original/file-20231031-27-3dz8ta.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=261&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/556782/original/file-20231031-27-3dz8ta.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=261&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Three example pie charts, each with five similar categories. Can you quickly tell which colour is the largest in each pie?</span>
<span class="attribution"><a class="source" href="https://commons.wikimedia.org/wiki/File:Piecharts.svg">Schutz/Wikimedia Commons</a>, <a class="license" href="http://creativecommons.org/licenses/by/4.0/">CC BY</a></span>
</figcaption>
</figure>
<p>Pie charts also do badly when there are lots of categories. For example, this chart from a study on data sources used for COVID data visualisation shows hundreds of categories in one pie. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/556770/original/file-20231031-19-uurqzu.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/556770/original/file-20231031-19-uurqzu.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/556770/original/file-20231031-19-uurqzu.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=340&fit=crop&dpr=1 600w, https://images.theconversation.com/files/556770/original/file-20231031-19-uurqzu.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=340&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/556770/original/file-20231031-19-uurqzu.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=340&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/556770/original/file-20231031-19-uurqzu.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=428&fit=crop&dpr=1 754w, https://images.theconversation.com/files/556770/original/file-20231031-19-uurqzu.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=428&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/556770/original/file-20231031-19-uurqzu.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=428&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">A pie chart with dozens of categories. Not every category has a label, and it’s not clear how many categories there are or what the unlabelled slices refer to.</span>
<span class="attribution"><a class="source" href="https://doi.org/10.3390/informatics7030035">Trajkova et al., Informatics (2020)</a>, <a class="license" href="http://creativecommons.org/licenses/by/4.0/">CC BY</a></span>
</figcaption>
</figure>
<p>The tiny slices, lack of clear labelling and the kaleidoscope of colours make interpretation difficult for anyone.</p>
<p>It’s even harder for a colour blind person. For example, this is a simulation of what the above chart would look like to a person with deuteranomaly, or reduced sensitivity to green light. This is the most common type of colour blindness, affecting roughly <a href="https://wearecolorblind.com/articles/a-quick-introduction-to-color-blindness/">4.6% of the population</a>. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/558560/original/file-20231109-27-4714o0.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/558560/original/file-20231109-27-4714o0.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/558560/original/file-20231109-27-4714o0.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=449&fit=crop&dpr=1 600w, https://images.theconversation.com/files/558560/original/file-20231109-27-4714o0.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=449&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/558560/original/file-20231109-27-4714o0.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=449&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/558560/original/file-20231109-27-4714o0.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=564&fit=crop&dpr=1 754w, https://images.theconversation.com/files/558560/original/file-20231109-27-4714o0.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=564&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/558560/original/file-20231109-27-4714o0.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=564&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">The same data chart as above, but run through a simulation filter to demonstrate what it would look like for someone with a common type of colour blindness.</span>
<span class="attribution"><a class="source" href="https://doi.org/10.3390/informatics7030035">Trajkova et al., Informatics (2020); modified.</a>, <a class="license" href="http://creativecommons.org/licenses/by/4.0/">CC BY</a></span>
</figcaption>
</figure>
<p>It can get even worse if we take pie charts and make them three-dimensional. This can lead to egregious misrepresentations of data.</p>
<p>Below, the yellow, red and green areas are all the same size (one-third), but appear to be different based on the angle and which slice is placed at the bottom of the pie.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/556772/original/file-20231031-25-bdpq56.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/556772/original/file-20231031-25-bdpq56.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/556772/original/file-20231031-25-bdpq56.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=196&fit=crop&dpr=1 600w, https://images.theconversation.com/files/556772/original/file-20231031-25-bdpq56.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=196&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/556772/original/file-20231031-25-bdpq56.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=196&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/556772/original/file-20231031-25-bdpq56.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=246&fit=crop&dpr=1 754w, https://images.theconversation.com/files/556772/original/file-20231031-25-bdpq56.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=246&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/556772/original/file-20231031-25-bdpq56.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=246&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">A standard two-dimensional pie chart and two three-dimensional pie charts. In every chart the proportions are one-third but there appear to be differences between states in the three-dimensional versions.</span>
<span class="attribution"><span class="source">Victor Oguoma</span>, <a class="license" href="http://creativecommons.org/licenses/by-nd/4.0/">CC BY-ND</a></span>
</figcaption>
</figure>
<h2>So why are pie charts everywhere?</h2>
<p>Despite the well-known problems with pie charts, they are everywhere. They are in journal articles, PhD theses, political polling, books, newspapers and government reports. They’ve even been used by the Australian Bureau of Statistics.</p>
<p>While statisticians have criticised them for decades, it’s hard to argue with this logic: “if pie charts are so bad, why are there so many of them?”</p>
<p>Possibly they are popular because they are popular, which is a circular argument that suits a pie chart.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/556781/original/file-20231031-17-hfvpgr.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/556781/original/file-20231031-17-hfvpgr.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/556781/original/file-20231031-17-hfvpgr.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=395&fit=crop&dpr=1 600w, https://images.theconversation.com/files/556781/original/file-20231031-17-hfvpgr.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=395&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/556781/original/file-20231031-17-hfvpgr.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=395&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/556781/original/file-20231031-17-hfvpgr.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=496&fit=crop&dpr=1 754w, https://images.theconversation.com/files/556781/original/file-20231031-17-hfvpgr.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=496&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/556781/original/file-20231031-17-hfvpgr.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=496&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">A collection of terrible pie charts gathered from various open access sources, including ‘exploded’ pie charts and 3D pie charts.</span>
<span class="attribution"><span class="source">Adrian Barnett and Victor Oguoma</span>, <a class="license" href="http://creativecommons.org/licenses/by-nd/4.0/">CC BY-ND</a></span>
</figcaption>
</figure>
<h2>What’s a good alternative to pie charts?</h2>
<p>There’s a simple fix that can effectively summarise big data in a small space and still allow creative colour schemes. </p>
<p>It’s the humble bar chart. Remember the brain-aching pie chart example above with the five categories? Here’s the same example using bars – we can now instantly see which category is the largest.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/556773/original/file-20231031-25-9vdsm4.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/556773/original/file-20231031-25-9vdsm4.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/556773/original/file-20231031-25-9vdsm4.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=430&fit=crop&dpr=1 600w, https://images.theconversation.com/files/556773/original/file-20231031-25-9vdsm4.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=430&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/556773/original/file-20231031-25-9vdsm4.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=430&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/556773/original/file-20231031-25-9vdsm4.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=541&fit=crop&dpr=1 754w, https://images.theconversation.com/files/556773/original/file-20231031-25-9vdsm4.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=541&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/556773/original/file-20231031-25-9vdsm4.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=541&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Three pie charts, each with five similar categories, and the same data presented using bar charts.</span>
<span class="attribution"><a class="source" href="https://commons.wikimedia.org/wiki/File:Piecharts.svg">Schutz/Wikimedia Commons</a>, <a class="license" href="http://creativecommons.org/licenses/by/4.0/">CC BY</a></span>
</figcaption>
</figure>
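<p>The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code behind the figure: the five values are hypothetical stand-ins for five similar categories, and <code>text_bar_chart</code> is a name invented here. Every bar starts from the same left edge, so lengths can be compared directly.</p>

```python
def text_bar_chart(values, width=40):
    """Return one text line per category, longest bar first.

    All bars share a common left baseline, and the largest value
    is drawn at the full width.
    """
    largest = max(values.values())
    label_width = max(len(name) for name in values)
    lines = []
    ranked = sorted(values.items(), key=lambda item: item[1], reverse=True)
    for name, value in ranked:
        bar = "#" * round(width * value / largest)
        lines.append(f"{name:<{label_width}} | {bar} {value}")
    return lines


# Hypothetical percentages for five similar categories (not the figure's data).
shares = {"A": 17, "B": 18, "C": 20, "D": 22, "E": 23}
chart = text_bar_chart(shares)
print("\n".join(chart))
```

<p>Sorting the bars from longest to shortest is a design choice in this sketch; it makes the ranking readable at a glance, which is the comparison the pie charts made difficult.</p>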
<p>Linear bars are easier on the eye than the non-linear segments of a pie chart. But beware the temptation to make a humble bar chart look more interesting by adding a 3D effect. As you already saw, 3D charts distort perception and make it harder to find a reference point.</p>
<p>Below is a standard bar chart and a 3D alternative showing the number of voters in the 1992 US presidential election split by family income (from under US$15k to over US$75k). Using the 3D version, can you tell the number of voters for each candidate in the highest income category? Not easily. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/556775/original/file-20231031-17-dscfue.png?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/556775/original/file-20231031-17-dscfue.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/556775/original/file-20231031-17-dscfue.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=157&fit=crop&dpr=1 600w, https://images.theconversation.com/files/556775/original/file-20231031-17-dscfue.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=157&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/556775/original/file-20231031-17-dscfue.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=157&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/556775/original/file-20231031-17-dscfue.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=197&fit=crop&dpr=1 754w, https://images.theconversation.com/files/556775/original/file-20231031-17-dscfue.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=197&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/556775/original/file-20231031-17-dscfue.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=197&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">The same voter data presented as a standard two-dimensional bar chart and an unhelpful three-dimensional version.</span>
<span class="attribution"><span class="source">Victor Oguoma</span>, <a class="license" href="http://creativecommons.org/licenses/by-nd/4.0/">CC BY-ND</a></span>
</figcaption>
</figure>
<h2>Is it ever okay to use a pie chart?</h2>
<p>We’ve shown some of the worst examples of pie charts to make a point. Pie charts can be okay when there are just a few categories and the percentages are dissimilar, for example with one large and one small category.</p>
<p>Overall, it is best to use pie charts sparingly, especially when there is a more “digestible” alternative – the bar chart.</p>
<p>Whenever we see pie charts, we think one of two things: their creators don’t know what they’re doing, or they know what they are doing and are deliberately trying to mislead.</p>
<p>A graphical summary aims to easily and quickly communicate the data. If you feel the need to spruce it up, you’re likely reducing understanding without meaning to do so.</p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/3-questions-to-ask-yourself-next-time-you-see-a-graph-chart-or-map-141348">3 questions to ask yourself next time you see a graph, chart or map</a>
</strong>
</em>
</p>
<hr>
<p class="fine-print"><em><span>Adrian Barnett is a member of the Statistical Society of Australia.</span></em></p><p class="fine-print"><em><span>Victor Oguoma is a member of the Statistical Society of Australia.</span></em></p>They are popular because they are popular, which is a circular argument that suits a pie chart. But there are some serious downsides to using the humble pie.Adrian Barnett, Professor of Statistics, Queensland University of TechnologyVictor Oguoma, Senior Research Fellow, Poche Centre for Indigenous Health, The University of QueenslandLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2174222023-12-05T17:50:44Z2023-12-05T17:50:44ZWant to know if your data are managed responsibly? Here are 15 questions to help you find out<figure><img src="https://images.theconversation.com/files/563436/original/file-20231204-21-5svi2j.jpg?ixlib=rb-1.1.0&rect=0%2C0%2C5990%2C3506&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Organizations that gather information should establish a framework for responsibly managing user data.</span> <span class="attribution"><span class="source">(Shutterstock)</span></span></figcaption></figure><iframe style="width: 100%; height: 100px; border: none; position: relative; z-index: 1;" allowtransparency="" allow="clipboard-read; clipboard-write" src="https://narrations.ad-auris.com/widget/the-conversation-canada/want-to-know-if-your-data-are-managed-responsibly-here-are-15-questions-to-help-you-find-out" width="100%" height="400"></iframe>
<p>As the volume and variety of data about people increases, so does the number of ideas about how data might be used. Studies show that many <a href="https://doi.org/10.1186/s12910-016-0153-x">people want their data</a> to be used for <a href="https://doi.org/10.1787/276aaca8-en">public benefit</a>. </p>
<p>However, the research also shows that public support for use of data is conditional, and only given when risks such as those related to <a href="https://www.pewresearch.org/internet/2019/11/15/americans-and-privacy-concerned-confused-and-feeling-lack-of-control-over-their-personal-information/">privacy</a>, <a href="https://wellcome.figshare.com/articles/journal_contribution/The_One-Way_Mirror_Public_attitudes_to_commercial_access_to_health_data/5616448">commercial exploitation</a> and <a href="https://www.jmir.org/2021/8/e26162/">artificial intelligence misuse</a> are addressed. </p>
<p>It takes a lot of work for organizations to establish data governance and management practices that mitigate risks while also encouraging beneficial uses of data. So much so that it can be challenging for responsible organizations to communicate their data trustworthiness without providing an overwhelming amount of technical and legal details.</p>
<p>To address this challenge, our team undertook a multiyear project to identify, refine and publish a short list of <a href="https://doi.org/10.23889/ijpds.v8i4.2142">essential requirements for responsible data stewardship</a>.</p>
<p>Our 15 minimum specification requirements (min specs) are based on a review of the scientific literature and the practices of 23 different data-focused organizations and initiatives. </p>
<p>As part of our project, we compiled over 70 public resources, including examples of organizations that address the full list of min specs: <a href="https://www.ices.on.ca/data-repository-requirements/">ICES</a>, the <a href="https://static1.squarespace.com/static/5d8b7b3eabff3c4f1954d802/t/63c9b2638614cc5609a3a0d3/1674163135114/hdc-minspecs.">Hartford Data Collaborative</a> and the <a href="https://www.unb.ca/nbirdt/data/privacy/index.html">New Brunswick Institute for Research, Data and Training</a>.</p>
<p>Our hope is that information related to the min specs will help organizations and data-sharing initiatives share best practices and learn from each other to improve their governance and management of data.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/563439/original/file-20231204-23-rmsqh4.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="a woman sitting on a sofa on a laptop" src="https://images.theconversation.com/files/563439/original/file-20231204-23-rmsqh4.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/563439/original/file-20231204-23-rmsqh4.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=393&fit=crop&dpr=1 600w, https://images.theconversation.com/files/563439/original/file-20231204-23-rmsqh4.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=393&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/563439/original/file-20231204-23-rmsqh4.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=393&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/563439/original/file-20231204-23-rmsqh4.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=494&fit=crop&dpr=1 754w, https://images.theconversation.com/files/563439/original/file-20231204-23-rmsqh4.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=494&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/563439/original/file-20231204-23-rmsqh4.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=494&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">People want to know that organizations can responsibly gather and manage data.</span>
<span class="attribution"><span class="source">(Shutterstock)</span></span>
</figcaption>
</figure>
<h2>Minimum specification requirements</h2>
<p>We also think the min specs can help people know what to expect of responsible data stewards. To support people in using the min specs, we translated them into plain language questions that individuals can pose to the organizations that collect, use or share their data:</p>
<p><strong>Legal</strong></p>
<p>1) What laws, consent forms or other documents give you the authority to collect, use or share data?</p>
<p><strong>Governance</strong></p>
<p>2) Where do you publicly state the purpose behind your data-focused activities?</p>
<p>3) Which committee or group is accountable for important decisions such as who can use data and how they can use it?</p>
<p>4) How do you achieve transparency about your data holdings, data access policies and other information that people want to know about their data?</p>
<p>5) How do you acknowledge and respect <a href="https://www.stateofopendata.od4d.net/chapters/issues/indigenous-data.html">Indigenous Data Sovereignty</a>? </p>
<p>6) What measures are in place to ensure you adapt and respond to new threats and opportunities?</p>
<p><strong>Management</strong></p>
<p>7) What policies, processes and procedures do you have to cover the entire data life cycle from collection through to use, sharing and destruction?</p>
<p>8) How do you address cybersecurity and data protection?</p>
<p>9) How do you identify and manage risks related to data?</p>
<p>10) What data documentation do you have to help people understand the data you hold?</p>
<p><strong>Data users</strong></p>
<p>11) Is there mandatory privacy and security training that data users must complete?</p>
<p>12) What are the consequences if data users do things they are not allowed to do with data?</p>
<p><strong>Stakeholder and public engagement</strong></p>
<p>13) How do you engage with stakeholders such as the organizations that provide you with data and the organizations that use the knowledge you generate?</p>
<p>14) How can members of the public be informed and get involved in the decisions you make about data?</p>
<p>15) What special measures do you have to engage and involve groups who have a special interest in your activities or decisions?</p>
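<p>For readers who want to keep track of the answers they receive, the grouping above can be sketched as a small Python checklist. This is a hypothetical illustration rather than a tool from the min specs project: the names <code>MIN_SPEC_QUESTIONS</code> and <code>unanswered</code>, and the example set of answered questions, are assumptions made here.</p>

```python
# Question numbers grouped by the five headings used in the list above.
MIN_SPEC_QUESTIONS = {
    "Legal": [1],
    "Governance": [2, 3, 4, 5, 6],
    "Management": [7, 8, 9, 10],
    "Data users": [11, 12],
    "Stakeholder and public engagement": [13, 14, 15],
}


def unanswered(answered):
    """Return, per category, the question numbers not yet answered."""
    gaps = {}
    for category, numbers in MIN_SPEC_QUESTIONS.items():
        missing = [n for n in numbers if n not in answered]
        if missing:
            gaps[category] = missing
    return gaps


# Hypothetical example: questions an organization has answered so far.
example_answers = {1, 2, 3, 7, 8, 11, 13}
gaps = unanswered(example_answers)

for category, numbers in gaps.items():
    print(f"{category}: still waiting on questions {numbers}")
```

<p>Categories with no outstanding questions are left out of the result, so an empty result would mean all 15 questions have been answered.</p>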
<h2>Transparent and trustworthy</h2>
<p>These min spec questions can serve as a framework to improve data governance and management practices.</p>
<p>It is our hope that the more that members of the public request this kind of information, the more that organizations will proactively make it available or adapt their practices.</p>
<p>In this way, the min specs can help increase the transparency and trustworthiness of data holding organizations, which can, in turn, lead to more support for data being shared and used for public benefit.</p>
<p class="fine-print"><em><span>P. Alison Paprica has received funding from the Canadian Institutes of Health Research and other national and provincial research funders in Canada. </span></em></p><p class="fine-print"><em><span>Amy Hawn Nelson receives funding from Robert Wood Johnson Foundation, Annie E. Casey Foundation, the Ford Foundation and the Walton Family Foundation. </span></em></p><p class="fine-print"><em><span>Donna Curtis Maillet receives funding from the Canadian Institutes of Health Research and other national and provincial research funders in Canada. </span></em></p><p class="fine-print"><em><span>Kimberlyn McGrail receives funding from the Canadian Institutes of Health Research and other national and provincial research funders in Canada.</span></em></p><p class="fine-print"><em><span>Michael J. Schull receives funding from the Canadian Institutes of Health Research and the Government of Ontario.</span></em></p>Responsible data stewardship must take many factors into account including legal requirements, data governance, cybersecurity and user privacy.P. Alison Paprica, Professor (adjunct) and Senior Fellow, Institute for Health Policy, Management and Evaluation, Dalla Lana School of Public Health, University of TorontoAmy Hawn Nelson, Research Faculty, Actionable Intelligence for Social Policy (AISP), University of PennsylvaniaDonna Curtis Maillet, Privacy Officer, New Brunswick Institute for Research, Data and Training, Research associate, Faculty of Law, University of New BrunswickKimberlyn McGrail, Professor of Health Services and Policy Research, University of British ColumbiaMichael J. 
Schull, Professor, Department of Medicine, University of TorontoLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2183552023-11-24T03:40:25Z2023-11-24T03:40:25Z7 charts on family, domestic and sexual violence in Australia<p>With so much data released about family, domestic and sexual violence, it can be difficult to see how it all fits together. </p>
<p>The Australian Institute of Health and Welfare (AIHW) has attempted to do this with a <a href="https://aihw.gov.au/family-domestic-and-sexual-violence">new website</a> that tells the story of violence using numbers, looking at how often it happens, to whom and when. </p>
<p>Here are seven charts that show the prevalence of various forms of interpersonal violence, across life.</p>
<h2>1. Sexual violence risk varies (in ways you might not expect)</h2>
<p>One in five women and one in 16 men have experienced sexual violence since the age of 15.</p>
<p>The likelihood of experiencing sexual violence differs by age as well as gender.</p>
<p><iframe id="2wLeq" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/2wLeq/2/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<p>This chart uses data about recorded crimes. Of course, we know many sexual crimes in childhood and adulthood are never discovered or reported. For each age group, and for both females and males, the recorded crime rate for sexual victimisation has steadily risen from 2010 to 2022. But the rate for girls and boys is substantially higher than for women and men.</p>
<h2>2. What kinds of harm come to the attention of child protection services?</h2>
<p>In cases reported to a statutory child protection service, a “substantiation” is the conclusion, following an investigation, that there was reasonable cause to believe that a child had been, was being, or was likely to be, abused, neglected or otherwise harmed. For both boys and girls, more than half of these cases are about harm from emotional abuse. This refers to parental behaviour, repeated over time, that conveys to a child that they are worthless, unloved or unwanted.</p>
<p><iframe id="zuhJi" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/zuhJi/1/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<p>Witnessing family and domestic violence is not monitored separately as a type of harm in any <a href="https://aifs.gov.au/resources/policy-and-practice-papers/what-child-abuse-and-neglect">state or territory child protection statistics</a>. Therefore it is not one of the primary harm types recorded in the data shown in this graph. Yet in <a href="https://www.acms.au">our study</a>, my colleagues and I found it was the most frequently experienced form of maltreatment in childhood – 39.6% of adults were exposed to domestic violence as children. </p>
<h2>3. Lifetime exposure to violence</h2>
<p>One in three men experienced violence from a stranger, whereas women were much more likely to experience violence from someone they knew.</p>
<p><iframe id="j4lw5" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/j4lw5/1/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<p>One in six women (and one in 13 men) have experienced domestic violence in the form of economic abuse by a current or previous cohabiting partner since the age of 15. </p>
<h2>4. Time is of the essence</h2>
<p>Not only does the risk of experiencing violence change across life, but temporal factors also play a role. Towards the end of the year, when there are festivities and more opportunities for alcohol misuse, the risks are greater.</p>
<p><iframe id="ynq2C" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/ynq2C/1/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<h2>5. Men’s (and boys’) violence towards women and girls</h2>
<p>Perpetrators of violence are more likely to be known to the victim than to be strangers. Some forms of violence, particularly sexual violence, are more likely to be experienced by girls and women. Boys and men are more likely to use violence, again particularly sexual violence.</p>
<p>One in six women (and one in 18 men) have experienced physical or sexual violence by a current or previous cohabiting partner since the age of 15. </p>
<p><iframe id="DipvY" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/DipvY/2/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<p>Violence can also be emotional. One in four women (and one in seven men) have experienced emotional abuse by <a href="https://www.aihw.gov.au/family-domestic-and-sexual-violence/understanding-fdsv/who-uses-violence">a current or previous cohabiting</a> partner since the age of 15.</p>
<h2>6. Sexual harassment: who does it and who is subjected to it?</h2>
<p>Women are much more likely to be subjected to sexualised behaviours – by men – that are unwanted or make them feel uncomfortable. Overall, rates appear to have declined since 2005, when almost one in five women experienced harassment.</p>
<p><iframe id="8pPnI" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/8pPnI/1/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<h2>7. Sexual victimisation rates have changed over time</h2>
<p>Crime data on sexual victimisation (sexual assaults recorded by police) from 2010 to 2022 suggests things have not been improving. Although there is variability between states, the biggest difference can be seen between women and men (women are at substantially higher risk of sexual victimisation).</p>
<p><iframe id="N0l0g" class="tc-infographic-datawrapper" src="https://datawrapper.dwcdn.net/N0l0g/1/" height="400px" width="100%" style="border: none" frameborder="0"></iframe></p>
<h2>What’s missing?</h2>
<p>Often, people are exposed to multiple kinds of violence. In <a href="https://www.acms.au">our study</a>, we found almost 40% of the population had experienced more than one type of child abuse or neglect – including exposure to family or domestic violence as a child.</p>
<p>We also found this “multi-type maltreatment” was one of the <a href="https://www.acms.au/resources/the-prevalence-and-impact-of-child-maltreatment-in-australia-findings-from-the-australian-child-maltreatment-study-2023-brief-report/">strongest predictors</a> of experiencing mental illness and engaging in behaviours that put health at risk, like cannabis dependence in adulthood.</p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/major-study-reveals-two-thirds-of-people-who-suffer-childhood-maltreatment-suffer-more-than-one-kind-202033">Major study reveals two-thirds of people who suffer childhood maltreatment suffer more than one kind</a>
</strong>
</em>
</p>
<hr>
<p>However, many of the sources of data the AIHW uses only look at one form of violence. This makes it much harder to tell the story of how exposure to multiple forms of violence relates to the impacts that might be observed.</p>
<p>We also can’t see data on children’s exposure to physical punishment in the home, despite Australia’s failure to meet its responsibility under the <a href="https://docstore.ohchr.org/SelfServices/FilesHandler.ashx?enc=6QkG1d%2FPPRiCAqhKb7yhsk5X2w65LgiRF%2FS3dwPS4NWFNCtCrUn3lRntjFl1P2gZpa035aKkorCHAPJx8bIZmDed5owOGcbWFeosUSgDTFKNqA7hBC3KiwAm8SBo665E">UN Convention on the Rights of the Child</a> to protect them from this form of violence.</p>
<p>The data curated on this new website can be used to identify where more services might be required to address the needs of victims of different kinds of violence, at different stages across life. It can also help drive a genuine strategy for <a href="https://www.napcan.org.au/national-summit-to-prevent-child-maltreatment/">prevention</a>. The strategy should look at the risk factors for each type of interpersonal violence, and those that are common across different types of violence. <a href="https://doi.org/10.5694/mja2.51868">Such risks include</a> parental mental illness, substance misuse, poverty and divorce.</p>
<p>And then we must invest in <a href="https://rdcu.be/cEvhu">evidence-based strategies</a> to alleviate the risk of growing up with, and being exposed in adulthood to, family, domestic, and sexual violence.</p>
<p class="fine-print"><em><span>Daryl Higgins receives funding from the Australian Research Council, the National Health and Medical Research Council, and a range of government departments and non-government child/family welfare agencies.</span></em></p>Key findings on victims and perpetrators of interpersonal violence have been brought together in a new website that seeks to combine over 30 sources of data across Australia.Daryl Higgins, Professor & Director, Institute of Child Protection Studies, Australian Catholic UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2086272023-09-07T12:23:45Z2023-09-07T12:23:45ZIRS is using $60B funding boost to ramp up use of technology to collect taxes − not just hiring more enforcement agents<figure><img src="https://images.theconversation.com/files/546509/original/file-20230905-27-mt9e4u.jpg?ixlib=rb-1.1.0&rect=695%2C396%2C5177%2C3407&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">The IRS has relied on technology for decades, as this 1965 photo taken in its Philadelphia office shows.</span> <span class="attribution"><a class="source" href="https://www.gettyimages.com/detail/news-photo/view-of-employees-in-the-computer-room-of-a-regional-irs-news-photo/926365314?adppopup=true">US News & World Report Collection/Marion S Trikosko/PhotoQuest via Getty Images</a></span></figcaption></figure><p>The Internal Revenue Service is getting a funding boost thanks to the <a href="https://www.irs.gov/inflation-reduction-act-of-2022">Inflation Reduction Act</a>, which President Joe Biden signed into law in 2022.</p>
<p>That legislative package originally included about US$80 billion to expand the tax collection agency’s budget over the next 10 years. Congress and the White House have since agreed to <a href="https://www.politico.com/newsletters/weekly-tax/2023/07/31/irs-funding-battles-loom-large-in-september-00108882">pare this total by about $20 billion</a>, but $60 billion is still a big chunk of change for an agency that until recently had <a href="https://www.irs.gov/statistics/irs-budget-and-workforce">about $14 billion in annual funding</a>. </p>
<p>I’m a <a href="https://scholar.google.com/citations?user=9hhC4q8AAAAJ&hl=en&oi=ao">tax researcher</a> who studies how the <a href="https://doi.org/10.2308/AAHJ-2022-014">IRS uses technology</a> and how taxpayers respond to the agency’s growing reliance on it. While the number of IRS enforcement personnel will surely grow as a result of additional funding, I think that the agency can get more mileage out of emphasizing technological improvements.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/546514/original/file-20230905-19-5ho46d.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="Three men in suits and Janet Yellen stand around a computer and a sign on the wall reading 'digital intake center.'" src="https://images.theconversation.com/files/546514/original/file-20230905-19-5ho46d.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/546514/original/file-20230905-19-5ho46d.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/546514/original/file-20230905-19-5ho46d.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/546514/original/file-20230905-19-5ho46d.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/546514/original/file-20230905-19-5ho46d.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/546514/original/file-20230905-19-5ho46d.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/546514/original/file-20230905-19-5ho46d.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Director of Enterprise Digitalization of 22nd Century Technologies Harrison Smith, left, demonstrates the digital intake initiative, a scanning technology for IRS paperless processing, as Secretary of the Treasury Janet Yellen and IRS Commissioner Daniel Werfel, right, look on in August 2023.</span>
<span class="attribution"><a class="source" href="https://www.gettyimages.com/detail/news-photo/director-of-enterprise-digitalization-of-22nd-century-news-photo/1588274665?adppopup=true">Alex Wong/Getty Images</a></span>
</figcaption>
</figure>
<h2>Making enforcement more efficient</h2>
<p>The IRS plans to use most of these new funds to step up <a href="https://crsreports.congress.gov/product/pdf/IN/IN11977">enforcement and improve customer service</a> for taxpayers. </p>
<p>There’s been <a href="https://www.nytimes.com/2022/08/19/us/politics/more-money-for-irs-spurs-conspiracy-theories-of-shadow-army.html">plenty of conjecture</a> about what the added enforcement will look like and <a href="https://www.theguardian.com/commentisfree/2022/sep/08/republicans-irs-shadow-army-fearmongering">no shortage of fearmongering</a> about the tens of thousands of <a href="https://apnews.com/article/fact-check-irs-agents-inflation-reduction-act-871970314297">new agents the IRS might hire</a>.</p>
<p>Often left out of this discussion is the fact that the agency’s staffing <a href="https://www.cbpp.org/research/federal-tax/the-need-to-rebuild-the-depleted-irs">was cut by 22% between 2010 and 2021</a>. Much of the agency’s hiring spree will replace those lost workers rather than fill new posts. Further, the IRS expects <a href="https://www.reuters.com/world/us/republicans-call-it-an-army-irs-hires-will-replace-retirees-do-it-says-treasury-2022-08-19/">over 50,000 of its employees to retire within five years</a>.</p>
<p>The agency aims to hire <a href="https://www.reuters.com/world/us/us-irs-hire-30000-staff-over-two-years-it-deploys-80-bln-new-funding-2023-04-06/">20,000 people over the next two years</a>, of whom one-third will work in enforcement.</p>
<p>But IRS Commissioner Daniel Werfel has indicated that better enforcement won’t just rely on more tax agents and auditors. He <a href="https://www.irs.gov/pub/irs-pdf/p3744.pdf">released a plan in early 2023 promising</a> that “technology and data advances will allow us to focus enforcement on taxpayers trying to avoid taxes, rather than taxpayers trying to pay what they owe.”</p>
<p>And <a href="https://www.reuters.com/world/us/us-irs-hire-30000-staff-over-two-years-it-deploys-80-bln-new-funding-2023-04-06/">U.S. Deputy Treasury Secretary Wally Adeyemo</a> has said that “the IRS is going to hire more data scientists than they ever have for enforcement purposes,” with the goal of using data analytics in audits.</p>
<p>At least initially, the agency was aiming to <a href="https://taxfoundation.org/irs-funding-plan-inflation-reduction-act/">increase its spending on enforcement by 69%</a>, from about <a href="https://crsreports.congress.gov/product/pdf/IN/IN11977">$6.6 billion in 2022 to $11 billion in annual spending projected through 2031</a>.</p>
<p>Technology, including the electronic filing of tax returns and a growing portfolio of online tools, transfers work from agents to computers. Online tools include <a href="https://www.irs.gov/newsroom/irs-begins-new-digital-intake-initiative-form-940-scanning-process-off-to-strong-start-other-forms-to-start-soon">the IRS’ digital scanning program</a>, which expedites the processing of the roughly 1 in 5 federal tax <a href="https://www.irs.gov/statistics/returns-filed-taxes-collected-and-refunds-issued">returns that weren’t filed electronically in 2022</a>. </p>
<p><a href="https://www.irs.gov/pub/irs-pdf/p3744.pdf">Werfel says</a> the IRS workforce is becoming more efficient by ramping up its reliance on technology to provide <a href="https://www.irs.gov/newsroom/irs-modernization-plan-provides-plan-to-improve-services-for-taxpayers-tax-community">services for taxpayers</a> and <a href="https://www.bloomberg.com/opinion/articles/2023-03-09/ai-can-help-the-irs-catch-wealthy-tax-cheats#xj4y7vzkg">spot tax cheats</a>.</p>
<p>The IRS has tapped one form of data analytics or another to select people and companies to audit <a href="https://www.gao.gov/products/100316">since the late 1960s</a>. As early as 1986, it had researched ways to <a href="https://www.thefreelibrary.com/IRS+artificial+intelligence+projects+(close+encounters+of+an+AI+kind).-a012740196">use artificial intelligence</a> to improve how it selects its auditing targets.</p>
<p>At the same time, outdated technology is hampering the Internal Revenue Service’s effectiveness. It <a href="https://www.gao.gov/blog/irss-efforts-modernize-60-year-old-tax-processing-system-almost-decade-away">relies on a 60-year-old computer system</a> to maintain and process data. That undercuts its technological agility and <a href="https://www.cnn.com/2023/01/11/politics/republican-irs-funding-87000-agents/index.html">customer service</a>.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/546511/original/file-20230905-9214-ndfr8i.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="A 2014 1040 U.S. tax form displayed on a laptop" src="https://images.theconversation.com/files/546511/original/file-20230905-9214-ndfr8i.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/546511/original/file-20230905-9214-ndfr8i.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/546511/original/file-20230905-9214-ndfr8i.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/546511/original/file-20230905-9214-ndfr8i.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/546511/original/file-20230905-9214-ndfr8i.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/546511/original/file-20230905-9214-ndfr8i.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/546511/original/file-20230905-9214-ndfr8i.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Nearly all taxpayers have been filing electronically for years.</span>
<span class="attribution"><a class="source" href="https://www.gettyimages.com/detail/news-photo/view-of-an-irs-1040-tax-form-on-a-laptop-computer-screen-news-photo/550448599?adppopup=true">Robert Barnes/Getty Images</a></span>
</figcaption>
</figure>
<h2>3 sources of data</h2>
<p>When the IRS collects better data, its ability to use <a href="https://www.tx.cpa/docs/default-source/communications/2019-today%27s-cpa/january-february/taxtopics-irs-bigdata-jan-feb2019-today%27scpa.pdf?sfvrsn=a165f2b1_4">data analytics to make predictions about noncompliance</a> improves.</p>
<p>Beyond data reported on tax forms themselves, like 1099s, the IRS has three main sources of data it assesses to learn more about taxpayers. </p>
<p><strong>1. Past tax returns</strong></p>
<p>The IRS’s National Research Program collects data to support what it calls “<a href="https://www.irs.gov/irm/part4/irm_04-022-001">strategic decisions</a>” to better enforce compliance. </p>
<p>The program first relies on its vast stores of taxpayer data, <a href="https://www.irs.gov/businesses/small-businesses-self-employed/irs-audits#:%7E:text=Selection%20for%20an%20audit%20does,%22norms%22%20for%20similar%20returns">including prior audit results</a>, to develop an expectation of what a given tax return may include, like a tuition tax credit for a taxpayer with a history of claiming the child tax credit. Filed returns are compared against those standards to identify potential outliers. Outliers aren’t necessarily dodging taxes or misrepresenting their tax liabilities, but big departures from the norms can indicate a higher likelihood of mistakes or evasion. </p>
<p><strong>2. Publicly available data</strong></p>
<p>The IRS relies on publicly available data associated with each tax return when <a href="https://www.irs.gov/irm/part4/irm_04-022-001">it’s building a case</a> for an audit. </p>
<p>The data, which is available to anyone who wants to find it, has <a href="https://www.gao.gov/products/gao-22-106096">increased tremendously</a> with the rise of social media and the growing role of the internet for commerce and advertising. A social media presence can alert the IRS to a business with potential income in a way that the agency could not have identified before the internet emerged.</p>
<p>This includes methods that might surprise you.</p>
<p>As far back as 2010, for example, IRS training materials instructed agents to use a band’s social networking sites to compare musicians’ reported income with their likely <a href="https://www.computerworld.com/article/2756433/irs--doj-use-social-media-sites-to-track-deadbeats--criminal-activity.html">income from their past performances</a>. Agents were told to estimate musicians’ gig income based on the number of shows a band advertised through its social media posts.</p>
<p>People make all sorts of financial information public today, including their side hustles and Venmo ledgers. The IRS can access and use this data like anyone else. </p>
<p><strong>3. Third-party data</strong></p>
<p>The IRS can also buy data.</p>
<p>For example, a 2020 government contract with the company Chainalysis is described, perhaps clumsily, as a contract for “<a href="https://www.usaspending.gov/award/CONT_AWD_2032H820C00041_2050_-NONE-_-NONE-">pilot IRS cryptocurrency tracing</a>.” This type of contract gives the IRS information related to otherwise untraceable income sources so that agents can detect underreporting.</p>
<p>What has changed in recent years is the volume of data the IRS can access, <a href="https://scholarship.law.vanderbilt.edu/cgi/viewcontent.cgi?article=1131&context=jetlaw">which has skyrocketed</a>.</p>
<p>Sometimes, widespread underreporting results in legislation that requires third parties to report income information to the IRS, rather than requiring the agency to seek it out. </p>
<p>Recent legislation requires third-party payment platforms like Venmo, PayPal and Uber to issue a 1099 tax form to <a href="https://theconversation.com/you-cant-hide-side-hustles-from-the-irs-anymore-heres-what-taxpayers-need-to-know-about-reporting-online-payments-for-gig-work-199952">anyone making over $600 on the app in one year</a>. These 1099s are issued to taxpayers – and the IRS.</p>
<p>Similar legislation was recently proposed for <a href="https://www.cnbc.com/2023/08/25/biden-administration-unveils-new-crypto-tax-reporting-rules.html">cryptocurrency transactions</a>. </p>
<h2>What might change</h2>
<p>What does this increase in IRS spending on technology mean for taxpayers? </p>
<p>When the <a href="https://www.irs.gov/pub/irs-pdf/p3744.pdf">IRS detailed how it wanted to use the new funds</a> in April 2023, it emphasized improving taxpayers’ experiences and increasing compliance. By using <a href="https://www.irs.gov/about-irs/using-voice-and-chat-bots-to-improve-the-collection-taxpayer-experience">chatbots to respond to taxpayer questions</a>, providing online portals for real-time processing, and letting taxpayers <a href="https://home.treasury.gov/system/files/136/IRStechnologySOPOnePager.pdf">respond to notices online</a>, the IRS could substantially decrease the time taxpayers spend corresponding with the agency or waiting on hold while attempting to speak to a staffer.</p>
<p>Technology-boosted enforcement could help the agency <a href="https://www.irs.gov/pub/irs-pdf/p3744.pdf">collect more revenue to fund government programs</a>. </p>
<p>And the agency also hopes to use data to make paying taxes less onerous for the majority of Americans who follow the rules.</p>
<p>For example, when a taxpayer has a child or experiences another life event that will change their tax status, the IRS wants to be able to proactively notify them about the consequences – whether it’s <a href="https://www.irs.gov/pub/irs-pdf/p3744.pdf">paying more, owing less or getting a new tax credit</a>. </p>
<p>Most people want to pay what they owe, no more and no less. I believe the IRS intends to make good use of its new funding to help people do just that.</p>
<p class="fine-print"><em><span>Erica Neuman does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>The agency hopes to make paying taxes less onerous for the majority of Americans who follow the rules.Erica Neuman, Assistant Professor of Accounting, University of DaytonLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2110802023-08-21T12:24:30Z2023-08-21T12:24:30ZAI and new standards promise to make scientific data more useful by making it reusable and accessible<figure><img src="https://images.theconversation.com/files/542679/original/file-20230814-20-nwyn4.jpg?ixlib=rb-1.1.0&rect=44%2C35%2C5946%2C3961&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Data replication is an integral part of the scientific process, which proper research data management can improve. </span> <span class="attribution"><a class="source" href="https://www.gettyimages.com/detail/photo/university-professor-gesturing-towards-whiteboard-royalty-free-image/1447684664?phrase=data&adppopup=true">Tom Werner/DigitalVision via Getty Images</a></span></figcaption></figure><p>Every time a scientist runs an experiment, or a social scientist does a survey, or a humanities scholar analyzes a text, they generate data. Science runs on data – without it, we wouldn’t have the James Webb Space Telescope’s <a href="https://webbtelescope.org/news/first-images/gallery">stunning images</a>, disease-preventing <a href="https://www.cdc.gov/coronavirus/2019-ncov/vaccines/index.html">vaccines</a> or an evolutionary tree that <a href="https://www.gbif.org/">traces the lineages</a> of all life.</p>
<p>This scholarship generates an unimaginable amount of data – so how do researchers keep track of it? And how do they make sure that it’s accessible for use by both humans and machines?</p>
<p>To improve and advance science, scientists need to <a href="https://www.nature.com/articles/d41586-021-02486-7">be able to reproduce</a> others’ data or combine data from multiple sources to learn something new.</p>
<figure>
<iframe width="440" height="260" src="https://www.youtube.com/embed/FpCrY7x5nEE?wmode=transparent&start=0" frameborder="0" allowfullscreen=""></iframe>
<figcaption><span class="caption">Accessible and usable data can help scientists reproduce prior results. Doing so is an important part of the scientific process, as this TED-Ed video explains.</span></figcaption>
</figure>
<p>Any kind of sharing requires management. If your neighbor needs to borrow a tool or an ingredient, you have to know whether you have it and where you keep it. Research data might be on a graduate student’s laptop, buried in a professor’s USB collection or saved more permanently within an online data repository.</p>
<p>I’m an <a href="https://bradleywadebishop.github.io/website/">information scientist</a> who studies other scientists. More precisely, I study how scientists think about research data and the ways that they interact with their own data and data from others. I also teach students how to manage their own or others’ data in ways that advance knowledge.</p>
<h2>Research data management</h2>
<p><a href="https://www.oclc.org/research/areas/research-collections/rdm.html">Research data management</a> is an area of scholarship that focuses on data discovery and reuse. As a field, it encompasses research data services, resources and cyberinfrastructure. For example, one type of infrastructure, the <a href="https://www.re3data.org/">data repository</a>, gives researchers a place to deposit their data for long-term storage so that others can find it. In short, research data management encompasses the data’s life cycle from cradle to grave to reincarnation in the next study. </p>
<p>Proper research data management also allows scientists to reuse existing data rather than collecting it again, which saves time and resources. </p>
<p>With <a href="https://doi.org/10.1016/bs.pmbts.2021.10.002">the increasing politicization of science</a>, many national and international science organizations have upped their <a href="https://nap.nationalacademies.org/catalog/25303/reproducibility-and-replicability-in-science">standards for accountability and transparency</a>. <a href="https://www.whitehouse.gov/ostp/news-updates/2023/01/11/fact-sheet-biden-harris-administration-announces-new-actions-to-advance-open-and-equitable-research/">Federal agencies</a> and other major research funders like the <a href="https://sharing.nih.gov/data-management-and-sharing-policy">National Institutes of Health</a> now prioritize research data management and require researchers to have a data management plan before they can receive any funds.</p>
<p>Scientists and data managers can work together to redesign the systems scientists use to make data discovery and preservation easier. In particular, <a href="https://doi.org/10.1016/j.ejmp.2021.01.083">integrating AI</a> can make this data more accessible and reusable.</p>
<h2>Artificially intelligent data management</h2>
<p>Many of these new standards for research data management also stem from an increased use of AI, including machine learning, across <a href="https://doi.org/10.3138/jelis-2021-0023">data-driven fields</a>. AI makes it highly desirable for any data to be machine-actionable – that is, usable by machines without human intervention. Now, scholars can consider machines not only as tools but also as potential autonomous data reusers and collaborators.</p>
<p>The key to machine-actionable data is metadata. <a href="https://www.dublincore.org/specifications/dublin-core/dces/">Metadata</a> are the descriptions scientists set for their data and may include elements such as creator, date, coverage and subject. Minimal metadata is minimally useful, but correct and complete standardized metadata makes data more useful for both people and machines.</p>
<p>It takes a cadre of research data managers and librarians to make machine-actionable data a reality. These <a href="https://doi.org/10.3138/jelis-2021-0023">information professionals</a> work to facilitate communication between scientists and systems by ensuring the quality, completeness and consistency of shared data.</p>
<p>The <a href="https://www.go-fair.org/fair-principles/">FAIR data principles</a>, created by a group of researchers called <a href="https://force11.org/">FORCE11</a> in 2016 and used across the world, provide guidance on how to enable data reuse by machines and humans. FAIR data is findable, accessible, interoperable and reusable – meaning it has robust and complete metadata. </p>
<p>In the past, I’ve studied <a href="https://doi.org/10.1002/pra2.4">how scientists discover and reuse data</a>. I found that scientists tend to use mental shortcuts when they’re looking for data – for example, they may go back to familiar and trusted sources or search for certain key terms they’ve used before. Ideally, my team could build this expert decision-making process into AI while removing as many biases as possible. Automating these mental shortcuts should reduce the time-consuming chore of locating the right data.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/542885/original/file-20230815-19-pa41n9.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="A stack of papers and folders" src="https://images.theconversation.com/files/542885/original/file-20230815-19-pa41n9.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/542885/original/file-20230815-19-pa41n9.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=450&fit=crop&dpr=1 600w, https://images.theconversation.com/files/542885/original/file-20230815-19-pa41n9.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=450&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/542885/original/file-20230815-19-pa41n9.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=450&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/542885/original/file-20230815-19-pa41n9.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=566&fit=crop&dpr=1 754w, https://images.theconversation.com/files/542885/original/file-20230815-19-pa41n9.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=566&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/542885/original/file-20230815-19-pa41n9.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=566&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">FAIR data should be machine-actionable, meaning digital and complete with comprehensive metadata. Many librarians have worked to digitize historical data, which may be hard copy, and make it FAIR.</span>
<span class="attribution"><a class="source" href="https://www.gettyimages.com/detail/photo/many-books-stacked-on-top-of-cabinets-group-of-royalty-free-image/1402090075?phrase=old+file+cabinet&adppopup=true">Penpak Ngamsathain/Moment via Getty Images</a></span>
</figcaption>
</figure>
<h2>Data management plans</h2>
<p>But there’s still one piece of research data management that AI can’t take over. <a href="http://doi.org/10.5334/dsj-2023-002">Data management plans</a> describe the what, where, when, why and who of managing research data. Scientists fill them out, and they outline the roles and activities for managing research data during and long after research ends. They answer questions like, “Who is responsible for long-term preservation,” “Where will the data live,” “How do I keep my data secure,” and “Who pays for all of that?” </p>
<p>Nearly all funding agencies, across countries, <a href="https://theconversation.com/new-data-sharing-requirements-from-the-national-institutes-of-health-are-a-big-step-toward-more-open-science-and-potentially-higher-quality-research-178869">now require data management plans</a> in grant proposals. These plans signal to scientists that their data is valuable and important enough to the community to share. Also, the plans help funding agencies keep tabs on the research and <a href="https://doi.org/10.29173/istl2602">investigate any potential misconduct</a>. But most importantly, they help scientists make sure their data stays accessible for many years.</p>
<p>Making all research data as FAIR and open as possible will improve the scientific process. And having access to more data opens up the possibility for more informed discussions on <a href="https://www.fgdc.gov/gda">how to promote</a> economic development, improve the stewardship of natural resources, enhance public health, and responsibly and ethically develop technologies that will improve lives. All intelligence, artificial or otherwise, will benefit from better organization, access and use of research data.</p>
<p class="fine-print"><em><span>Bradley Wade Bishop receives funding from the Institute of Museum and Library Services. </span></em></p>The phrase ‘research data management’ might make your eyes glaze over, but it’s actually this behind-the-scenes work that allows for large-scale scientific discoveries and collaborations.Bradley Wade Bishop, Professor of Information Sciences, University of TennesseeLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2115772023-08-15T19:31:37Z2023-08-15T19:31:37ZZoom’s scrapped proposal to mine user data causes concern about our virtual and private Indigenous Knowledge<figure><img src="https://images.theconversation.com/files/542670/original/file-20230814-17-i0xdd7.jpg?ixlib=rb-1.1.0&rect=131%2C108%2C5044%2C3771&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Platforms like Zoom have been helpful in bridging geographical distances. However, a recent proposal to mine data raises questions about ownership of Indigenous Knowledge. </span> <span class="attribution"><span class="source">(Chris Montgomery/Unsplash)</span></span></figcaption></figure><iframe style="width: 100%; height: 100px; border: none; position: relative; z-index: 1;" allowtransparency="" allow="clipboard-read; clipboard-write" src="https://narrations.ad-auris.com/widget/the-conversation-canada/zooms-scrapped-proposal-to-mine-user-data-causes-concern-about-our-virtual-and-private-indigenous-knowledge" width="100%" height="400"></iframe>
<p>As reported on Aug. 6, <a href="https://stackdiary.com/zoom-terms-now-allow-training-ai-on-user-content-with-no-opt-out/">Zoom recently attempted to rewrite its Terms of Service with ambiguous language that would permit the extraction of user data for the purpose of training AI</a>. </p>
<p>However, after <a href="https://variety.com/2023/digital/news/zoom-response-customer-content-ai-training-1235689725/">public pushback</a>, <a href="https://blog.zoom.us/zooms-term-service-ai/">Zoom began to rectify that clause the very next day</a>, fully committing to a “no AI training” set of policies by Aug. 11. </p>
<p>Even though Zoom backpedalled this time, its drive to gather data highlights the possibility of future hidden data extraction by Zoom and other big tech companies.</p>
<p>More specifically, as a researcher working with and looking at Indigenous communities and their data, I am concerned about the privacy of these valuable data sets from Indigenous communities on Turtle Island. </p>
<h2>Vulnerable Indigenous Knowledge</h2>
<p>Over the past three years, Zoom calls have become a tool for organization and activism for many Indigenous communities. </p>
<p>For my own work, I use video and voice chat, which lets us bridge geographical distances to collaborate and share, and to reach communities that are otherwise hard to access. Discussions with queer community members of different Indigenous Nations are often private and perhaps even sacred. </p>
<p>These conversations have elements that are public facing, but they also contain wisdom from Elders or Knowledge Keepers specifically trained to know what they can and cannot share in specific spaces. Some of this knowledge is sacred and is part of promoting and preserving Indigenous (and sometimes queer) ways of being. </p>
<h2>A valuable commodity</h2>
<p>This private information is constantly at risk of extraction by companies seeking to monetize or otherwise gain from our data.</p>
<p>Indigenous Knowledge represents a large gap in current big data. AI depends on large data sets, which are what allow predictive technology to operate. </p>
<p>Because these knowledges are primarily oral, it is difficult to assemble the kind of data sets that usually come from written sources. If big companies can gather audio and visual data, this oral information could become legible to machines. </p>
<h2>Protecting communities</h2>
<p><a href="https://static1.squarespace.com/static/557744ffe4b013bae3b7af63/t/557f2ee5e4b0220eff4ae4b5/1434398437409/Tuck+and+Yang+R+Words_Refusing+Research.pdf">“Refusing research”</a> has been an important concept for protecting marginalized communities from the extractive practices of researchers aiming to obtain data.</p>
<p>However, if platforms are extracting data without our knowledge, or demanding our consent in order to use a service, a conflict emerges.</p>
<p>The conflict becomes one of <a href="https://techcrunch.com/2023/08/08/zoom-data-mining-for-ai-terms-gdpr-eprivacy/">free choice versus free-to-leave</a>: If we do not consent to the platform’s terms, we simply do not get access to the service. Access to voice and video sharing infrastructure has been a fundamental component of activism and community research, especially post COVID-19.</p>
<h2>Can we ‘opt out’?</h2>
<p>Can we accept or refuse to be turned into research data?</p>
<p>Even though there is a permissions element, organizations are often gathering our data in exchange for using their services. For example, Fitbit <a href="https://www.fitbit.com/global/en-ca/legal/privacy-policy">gathers massive amounts of health data from users (with permission) that can be used to train AI</a>. </p>
<p>Anyone who opts into nearly any big service is being tracked in some capacity. And so we need to think critically about what is considered private. </p>
<p>Likewise, Zoom has the ability to gather this data, whether or not it uses that data for AI with consent. The anxiety is that next time, the ambiguity will go unnoticed, or that consent will be forced as the price of accessing a seemingly necessary service.</p>
<p>As someone who looks at ethical data collection and mobilization, I believe we all need to be critical of those requests to have access to our private data when using these services.</p>
<figure class="align-center ">
<img alt="A woman looks at a computer screen monitor" src="https://images.theconversation.com/files/542709/original/file-20230815-27813-p3choq.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/542709/original/file-20230815-27813-p3choq.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/542709/original/file-20230815-27813-p3choq.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/542709/original/file-20230815-27813-p3choq.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/542709/original/file-20230815-27813-p3choq.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/542709/original/file-20230815-27813-p3choq.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/542709/original/file-20230815-27813-p3choq.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">In the future, will we find ourselves agreeing to give up our data just to use video platform software like Zoom?</span>
<span class="attribution"><span class="source">(Unsplash)</span></span>
</figcaption>
</figure>
<h2>Crucial access to data</h2>
<p>The relationship between Indigenous communities, their data and the Canadian government has always been fraught. However, after the work of the <a href="https://www.rcaanc-cirnac.gc.ca/eng/1450124405592/1529106060525">Truth and Reconciliation Commission</a> in Canada (which concluded in 2015), it became even clearer that access to data and information is crucial to achieving justice and truth in relation to our histories.</p>
<p>For Indigenous peoples whose history has been systematically erased, demanding that organizations return records and data has become an important element of achieving the truth behind the experiences of Indian Residential School survivors. Communities have both the desire and need to have their data returned so that they can maintain <a href="https://fnigc.ca/ocap-training">ownership, control, access and manage permissions</a> to access information.</p>
<h2>Ease of Zoom for communication</h2>
<p>In-person collaboration between Indigenous communities can be difficult because of things like geographical distance, the lack of public transportation, and interruptions in Indigenous sovereignty. These issues perpetuate the social and political fragmentation that settler colonialism created to isolate these communities from one another.</p>
<p>Many of these challenges have been alleviated by information technologies like Zoom. And a platform like Zoom has been potentially unifying by bridging space. However, it could also become a tool to recreate the problem of data extraction in a new way. </p>
<p>We need to be attentive to the ways these platforms could be used to extract data from their users. </p>
<p>These technological infrastructures may disproportionately harm Indigenous communities by making their private and sacred knowledges legible to AI. Data collection for AI could lead to the commodification of this sacred knowledge for profit. </p>
<p>Protecting this kind of data is not just the responsibility of Indigenous communities but a shared commitment that has a present and future urgency.</p>
<p class="fine-print"><em><span>Andrew Wiebe does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>In-person collaboration between Indigenous communities has been aided by information technologies like Zoom. However, recent attempts to mine personal data raise concerns about data ownership.Andrew Wiebe, PhD Student, Information, University of TorontoLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2091802023-08-02T17:13:54Z2023-08-02T17:13:54ZHow swarming animals can help humans and AI make better decisions<figure><img src="https://images.theconversation.com/files/535929/original/file-20230705-23-hte9mu.jpeg?ixlib=rb-1.1.0&rect=0%2C0%2C6381%2C3444&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Starling murmurations form as daylight fades over their roosting sites. </span> <span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/beautiful-large-flock-starlings-birds-fly-1930366580">Shutterstock / Albert Beukhof</a></span></figcaption></figure><p>The word swarm often carries negative connotations – think biblical plagues of locusts or high streets full of last-minute shoppers during the Christmas rush. However, swarming is essential for the survival of many animal collectives. And now research into swarming has the potential to change things for humans too.</p>
<p>Bees swarm to make their <a href="https://academic.oup.com/aesa/article/97/1/111/11469">search for new colonies</a> more effective. Flocks of starlings use <a href="https://link.springer.com/article/10.1007/s00265-018-2609-0">dazzling murmurations to evade and confuse predators</a>. These are just two examples from nature but swarming can be seen in almost every corner of the animal kingdom. </p>
<p>Research from mathematicians, biologists and social scientists is helping us understand swarming and harness its power. It’s already being used for <a href="https://ieeexplore.ieee.org/abstract/document/4424900">crowd control</a> and <a href="https://ieeexplore.ieee.org/abstract/document/5366981">traffic management</a>, and to understand the <a href="https://ts2.space/en/swarm-intelligence-for-public-health-and-epidemiology/">spread of infectious diseases</a>. More recently, it has started to shape how we use data in healthcare and operate drones in military conflicts, and it has been used to beat near-insurmountable betting odds in sporting events.</p>
<p>A swarm is a system that is greater than the sum of its parts. Just as many neurons form a brain capable of thought, memory and emotion, groups of animals can act in unison to form a “super brain”, displaying highly complex behaviour not seen in individual animals. </p>
<figure>
<iframe width="440" height="260" src="https://www.youtube.com/embed/V4f_1_r80RY?wmode=transparent&start=0" frameborder="0" allowfullscreen=""></iframe>
</figure>
<p>Artificial life expert Craig Reynolds revolutionised the study of swarming with the <a href="https://dl.acm.org/doi/10.1145/37401.37406">Boids model</a> computer simulation, which he developed in 1986 and published in 1987. The Boids model breaks down swarming into a simple set of rules. </p>
<p>The Boids (bird-oids) in the simulation, like avatars or characters in a video game, are instructed to move in the same direction as their neighbours, move towards the average position of their neighbours, and avoid collisions with other boids. </p>
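<p>The three rules can be sketched in a few lines of Python. This is a minimal illustration rather than Reynolds’ original implementation: the function name <code>step</code>, the neighbourhood radius, the rule weights and the speed cap are all assumptions chosen for readability.</p>

```python
import math
import random


def step(boids, radius=5.0, w_align=0.05, w_cohere=0.01, w_separate=0.1,
         min_dist=1.0, max_speed=2.0):
    """Advance every boid one time step.

    Each boid is a tuple (x, y, vx, vy). All parameter values are
    illustrative assumptions, not values taken from the Boids paper.
    """
    updated = []
    for i, (x, y, vx, vy) in enumerate(boids):
        # A boid only reacts to other boids within its neighbourhood radius.
        neighbours = [b for j, b in enumerate(boids)
                      if j != i and math.hypot(b[0] - x, b[1] - y) < radius]
        ax = ay = 0.0
        if neighbours:
            n = len(neighbours)
            # Rule 1 (alignment): steer towards the neighbours' average velocity.
            ax += w_align * (sum(b[2] for b in neighbours) / n - vx)
            ay += w_align * (sum(b[3] for b in neighbours) / n - vy)
            # Rule 2 (cohesion): steer towards the neighbours' average position.
            ax += w_cohere * (sum(b[0] for b in neighbours) / n - x)
            ay += w_cohere * (sum(b[1] for b in neighbours) / n - y)
            # Rule 3 (separation): steer away from neighbours that are too close.
            for bx, by, _, _ in neighbours:
                if math.hypot(bx - x, by - y) < min_dist:
                    ax += w_separate * (x - bx)
                    ay += w_separate * (y - by)
        vx, vy = vx + ax, vy + ay
        # Cap the speed so the flock does not accelerate without limit.
        speed = math.hypot(vx, vy)
        if speed > max_speed:
            vx, vy = vx / speed * max_speed, vy / speed * max_speed
        updated.append((x + vx, y + vy, vx, vy))
    return updated


if __name__ == "__main__":
    # A hypothetical flock of 30 boids with random positions and velocities.
    flock = [(random.uniform(0, 20), random.uniform(0, 20),
              random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(30)]
    for _ in range(100):
        flock = step(flock)
```

<p>No boid is given a global plan or a leader to follow. Each one reads only its nearby neighbours, which is what makes any flock-like motion in such a simulation a product of local interactions.</p>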
<p>Boids simulations are strikingly accurate when compared with real swarms. </p>
<figure>
<iframe width="440" height="260" src="https://www.youtube.com/embed/_5tJ8jwd64Y?wmode=transparent&start=0" frameborder="0" allowfullscreen=""></iframe>
</figure>
<p>The Boids model suggests that swarming does not need leaders to coordinate behaviour – like pedestrians in a town centre rather than a guided museum tour. The complex behaviour we see in swarms arises from interactions between individuals following the same simple rules in parallel. In the language of physics, this phenomenon is known as <a href="https://www.sciencedirect.com/science/article/pii/S1476945X07000049?casa_token=6Lr13Hi0yzUAAAAA:eN6wloN9IBvWw5zl_iqVp1lFgyiKGa1P17Uk9QYkVLj6f0-DsFBQ1iFB0MT_YYKSNSi7S2mr">emergence</a>. </p>
<h2>The hive mind</h2>
<p>In 2016, US technology company <a href="https://unanimous.ai/">Unanimous AI</a> used the power of swarm intelligence to <a href="https://unanimous.ai/unu-superfecta-11k/">win the Kentucky Derby “superfecta” bet</a>, successfully predicting the first, second, third and fourth-placed horses in the famous US horse race. </p>
<p><a href="https://www.sbnation.com/2016/5/5/11594904/2016-kentucky-derby-picks-predictions-nyquist-mor-spirit">Industry experts</a> and <a href="https://hothardware.com/news/bing-predicts-kentucky-derby-winner-social-algorithms">conventional machine learning algorithms</a> made swathes of incorrect predictions. However, amateur racing enthusiasts recruited by Unanimous AI pooled their knowledge to beat the <a href="https://bleacherreport.com/articles/2638613-kentucky-derby-results-2016-winner-payouts-highlights-and-order-of-finish">541/1 odds</a>. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/537381/original/file-20230713-14892-fn0yq9.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/537381/original/file-20230713-14892-fn0yq9.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/537381/original/file-20230713-14892-fn0yq9.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=395&fit=crop&dpr=1 600w, https://images.theconversation.com/files/537381/original/file-20230713-14892-fn0yq9.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=395&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/537381/original/file-20230713-14892-fn0yq9.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=395&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/537381/original/file-20230713-14892-fn0yq9.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=496&fit=crop&dpr=1 754w, https://images.theconversation.com/files/537381/original/file-20230713-14892-fn0yq9.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=496&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/537381/original/file-20230713-14892-fn0yq9.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=496&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Hopeful punters bet millions of dollars on the Kentucky Derby each year.</span>
<span class="attribution"><span class="source">Shutterstock / Cheryl Ann Quigley</span></span>
</figcaption>
</figure>
<p>The volunteers’ success lay in the way in which their predictions were generated. Instead of voting on horses and aggregating their choices, the volunteers used <a href="https://unanimous.ai/swarm/">Unanimous AI’s swarm intelligence platform</a> to participate in a real-time digital tug of war, inspired by swarms of birds and bees.</p>
<p>All volunteers simultaneously pulled a dial towards their respective choices. This allowed people to change their preferences in response to the actions of others (for example, a person may have switched to pulling towards their second choice, B, rather than their first choice, C, if they saw A and B were the clear favourites). </p>
<p>Responding to one another in real time allowed Unanimous AI’s volunteers to collectively outperform <a href="https://www.sbnation.com/2016/5/5/11594904/2016-kentucky-derby-picks-predictions-nyquist-mor-spirit">highly-informed individuals</a>. </p>
<p>What’s more, had the most frequent individual picks of the volunteers determined the ordering, only the <a href="https://www.npr.org/sections/thetwo-way/2016/05/07/477171967/nyquist-wins-the-2016-kentucky-derby#:%7E:text=Carr%2FGetty%20Images-,Nyquist%2C%20ridden%20by%20Mario%20Gutierrez%2C%20crosses%20the%20finish%20line%20during,Churchill%20Downs%20on%20May%207.&text=Nearly%20one%20year%20since%20American,his%20own%20at%20Churchill%20Downs.">2016 winner</a> and <a href="https://www.sbnation.com/2016/5/7/11616138/2016-kentucky-derby-odds-post-nyquist-my-man-sam-exaggerator-bet-how">bookies’ favourite</a>, <a href="https://www.racingpost.com/profile/horse/896792/nyquist">Nyquist</a>, would have been placed correctly. </p>
<h2>Health concerns</h2>
<p>Similar swarming technologies are also of increasing interest in the <a href="https://www.nature.com/articles/s41586-021-03583-3">healthcare</a> sector, where <a href="https://www.frontiersin.org/articles/10.3389/fsoc.2022.1038854/full">talk of an AI revolution</a> is prompting <a href="https://digitalcommons.law.scu.edu/chtlj/vol36/iss4/2/">increasing concerns around patient privacy</a>. </p>
<p>As the reliance on <a href="https://link.springer.com/content/pdf/10.1007/s11518-019-5437-5.pdf">data-driven techniques in healthcare</a> increases, so too does the demand for extensive patient datasets. One way to meet these demands is to <a href="https://jamanetwork.com/journals/jama/fullarticle/2768851">pool information between institutions and, in some cases, countries</a>. </p>
<p>However, the transfer of patient data is often subject to <a href="https://www.jmir.org/2017/2/e47/">stringent data protection regulations</a>. A solution to this problem is to use only in-house data, though this often comes at the expense of diagnostic accuracy. </p>
<p>An alternative lies in swarming. Researchers believe swarm intelligence can <a href="https://healthcare-in-europe.com/en/news/ai-with-swarm-intelligence-to-analyse-medical-data.html">preserve diagnostic accuracy</a> without the need for raw data exchange between institutions. </p>
<p><a href="https://www.nature.com/articles/s41586-021-03583-3">Preliminary studies</a> have shown that decentralising data storage into a network of interacting nodes can give institutions the benefit of shared wisdom. This means there isn’t a central hub coordinating the flow of information, and institutions can’t access each other’s private patient data. </p>
<p>In centralised machine learning, data is uploaded to a shared hub, where learning takes place using all available data. In decentralised systems, each institution stores its data separately in its own node. The machine learning happens locally at each node (using only in-house data), but the results of that learning are shared across the network, to the benefit of all nodes. This process ensures that raw patient data is not exchanged between institutions, preserving patient privacy. </p>
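<p>A toy sketch can make the idea of sharing learned results rather than raw data concrete. It is not the method used in the cited study: it assumes a one-parameter model, plain parameter averaging between nodes, and three made-up datasets. The names <code>local_update</code> and <code>swarm_round</code> are hypothetical, and the averaging is written as a single function only for brevity, whereas nodes in a real decentralised network would exchange parameters among themselves.</p>

```python
def local_update(weight, data, lr=0.01, epochs=20):
    """Gradient descent on one node's private (x, y) pairs for the model y = weight * x."""
    for _ in range(epochs):
        # Mean-squared-error gradient, computed from in-house data only.
        grad = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= lr * grad
    return weight


def swarm_round(weights, node_data):
    """Each node trains locally, then every node adopts the average trained weight.

    Only the learned weights are combined; the (x, y) records never leave
    the node that holds them.
    """
    trained = [local_update(w, d) for w, d in zip(weights, node_data)]
    merged = sum(trained) / len(trained)
    return [merged] * len(trained)


# Hypothetical private datasets held by three institutions.
# The underlying relationship in this made-up data is y = 3x.
node_data = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5)],
    [(3.0, 9.0)],
]

weights = [0.0, 0.0, 0.0]
for _ in range(50):
    weights = swarm_round(weights, node_data)
```

<p>After repeated rounds, every node in this sketch holds the same weight, learned from all three datasets, even though no node has seen another node’s records.</p>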
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/537570/original/file-20230714-29-ahjkkr.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/537570/original/file-20230714-29-ahjkkr.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/537570/original/file-20230714-29-ahjkkr.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/537570/original/file-20230714-29-ahjkkr.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/537570/original/file-20230714-29-ahjkkr.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/537570/original/file-20230714-29-ahjkkr.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/537570/original/file-20230714-29-ahjkkr.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/537570/original/file-20230714-29-ahjkkr.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Swarms of drones may soon populate the battlefield.</span>
<span class="attribution"><span class="source">Shutterstock / Andy Dean Photography</span></span>
</figcaption>
</figure>
<h2>Swarms and warfare</h2>
<p>Drone technology is increasingly used in front-line combat, in recent times most notably by <a href="https://edition.cnn.com/2023/06/03/europe/ukraine-secretive-drone-program-russia-war-intl/index.html">Ukrainian forces</a> in the <a href="https://www.cfr.org/global-conflict-tracker/conflict/conflict-ukraine">ongoing Russia-Ukraine conflict</a>. However, as it stands, conventional drone technology requires <a href="https://www.airuniversity.af.edu/Portals/10/ASOR/Journals/Volume-1_Number-4/Lowther.pdf">one-to-one supervision</a>. </p>
<p><a href="https://www.army.mod.uk/news-and-events/news/2022/09/british-army-carries-out-successful-swarming-drone-capability/">Current defence research</a> aims to facilitate communication between drones, allowing one controller to operate swarms of drones. The development of such technology promises to vastly improve the <a href="https://www.military.africa/2023/06/drone-swarm-technology-an-overview/">scalability</a>, <a href="https://cdnsciencepub.com/doi/10.1139/juvs-2018-0009">reconnaissance</a> and <a href="https://www.eurasiantimes.com/edited-drone-swarms-controlling-drone-swarms-pentagon/">striking</a> capabilities of combat drones by allowing for continuous information relay within groups of drones. </p>
<p>As research delves deeper into swarming, we find a world where collective action creates complexity, adaptability, and efficiency. As technology evolves, the role of swarm intelligence is set to grow, intertwining our world with the fascinating dynamics of swarms.</p>
<p class="fine-print"><em><span>Samuel Johnson receives funding from Biotechnology and Biological Sciences Research Council (BBSRC). </span></em></p>Research into swarming in nature is transforming healthcare, gambling and the military.Samuel Johnson, DPhil Candidate in Mathematical Biology, University of OxfordLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2073942023-06-21T12:05:26Z2023-06-21T12:05:26ZAnnouncing The Conversation’s new investigative unit – we’re looking for collaborators in academia<figure><img src="https://images.theconversation.com/files/532145/original/file-20230615-27-m4cp5x.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Kurt Eichenwald, left, The Conversation's investigative editor, and Georgia State professor David Maimon working.</span> <span class="attribution"><span class="source">The Conversation</span>, <a class="license" href="http://creativecommons.org/licenses/by-nd/4.0/">CC BY-ND</a></span></figcaption></figure><p>Today we published our <a href="https://theconversation.com/us/investigations/mailbox-robberies-drop-accounts-checkwashing-fraud-gangs-of-fullz">first story</a> from The Conversation’s investigative unit, a significant expansion of our mission to ensure expert knowledge reaches the widest public audience possible.</p>
<p>Our incredible editorial team already works with academics every day to publish research news and explanatory journalism. These stories cover topics from space to politics to the economy to the environment. And because our content is free for all to read and republish, it reaches about 18 million people each month thanks to our partnerships with hundreds of news outlets. </p>
<p>But two things have happened since The Conversation started in the U.S. over eight years ago. First, our editing team has taken stock of the deep investigative research so many academics do. Often those research projects are in the public interest – but never reach a broad audience. Second, investigative journalism across the country has declined precipitously as news outlets consolidate, close and lay off experienced – and expensive – watchdog reporters.</p>
<p>Academics have deep knowledge of complex topics. They bring rigorous methodologies and peer-reviewed research to their specialties that even the best reporters at most media outlets do not possess. They’re focused on a wide range of topics that today’s smaller newsrooms are not staffed to cover and may not even be aware of. </p>
<p>Journalists know how to find a narrative and ethically talk to real people, and they have the platform and editing ability to reach the public at large. Bringing academics and journalists together can help stem the decline of important beat and investigative journalism nationally – but also locally. This is particularly true in specialized beats like genetics and business that increasingly intersect with people’s lives but receive scant attention from the media. </p>
<p>Certainly, this is not a brand-new idea. Top-rate news outlets have sometimes used academics in more rigorous ways than merely quoting them in stories. Yet those efforts have mostly been one-offs and not scaled, because it would take a bridge between academia and journalism to make it happen. We believe The Conversation is that bridge.</p>
<p>Thanks to support from Arnold Ventures, we have been able to make this a reality. We have hired <a href="https://kurteichenwald.com/">Kurt Eichenwald</a> as our inaugural senior investigative editor. Kurt is a New York Times bestselling author of six nonfiction books and a longtime investigative reporter at The New York Times and several other national outlets.</p>
<p>Here is what Kurt wrote about how our first investigation was conducted: </p>
<p><em>The <a href="https://theconversation.com/us/investigations/mailbox-robberies-drop-accounts-checkwashing-fraud-gangs-of-fullz">investigation Heists Worth Billions</a> is a collaboration between The Conversation U.S. and <a href="https://ebcs.gsu.edu/">Georgia State University’s Evidence-Based Cybersecurity Research Group</a>, directed by professor <a href="https://ebcs.gsu.edu/profile/david-maimon-2/">David Maimon</a>.</em></p>
<p><em>The research group develops techniques to improve cybersecurity by studying online criminal networks and observing underground markets. Two years ago, Maimon and his team <a href="https://theconversation.com/how-cybercriminals-turn-paper-checks-stolen-from-mailboxes-into-bitcoin-173796">saw a large number of stolen checks</a> flooding those markets. They then noticed the marketing of drop accounts – bank accounts created by using fictitious identities that money is “dropped” into – that can be used for check fraud. Criminals rapidly figured out that an array of frauds could be facilitated by drop accounts, and markets exploded with the necessary tools and instructions to perpetrate those scams</em>.</p>
<p><em>Building on the research group’s work, The Conversation investigated gangs who relied on, purchased or sold drop accounts, identities, checks and other materials to perpetrate their criminal activities. We reviewed thousands of pages of court records and government documents, obtained transcripts of wiretaps and other official investigative material, bank documents, and online communications between co-conspirators. In addition, we interviewed officials in law enforcement, government and the banking industry. And, to better understand how these crimes were committed, we also spoke with reformed fraudsters and hackers who had previously participated in drop account schemes.</em> </p>
<p><em>The investigation by Maimon’s group and The Conversation provides an unprecedented look into a vast, secret enterprise that has long stayed hidden in the darkest reaches of the internet, and it exposed the huge scale of financial losses suffered by the public because of this crime wave.</em></p>
<p>Expanding our partnerships with academics to include investigative topics is a natural evolution for The Conversation and a way for us to have a deeper impact than through the daily journalism we excel at. </p>
<p>And we want ideas. If you are an academic with an idea, please email <a href="mailto:investigations@theconversation.com">investigations@theconversation.com</a>. If you are a journalist with an idea that would benefit from data sets or deep academic knowledge, drop us a line. </p>
<p>This effort is new and experimental, and it comes with challenges. But given the deep knowledge locked in academia – and the talent of journalists to humanize data and research – I am convinced that the public will benefit from these collaborations.</p>
<p>We’d love to hear from you.</p>
<hr>
<p></p><div style="float:right;width:205px;">
<a href="https://theconversation.com/us/investigations/mailbox-robberies-drop-accounts-checkwashing-fraud-gangs-of-fullz"><img alt="Graphic showing a masked criminal on a stamp and saying 'Heists worth billions'" class="ls-is-cached lazyloaded" data-src="https://images.theconversation.com/files/532510/original/file-20230618-28-hh0pox.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=200&fit=clip" src="https://images.theconversation.com/files/532510/original/file-20230618-28-hh0pox.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=200&fit=clip"></a></div>
<em>This article relates to <strong><a href="https://theconversation.com/us/investigations/mailbox-robberies-drop-accounts-checkwashing-fraud-gangs-of-fullz">Heists Worth Billions</a></strong>, an investigation from The Conversation that found criminal gangs using sham bank accounts and secret online marketplaces to steal from almost anyone – and uncovered just how little is being done to combat the fraud.</em><p></p>
<p>• <strong><a href="https://theconversation.com/how-to-protect-yourself-from-drop-account-fraud-tips-from-our-investigative-unit-206840">How to protect yourself from drop account fraud – tips from our investigative unit</a>.</strong></p>
<p>• <strong><a href="https://theconversation.com/behind-the-scenes-of-the-investigation-heists-worth-billions-207158">Behind the scenes of the investigation</a></strong></p><img src="https://counter.theconversation.com/content/207394/count.gif" alt="The Conversation" width="1" height="1" />
Why The Conversation U.S. started an investigative unit.Beth Daley, Executive Editor and General ManagerLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2063572023-06-04T11:19:06Z2023-06-04T11:19:06ZAI clones made from user data pose uncanny risks<figure><img src="https://images.theconversation.com/files/529395/original/file-20230531-19-st714q.jpg?ixlib=rb-1.1.0&rect=0%2C0%2C4000%2C2664&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Personal data can be used to create an AI that can mimic a user's behaviour.</span> <span class="attribution"><span class="source">(Shutterstock)</span></span></figcaption></figure><p>Imagine, if you will, a digital doppelgänger. A clone that looks, talks and behaves just like you, created from the depths of artificial intelligence, reflecting your every mannerism with eerie precision. As thrilling as it might sound, how would you feel about it?</p>
<p>Our research at the University of British Columbia turns the spotlight onto this very question. With advancements in deep-learning technologies such as <a href="https://www.forbes.com/sites/lutzfinger/2022/09/08/overview-of-how-to-create-deepfakesits-scarily-simple/">interactive deepfake applications</a>, <a href="https://www.vulture.com/article/ai-singers-drake-the-weeknd-voice-clones.html">voice conversion</a> and <a href="https://www.theguardian.com/film/2020/may/25/are-virtual-actors-about-to-put-hollywoods-humans-out-of-work-miquela">virtual actors</a>, it’s possible to digitally replicate an individual’s appearance and behaviour. </p>
<p>This mirror image of an individual created by artificial intelligence is referred to as an “AI clone.” Our study dives into the murky waters of <a href="https://doi.org/10.1145/3579524">what these AI clones could mean for our self-perception, relationships and society</a>. We identified three types of risks posed by AI replicas: doppelgänger-phobia, identity fragmentation and living memories.</p>
<h2>Cloning AI</h2>
<p>We defined AI clones as digital representations of individuals, designed to reflect one or more aspects of the real-world “source individual.”</p>
<p>Unlike fictitious characters in digital environments, these AI clones are based on existing people, potentially mimicking their visual likeness, conversational mannerisms, or behavioural patterns. The depth of replication can vary greatly, from replicating certain distinct features to creating a near-perfect digital twin.</p>
<p>AI clones are also interactive technologies, designed to interpret user and environmental input, conduct internal processing and produce perceptible output. And crucially, these are AI-based technologies built on personal data. </p>
<p>As the volume of personal data we generate continues to grow, so too does the fidelity of these AI clones in replicating our behaviour.</p>
<h2>Fears, fragments and false memories</h2>
<p>We presented 20 participants with eight speculative scenarios involving AI clones. The participants, who were diverse in age and background, reflected on their emotions and the potential impacts on their self-perception and relationships.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/529443/original/file-20230531-21818-ikzgfl.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="a row of identical men with a barcode tattoo on their necks" src="https://images.theconversation.com/files/529443/original/file-20230531-21818-ikzgfl.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/529443/original/file-20230531-21818-ikzgfl.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=399&fit=crop&dpr=1 600w, https://images.theconversation.com/files/529443/original/file-20230531-21818-ikzgfl.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=399&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/529443/original/file-20230531-21818-ikzgfl.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=399&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/529443/original/file-20230531-21818-ikzgfl.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=501&fit=crop&dpr=1 754w, https://images.theconversation.com/files/529443/original/file-20230531-21818-ikzgfl.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=501&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/529443/original/file-20230531-21818-ikzgfl.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=501&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Vast amounts of user-generated data can be used to create AI clones.</span>
<span class="attribution"><span class="source">(Shutterstock)</span></span>
</figcaption>
</figure>
<p>First, we found that doppelgänger-phobia was a fear not only of the AI clone itself, but also of its potential misuse. Participants worried that their digital counterparts could exploit and displace their identity.</p>
<p>Secondly, there was the threat of identity fragmentation. The creation of replicas threatens the unique individuality of the person being cloned, causing a disturbance to their cohesive self-perception. In other words, people worry that they might lose parts of their uniqueness and individuality in the replication process.</p>
<p>Lastly, participants expressed concerns about what we described as “living memories.” This relates to the danger posed when a person interacts with a clone of someone they have an existing relationship with. Participants worried that it could lead to a misrepresentation of the individual, or that they would develop an over-attachment to the clone, altering the dynamics of interpersonal relationships.</p>
<h2>Preserving human values</h2>
<p>The development and deployment of AI clones carry profound implications. Our study not only contributes insights to the critical dialogue on ethical AI but also proposes a new framework for AI clone design that prioritizes identity and authenticity. </p>
<p>The onus lies with all stakeholders — including designers, developers, policymakers and end-users — to navigate this uncharted territory responsibly. This involves conscientiously considering moderation and user-generated data expiration strategies to prevent misuse and over-reliance.</p>
<p>Further, it’s imperative to recognize that the implications of AI clone technologies on personal identity and interpersonal relationships represent just the tip of the iceberg. As we continue to tread the delicate path of this burgeoning field, our study findings can serve as a compass guiding us to prioritize ethical considerations and human values above all.</p><img src="https://counter.theconversation.com/content/206357/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Dongwook Yoon receives funding from Korea Institute of Science and Technology and NSERC.</span></em></p>User-generated data can be used to build AI clones that can sound and behave like the source individual.Dongwook Yoon, Assistant Professor, Computer Science, University of British ColumbiaLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2030072023-04-16T07:19:07Z2023-04-16T07:19:07ZGhana’s fishing industry has a ‘golden seaweed’ problem - how citizen science can help<figure><img src="https://images.theconversation.com/files/518702/original/file-20230331-26-htrwcm.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">The Western shores of Ghana are struggling with a seaweed influx</span> <span class="attribution"><span class="source">Prosper Amihere</span></span></figcaption></figure><p><a href="https://www.sciencedirect.com/topics/agricultural-and-biological-sciences/sargassum">Sargassum</a> is a genus of brown seaweed. Over <a href="https://www.sciencedirect.com/topics/agricultural-and-biological-sciences/sargassum">300 species</a> are distributed across the world in both temperate and tropical climates. The species <em>fluitans</em> and <em>natans</em> are unique because they spend their life cycle floating on the ocean, never attaching to the sea floor. <a href="https://www.researchgate.net/publication/299450775_The_protection_and_management_of_the_Sargasso_Sea_The_golden_floating_rainforest_of_the_Atlantic_Ocean_Summary_Science_and_Supporting_Evidence_Case">Other</a> seaweed species reproduce and begin life on the ocean floor.</p>
<p>Pelagic (open sea) sargassum has been <a href="https://www.researchgate.net/publication/299450775_The_protection_and_management_of_the_Sargasso_Sea_The_golden_floating_rainforest_of_the_Atlantic_Ocean_Summary_Science_and_Supporting_Evidence_Case">described</a> as the “golden rainforest of the ocean” because of the floating ecosystem it supports in the Sargasso Sea, in the western Atlantic Ocean. Pelagic sargassum also occurs naturally in the <a href="https://www.ingentaconnect.com/content/umrsmas/bullmar/2004/00000074/00000001/art00007">Gulf of Mexico</a> and the <a href="https://www.sciencedirect.com/science/article/abs/pii/S002209811400121X?via%3Dihub">Caribbean</a>.</p>
<p>Floating sargassum first began arriving en masse on shores across the tropical Atlantic in 2011. Up to <a href="https://www.sciencedirect.com/science/article/abs/pii/S2211926421000072">10,000 tonnes</a> arrived daily during a particularly severe peak season. Severe years since then include 2015, 2018 and 2022 – but every year there is a significant influx. In the Caribbean, there has been good progress in understanding the pelagic sargassum seaweed. We now <a href="https://link.springer.com/article/10.1007/s10236-022-01511-1">have a better idea</a> of where it’s coming from: likely a new southern area of growth.</p>
<p>In 2009 the <a href="https://www.researchgate.net/publication/308751900_Preliminary_investigation_into_the_chemical_composition_of_the_invasive_brown_seaweed_Sargassum_along_the_West_Coast_of_Ghana">first reports emerged</a> of pelagic sargassum sightings off the coast of Ghana. Densities have increased annually ever since. In early March 2023, large quantities again arrived on the shores of the Western Region of the country. </p>
<p>Pelagic sargassum is beneficial in lots of ways. Marine species such as eels, white marlin and dolphin fish depend on it for spawning grounds in the Sargasso Sea. Commercial fish species including tuna depend on it for food.</p>
<p>But problems arise when large quantities accumulate near and on the shorelines of coastal communities. Algal and seaweed blooms are becoming more common in seas and oceans worldwide, both far offshore and nearshore. There is only <a href="https://www.ipcc.ch/srocc/chapter/technical-summary/">limited evidence</a> of a link between pelagic sargassum blooms and climate change, but warming oceans do seem to be one cause of the <a href="https://www.ipcc.ch/srocc/chapter/summary-for-policymakers/">rise in other harmful algal blooms</a> in coastal areas.</p>
<p>The pelagic sargassum off Ghana’s coast is affecting communities’ ability to fish and use their beaches. </p>
<h2>Importance of fishing in Ghana</h2>
<p>More than <a href="https://www.sciencedirect.com/science/article/abs/pii/S0308597X06000492">60% of Ghana’s citizens</a> live within 200km of the coast and 42% within 100km. The artisanal or small-scale fisheries sector <a href="https://www.fao.org/ghana/news/detail-events/en/c/1401751/">employs an estimated</a> 80% of the country’s fishers. </p>
<p>Around 2.4 million people, about 10% of the population, work in the fisheries sector. Small-scale fisheries contribute about <a href="https://faolex.fao.org/docs/pdf/gha178892.pdf">4.5% to Ghana’s gross domestic product (GDP)</a>. The coastal regions of the country are particularly dependent on fisheries for their livelihoods. </p>
<p>Marine fisheries are the primary source of income for more than <a href="https://www.tandfonline.com/doi/abs/10.1080/23308249.2014.962687">200 coastal villages</a>, including about 200,000 fishers with approximately 2 million dependants.</p>
<h2>Impacts of pelagic sargassum on fishing communities</h2>
<p>In a recent <a href="https://www.researchsquare.com/article/rs-1861970/v1">study</a> we assessed the impact of pelagic sargassum on the livelihoods of fishers on Ghana’s coast. Through group discussions, surveys, field observations and photographs, we documented the experiences of fishers. Most (70%) of those we spoke to across three sites in the region – Sanzule, Beyin and Newtown – depended on fishing for their sustenance and livelihood. </p>
<p>The seaweed had significantly affected the livelihoods of fishing-dependent communities in the Western Region. Pelagic sargassum had reduced their fish catch by getting tangled in nets. Seaweed, rather than fish, made up most of the catch. </p>
<p>Pelagic sargassum also inhibits fishing by:</p>
<ul>
<li><p>breaking and filling nets</p></li>
<li><p>clogging outboard motors on boats </p></li>
<li><p>creating seaweed mats that are impossible to navigate boats through</p></li>
<li><p>causing skin irritations </p></li>
<li><p>causing unbearable discomfort from the smell. </p></li>
</ul>
<p>These initial results highlight the urgency of finding ways to manage pelagic sargassum in western Africa. But to achieve this, we also need more data and an improved understanding of what is happening.</p>
<h2>Solutions</h2>
<p>To identify solutions, it is important to know what types of seaweed are arriving, where they originate, how they can be used and how to monitor them. It is possible that the answers are the same for West Africa as for the Caribbean. But this is an assumption. Very little is known about pelagic sargassum in West Africa.</p>
<p>What we do know, as scientists, is that answering some of these questions for places like Ghana might be even trickier than it was for the Caribbean. </p>
<p>Take <a href="https://www.frontiersin.org/articles/10.3389/fmars.2022.914501/full">forecasting and early warning</a>, for example. These processes rely on sufficient cloud-free satellite imagery in combination with an understanding of ocean processes and weather systems. That means detecting where the pelagic sargassum is at any given moment, in combination with ocean process models, to forecast where it will be later. </p>
<p>But West African coasts tend to have significant cloud cover. Methods that worked well in the Caribbean may not work in Ghana.</p>
<p><a href="https://www.sartrac.org/news/exchanging-sargassum-knowledge-in-the-western-region-of-ghana-january-2023/">Recently</a>, a team from universities in Ghana, the UK and Jamaica came together to explore how ground-based photography might create a useful dataset to better understand the seasonality and volumes of pelagic sargassum arriving in Ghana, using citizen science methods. </p>
<p><a href="https://education.nationalgeographic.org/resource/citizen-science/">Citizen science</a> recognises the important role that the public can play in research, and invites non-researchers to be part of data collection and analysis.</p>
<p>Citizen science is now applied worldwide for coastal monitoring but focuses almost exclusively on <a href="https://theconversation.com/rising-sea-levels-are-driving-faster-erosion-along-senegals-coast-182571">coastal erosion</a>. Coastal erosion work, such as the <a href="https://www.coastsnap.com/">CoastSnap platform</a>, documents how the physical structure of coastlines changes across days, months and years. The citizen science monitoring is achieved by installing a simple metal pole and some signage requesting that passersby take a quick photo with their mobile phone and share it online or via an app. </p>
<p>In our <a href="https://www.sartrac.org/news/exchanging-sargassum-knowledge-in-the-western-region-of-ghana-january-2023/">work</a>, we have come together with schools and community members from Beyin, Esiama and Sanzule in the western region of Ghana to apply CoastSnap to study pelagic sargassum. Together, we have installed three of these metal monitoring posts. Teachers and community members are now photographing the impacts that the seaweed has on people’s lives when it arrives. </p>
<p>Gradually, we will learn more about pelagic sargassum impacts and adaptation options in West Africa.</p><img src="https://counter.theconversation.com/content/203007/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Dr Sien van der Plank was part of the SARTRAC project that received funding from UKRI ESRC GCRF ES/T002964/1 and the University of Southampton UKRI ESRC IAA.</span></em></p><p class="fine-print"><em><span>Kwasi Addo Appeaning was part of the SARTRAC project that received funding from UKRI ESRC GCRF ES/T002964/1 and the University of Southampton UKRI ESRC IAA.</span></em></p><p class="fine-print"><em><span>Philip-Neri Jayson-Quashigah was part of the SARTRAC project that received funding from UKRI ESRC GCRF ES/T002964/1 and the University of Southampton UKRI ESRC IAA.</span></em></p><p class="fine-print"><em><span>Dr. Winnie N. A. Sowah was part of the SARTRAC project that received funding from UKRI ESRC GCRF ES/T002964/1 and the University of Southampton UKRI ESRC IAA. </span></em></p>The seaweed invasion of parts of the Ghanaian shoreline is affecting coastal inhabitants.Sien van der Plank, Senior Research Fellow, University of SouthamptonKwasi Addo Appeaning, Lecturer in Marine and Fisheries Sciences, University of GhanaPhilip-Neri Jayson-Quashigah, Research Fellow, University of GhanaWinnie N. A. 
Sowah, Lecturer, Department of Marine and Fisheries Sciences, University of GhanaLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2026432023-04-04T02:32:34Z2023-04-04T02:32:34ZThe environmental cost of data centres is substantial, and making them energy-efficient will only solve half the problem<figure><img src="https://images.theconversation.com/files/518382/original/file-20230330-29-mpih7w.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">shutterstock</span> </figcaption></figure><p>In 2022, Indonesia hosted around <a href="https://dataindonesia.id/digital/detail/apjii-pengguna-internet-indonesia-21563-juta-pada-20222023">215 million internet users</a>, who spent an average of <a href="https://www.kompas.com/edu/read/2022/05/31/103951971/berapa-lama-orang-indonesia-menggunakan-internet-setiap-hari?page=all">more than eight hours</a> on the internet every day. </p>
<p>These activities range from those with lower data traffic, such as using ride-hailing apps and sending emails, to heavier ones like video streaming and big data processing.</p>
<p>Data and the internet have made people’s lives easier, but we often dismiss their environmental cost. To store and manage digital information, we need massive facilities called data centres, which use a lot of energy and water to control their temperature.</p>
<p>Humans’ increasing dependence on data has caused a growing demand for data centres.</p>
<p>Indonesia currently houses <a href="https://web.pln.co.id/media/siaran-pers/2023/02/dukung-pengembangan-data-pln-siap-pasok-kebutuhan-listrik-ebt-ke-pusat-data-di-seluruh-indonesia">94 data centres</a> with leading names including China’s Alibaba Cloud, the United States’ Google Cloud and state-owned-enterprise PT Telkom Indonesia. They have a combined capacity of 727.1 megawatts.</p>
<p>As an illustration, a small one-megawatt data centre requires enough electricity to power <a href="https://theconversation.com/we-are-ignoring-the-true-cost-of-water-guzzling-data-centres-167750">1,000 houses and consumes around 26 million liters of water per year</a>.</p>
<p>The number is expected to rise by about <a href="https://industri.kontan.co.id/news/menakar-prospek-bisnis-data-center-di-indonesia">20% each year</a> due to the country’s growing digital activity.</p>
<p>The Indonesian government plans on constructing <a href="https://web.pln.co.id/media/siaran-pers/2023/02/dukung-pengembangan-data-pln-siap-pasok-kebutuhan-listrik-ebt-ke-pusat-data-di-seluruh-indonesia">four National Data Centers by 2026</a>, each with a capacity of up to 40 megawatts.</p>
<p>Additionally, demand for data centres may shift to Indonesia from Singapore – <a href="https://www.straitstimes.com/business/companies-markets/singapore-to-be-more-selective-of-data-centre-investments-for-sustainable-growth">the region’s digital powerhouse</a> – which is currently limiting further growth of data centres due to environmental sustainability concerns.</p>
<p>Indonesia’s capital Jakarta is now one of the <a href="https://app.dcbyte.com/knight-frank-data-centres-report/Q3-2022">fastest-growing data centre</a> hubs in the Asia Pacific, second only to Melbourne, Australia.</p>
<p>It is important for these data centres to adopt sustainable practices to reduce their environmental impact. </p>
<p>So, how can we create a digital regime that is more environmentally sustainable?</p>
<h2>Make them transparent and efficient</h2>
<p>To control and minimise the environmental cost of growing data centres, the Indonesian government and industry players need to create plans for more sustainable methods of operation and reporting. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/518388/original/file-20230330-600-i75ehz.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="(Google)" src="https://images.theconversation.com/files/518388/original/file-20230330-600-i75ehz.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/518388/original/file-20230330-600-i75ehz.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/518388/original/file-20230330-600-i75ehz.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/518388/original/file-20230330-600-i75ehz.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/518388/original/file-20230330-600-i75ehz.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/518388/original/file-20230330-600-i75ehz.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/518388/original/file-20230330-600-i75ehz.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Steam rises above the cooling towers in Google’s The Dalles data center in Oregon, US. These plumes of water vapor create a quiet mist at dusk.</span>
</figcaption>
</figure>
<p>Currently, the government only imposes <a href="https://jdih.esdm.go.id/peraturan/PP%20No.%2070%20Thn%202009.pdf">a mandatory reporting programme</a> on energy consumption for large energy consumers, those using about 70 gigawatt hours or more per year.</p>
<p>However, <a href="https://www.adb.org/sites/default/files/publication/236621/ino-data-center-market.pdf">there are no data centres</a> that meet this threshold. A hyperscale data centre consumes around <a href="https://datacentremagazine.com/articles/efficiency-to-loom-large-for-data-centre-industry-in-2023">20-50 megawatt hours annually</a>, far below the 70 gigawatt hours threshold. As a result, the actual energy consumption of individual data centres remains unknown.</p>
<p>To ensure transparency, data centre operators – regardless of how much energy they consume – should collaborate to systematically report energy consumption. </p>
<p>The public must also have access to the sector’s annual energy usage for greater accountability.</p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/it-takes-a-lot-of-energy-for-machines-to-learn-heres-why-ai-is-so-power-hungry-151825">It takes a lot of energy for machines to learn – here's why AI is so power-hungry</a>
</strong>
</em>
</p>
<hr>
<p>On top of this, the government should require the electricity usage of data centres to be efficient. </p>
<p>In Singapore, for instance, <a href="https://www.straitstimes.com/tech/singapore-pilots-new-scheme-to-grow-data-centre-capacity-with-green-targets">data centres are required</a> to have a ‘power usage effectiveness’ of 1.3 (the closer the value is to 1, the more efficient the centre).</p>
<p>The rule pushes operators to design and operate their data centres in the most energy-efficient way possible. This includes the usage of modern energy-efficient machines, which will significantly cut energy consumption and save costs in the long run.</p>
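<p>As a rough illustration of the metric, power usage effectiveness is conventionally calculated as a facility's total energy use divided by the energy used by its IT equipment alone. The short Python sketch below shows that arithmetic. The function name and the sample figures are hypothetical and are not drawn from any real data centre.</p>

```python
def power_usage_effectiveness(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Return PUE: total facility energy divided by IT equipment energy."""
    if it_equipment_kwh <= 0:
        raise ValueError("IT equipment energy must be positive")
    return total_facility_kwh / it_equipment_kwh


# Hypothetical annual figures, for illustration only.
pue = power_usage_effectiveness(total_facility_kwh=12_500_000, it_equipment_kwh=10_000_000)
print(f"PUE = {pue:.2f}")
print("Within a 1.3 target" if pue <= 1.3 else "Above a 1.3 target")
```

<p>In this made-up case, a quarter of the energy drawn on top of the IT load goes to cooling, lighting and other overheads, which is what a value of 1.25 expresses.</p>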
<p>The government should also encourage data centre operators to adopt more environmentally-sustainable energy sources. </p>
<p>Currently, several data centres in Indonesia have received Renewable Energy Certificates from Indonesia’s state-owned electricity company, PLN, as proof of their commitment to powering their facilities with renewable energy. </p>
<p>In the future, the government could encourage more companies to adopt a similar approach, for example by giving tax incentives for industry players using renewable energy. </p>
<p>In Europe, a group of data centre operators, including the US’ Amazon Web Services and Google, have committed to buy enough renewable energy to match <a href="https://www.climateneutraldatacentre.net/wp-content/uploads/2021/06/CNDCP-Policy-Paper_FINAL.pdf">75% of their total energy consumption by 2025 and 100% by 2030</a>.</p>
<h2>Behavioural adjustment in digital space is needed</h2>
<p>Making sure data centres run as efficiently as possible is still only half of the battle. </p>
<p>Emissions from data centres are not the only concern. There are also negative environmental impacts from the advanced use of data, many of which are difficult to measure. </p>
<p>For instance, there are environmental costs from increased consumerism due to the ability of big data and algorithms to flood users with the “right” advertisements, and to keep users engaged in social media and e-commerce platforms.</p>
<p>Efficient data centres may become susceptible to <a href="https://www.oecd-forum.org/posts/the-jevons-paradox-and-rebound-effect-are-we-implementing-the-right-energy-and-climate-change-policies">the “Jevons Paradox”</a>, where their optimised operations could encourage increased growth and resource consumption in the long term.</p>
<p>At the end of the day, technological innovations and efficiency alone cannot achieve sustainability – it has to be accompanied by a continuous behavioural adjustment in the digital space. Educating Indonesian users on the tangible impacts of their digital activities is an essential step.</p><img src="https://counter.theconversation.com/content/202643/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Tiola Allain does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no affiliations other than those mentioned above.</span></em></p>Indonesia must reduce data center energy use and promote digital responsibility to lessen environmental impact.Tiola Allain, Researcher, Center for Indonesian Policy StudiesLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/2029492023-04-02T11:46:46Z2023-04-02T11:46:46ZBest time to play Tim Hortons’ Roll up to Win? The middle of the night dramatically increases your odds<figure><img src="https://images.theconversation.com/files/518865/original/file-20230401-16-sk7jdj.jpg?ixlib=rb-1.1.0&rect=31%2C7%2C5145%2C3437&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">A professor of statistics has used game data from Tim Hortons Roll up to Win to figure out the best time to play.</span> <span class="attribution"><span class="source">(Shutterstock)</span></span></figcaption></figure><iframe style="width: 100%; height: 100px; border: none; position: relative; z-index: 1;" allowtransparency="" allow="clipboard-read; clipboard-write" src="https://narrations.ad-auris.com/widget/the-conversation-canada/best-time-to-play-tim-hortons--roll-up-to-win-the-middle-of-the-night-dramatically-increases-your-odds" width="100%" height="400"></iframe>
<p>Tim Hortons’ iconic Roll up the Rim contest began in 1985 and went largely unaltered for 35 years. The format was simple: buy a coffee, roll up the rim of the paper cup and see if you’ve won a prize. But this all changed in 2020.</p>
<p>Amid the emergence of a global pandemic, the game went digital. Buying Tim Hortons products still earned you entries to the contest, but these were now stored on the company’s loyalty app. It was then up to you when to play these so-called “digital rolls.” Because players no longer roll up an actual coffee cup rim, the contest is now called Roll up to Win.</p>
<p>Last week I made national news as “<a href="https://kitchener.ctvnews.ca/meet-the-ontario-stats-prof-who-claims-he-can-t-stop-beating-roll-up-to-win-1.6332975">the stats prof who cracked Roll up to Win</a>”. I boosted my odds in Tim Hortons’ annual coffee contest to 80 per cent and then shared my strategy with the nation. </p>
<p>My approach sounds simple — play when other people aren’t — but it took data, determination and drinking a lot of coffee to find the optimal approach.</p>
<p>Here’s the story of the statistics behind the headlines.</p>
<h2>Digital element changes odds</h2>
<p><a href="https://theconversation.com/heres-how-i-cracked-roll-up-the-rim-and-won-almost-every-time-136939">As I explained in 2020</a>, “digital rolls” introduced an element of strategy to the game. There’s one major trick to increasing your odds: play when other people aren’t.</p>
<p>So when are the fewest people playing?</p>
<p>On the surface this seems simple: play in the middle of the night when most Canadians are asleep. But in a country spanning six time zones, finding the single best time is a challenging calculation.</p>
<p>In previous contests I made an educated guess that 4:30 a.m. Eastern was the sweet spot: not too late and not too early. But an educated guess is still a guess, and if I wanted to find the true Goldilocks zone of free coffee I’d need data.</p>
<p>This year, that’s exactly what Tim Hortons gave me.</p>
<figure class="align-right ">
<img alt="A screenshot of the Tim Hortons app showing More than 2,520,293 prizes already awarded!" src="https://images.theconversation.com/files/518854/original/file-20230401-16-as76wk.png?ixlib=rb-1.1.0&q=45&auto=format&w=237&fit=clip" srcset="https://images.theconversation.com/files/518854/original/file-20230401-16-as76wk.png?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=1085&fit=crop&dpr=1 600w, https://images.theconversation.com/files/518854/original/file-20230401-16-as76wk.png?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=1085&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/518854/original/file-20230401-16-as76wk.png?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=1085&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/518854/original/file-20230401-16-as76wk.png?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=1364&fit=crop&dpr=1 754w, https://images.theconversation.com/files/518854/original/file-20230401-16-as76wk.png?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=1364&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/518854/original/file-20230401-16-as76wk.png?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=1364&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">A screenshot of the Tim Hortons app showing how many prizes have been awarded.</span>
<span class="attribution"><span class="source">Tim Hortons</span></span>
</figcaption>
</figure>
<h2>Getting data from the app</h2>
<p>When I logged in to the app on March 6, the first day of the contest, a large message grabbed my attention: “More than 308,619 prizes already awarded!” This is an enticement to play (so many winners already!), but it’s also a valuable piece of information.</p>
<p>I waited five minutes and refreshed the page. The message changed: “More than 309,949 prizes already awarded!” Another 1,330 prizes had been won.</p>
<p>This gave me an idea.</p>
<p>I periodically refreshed the page, logging the time and number of prizes awarded. My theory: the number of prizes won should correlate with the number of people playing. By tracking these data I could build a model of Roll up to Win player behaviour and, by extension, calculate exactly when I should play.</p>
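<p>As a rough illustration of that theory, the sketch below turns consecutive counter readings into a prizes-per-minute rate. The two counts are the ones quoted above; the clock times and the function name <code>prize_rates</code> are assumptions made for the example, not details from my logs.</p>

```python
from datetime import datetime


def prize_rates(readings):
    """Turn (timestamp, cumulative prize count) readings into
    (interval midpoint, prizes per minute) pairs."""
    rates = []
    for (t0, n0), (t1, n1) in zip(readings, readings[1:]):
        minutes = (t1 - t0).total_seconds() / 60
        midpoint = t0 + (t1 - t0) / 2
        rates.append((midpoint, (n1 - n0) / minutes))
    return rates


# Counts are from the two app messages; the 9:00 and 9:05 times are hypothetical.
readings = [
    (datetime(2023, 3, 6, 9, 0), 308_619),
    (datetime(2023, 3, 6, 9, 5), 309_949),
]

rates = prize_rates(readings)
for midpoint, per_minute in rates:
    print(midpoint.strftime("%H:%M:%S"), per_minute)
```

<p>With readings five minutes apart, the 1,330 extra prizes work out to 266 prizes per minute. Longer gaps between readings still give an average rate, just a blurrier one.</p>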
<p>Tracking online data is common in scientific research and often employs software to download information automatically. Automated procedures are usually against the rules of contests like this, however, and Roll up to Win was no exception. So I gathered the data manually.</p>
<p>Refreshing the page myself throughout the day — and night — I was able to approximately track the prizes. But I did have other things to do, so there were gaps in my logs. In statistical terms, I had what’s known as missing data.</p>
<figure class="align-center ">
<img alt="A graph showing total prizes awarded over time. The line is steepest during daytime hours, plateauing during the night. Red circles highlight areas with missing data." src="https://images.theconversation.com/files/518855/original/file-20230401-14-zc2kou.jpeg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/518855/original/file-20230401-14-zc2kou.jpeg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=346&fit=crop&dpr=1 600w, https://images.theconversation.com/files/518855/original/file-20230401-14-zc2kou.jpeg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=346&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/518855/original/file-20230401-14-zc2kou.jpeg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=346&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/518855/original/file-20230401-14-zc2kou.jpeg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=435&fit=crop&dpr=1 754w, https://images.theconversation.com/files/518855/original/file-20230401-14-zc2kou.jpeg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=435&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/518855/original/file-20230401-14-zc2kou.jpeg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=435&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">During this two-day period the number of prizes awarded continued to climb, with the win rate slowing during the nighttime hours. Gaps in the data can also be seen. All times Eastern.</span>
<span class="attribution"><span class="source">(author provided)</span></span>
</figcaption>
</figure>
<p>Missing data are common in real-world analysis. Examples include unreturned or incomplete surveys, patients missing medical appointments or even misplaced or corrupted data files.</p>
<h2>Statistical challenges</h2>
<p>Missing data can present statistical challenges depending on how, and why, we have gaps in our records. A patient might miss their appointment because they were too unwell to travel, or maybe just because their car wouldn’t start. These two scenarios provide different information, requiring different solutions.</p>
<p>My missing data problem was comparatively simple. My goal was to fill in the gaps that arose when I was sleeping, travelling or otherwise away from my keyboard.</p>
<p>Using the data I did have, I looked for patterns. The most prizes were being won between 9 a.m. and 1 p.m. Eastern, the fewest around 3 a.m. This repeated each day and I was able to use this to my advantage.</p>
<p>To map our mathematical models onto the real world, statisticians often make assumptions. I assumed that player behaviour patterns would be similar day-to-day. This was a fairly strong assumption — I had some evidence of a slightly later start on Sunday mornings — but it seemed a reasonable one for my problem.</p>
<h2>Weighting the data</h2>
<p>I could then combine each day’s data and employ <a href="https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/weighting-methods.pdf">a technique known as weighting</a>. Days where I had logged more observations were given more importance — or weight — in my calculations. I was then able to use statistical methods to “join the dots” and map out the overall shape of player behaviour.</p>
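<p>One simple form of weighting is a weighted average, sketched below for a single hour of the day. This is only an illustration of the idea, not the exact calculation I ran: the rates, the observation counts and the name <code>weighted_hourly_rate</code> are all hypothetical.</p>

```python
def weighted_hourly_rate(daily_estimates):
    """Combine per-day rate estimates for the same hour of the day.

    daily_estimates is a list of (rate, n_observations) pairs; days with
    more logged observations get proportionally more weight.
    """
    total_weight = sum(n for _, n in daily_estimates)
    if total_weight == 0:
        raise ValueError("no observations to combine")
    return sum(rate * n for rate, n in daily_estimates) / total_weight


# Hypothetical 3 a.m. estimates from three days: (prizes per minute, readings logged)
three_am = [(40.0, 6), (50.0, 2), (35.0, 4)]

combined = weighted_hourly_rate(three_am)
simple_mean = sum(rate for rate, _ in three_am) / len(three_am)
print(combined, simple_mean)
```

<p>The day with only two readings pulls the simple mean upward, while the weighted figure stays closer to the better-observed days.</p>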
<figure class="align-center ">
<img alt="" src="https://images.theconversation.com/files/518869/original/file-20230401-16-li9ktq.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/518869/original/file-20230401-16-li9ktq.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=341&fit=crop&dpr=1 600w, https://images.theconversation.com/files/518869/original/file-20230401-16-li9ktq.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=341&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/518869/original/file-20230401-16-li9ktq.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=341&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/518869/original/file-20230401-16-li9ktq.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=428&fit=crop&dpr=1 754w, https://images.theconversation.com/files/518869/original/file-20230401-16-li9ktq.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=428&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/518869/original/file-20230401-16-li9ktq.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=428&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Stats professor Michael Wallace has needed a lot of coffee to record his data on the Roll up to Win game.</span>
<span class="attribution"><span class="license">Author provided</span></span>
</figcaption>
</figure>
<p>With this, my educated guess of previous years became a data-driven estimate. The best time to play was 3:16 a.m. Eastern — over an hour earlier than I was playing in the past — and the worst was 11:46 a.m. There is always some statistical uncertainty in an analysis, but playing around these times should give you the highest and lowest chances of winning.</p>
<p>There was one last step: I had to test my results. My analysis was predicated on another assumption: that the number of available prizes was consistent through the day. Maybe fewer people were winning at 3 a.m. because there were fewer prizes, not fewer players. Luckily, this was an assumption I could test.</p>
<h2>3:16 a.m. is the golden hour</h2>
<p>I racked up 60 rolls and split them in half, playing 30 around the 3:16 a.m. mark and the rest just before lunchtime. I won 23 times in the early hours compared to just five times later on. No big prizes — mostly a lot of free coffee — but I got the result I was hoping for: statistically strong evidence that my theory was correct.</p>
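<p>For readers who want to check how strong that evidence is, here is one way to do it. The sketch assumes a one-sided Fisher’s exact test, which is a common choice for comparing two small groups of wins and losses; it is not necessarily the test I used, and the function name is made up for the example.</p>

```python
from math import comb


def fisher_one_sided(wins_a, n_a, wins_b, n_b):
    """One-sided Fisher's exact test.

    Returns the probability of seeing at least wins_a wins in group A
    if the wins were spread across both groups purely by chance.
    """
    total_wins = wins_a + wins_b
    total = n_a + n_b
    upper = min(total_wins, n_a)
    favourable = sum(
        comb(total_wins, k) * comb(total - total_wins, n_a - k)
        for k in range(wins_a, upper + 1)
    )
    return favourable / comb(total, n_a)


# 23 wins from 30 early-morning rolls versus 5 wins from 30 late-morning rolls
p_value = fisher_one_sided(23, 30, 5, 30)
print(f"one-sided p-value: {p_value:.2e}")
```

<p>The resulting p-value is far below 0.001, so a split this lopsided would be very unlikely if the time of day made no difference.</p>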
<p>I reached out to a local journalist who’d worked with me in the past. I thought this might be a fun little story about applying statistics to the real world, with a hint of local colour because I’m a University of Waterloo professor. Then things snowballed. By the end of the week I’d appeared on countless radio stations and even nationally broadcast television shows including CTV’s Your Morning and CBC’s The National.</p>
<figure>
<iframe width="440" height="260" src="https://www.youtube.com/embed/LGCfTZAOZMI?wmode=transparent&start=0" frameborder="0" allowfullscreen=""></iframe>
</figure>
<p>While the interviews were a great opportunity to showcase how stats can be more than just equations in a textbook, many outlets spotted a potential flaw. If everyone starts playing at 3:16 a.m., won’t the strategy change?</p>
<p>The short answer is yes, and it illustrates <a href="https://plato.stanford.edu/entries/game-theory/">a concept from another area of study: game theory</a>. Sometimes when everyone knows the best strategy it can turn into the worst strategy.</p>
<p>That said, I don’t think everyone will be getting up in the middle of the night to win a free coffee, so it should remain a good time to play. </p>
<p>I’ll be up in the early hours tracking the data for the last week of the contest — all entries must be played by April 9 — to see if the strategy needs updating. It’s a few more late nights, but I think I have enough caffeine for that.</p>
<p class="fine-print"><em><span>Michael Wallace does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>
<p>Statistics have many real-world applications — including what’s the best time to play Tim Hortons’ Roll up to Win contest. A stats prof explains how he found the precise time with the best odds.</p>
<p>Michael Wallace, Associate Professor, Department of Statistics and Actuarial Science, University of Waterloo</p>
<p>Licensed as Creative Commons – attribution, no derivatives.</p>
<h1>Can the heat from running computers help grow our food? It’s complicated</h1>
<p>Published 2023-03-21T20:28:16Z</p>
<p>Digital technologies are changing how food is produced. And it’s more than <a href="https://doi.org/10.1016/j.compag.2022.106879">harvesting robots</a> that are arriving on the scene. Companies are now pairing data centres with greenhouses, capturing the <a href="https://substance.etsmtl.ca/en/heating-greenhouse-with-data-centre-waste-heat">heat emitted by computing hardware and reusing it to grow crops indoors</a>. </p>
<p>The new <a href="https://www.qscale.com/">QScale</a> data centre development in Lévis, Que. is one such project. The company claims that it will “<a href="https://datacentremagazine.com/data-centres/spotlight-qscale-bringing-green-growth-quebec">produce 2,800 tonnes of small fruit and more than 80,000 tonnes of tomatoes per year</a>” in greenhouses to be constructed adjacent to the facility. </p>
<p>In promotional campaigns, QScale taps into growing public interest in making food systems more local amid <a href="https://theconversation.com/inflation-bites-how-rising-food-costs-affect-nutrition-and-health-196048">supply chain disruptions</a> and rising grocery costs.</p>
<p>As social scientists researching the environmental footprint of digital technologies, we’re interested in the potential benefits and drawbacks of this emerging connection. </p>
<h2>Data centres coming in hot</h2>
<p>Every time we access content online — whether it is a video or the latest social media post — it is sent to our device by a different computer, usually located <a href="https://theconversation.com/the-factories-of-the-past-are-turning-into-the-data-centers-of-the-future-70033">in a large data centre</a>. Also known as a “server farm,” a data centre is typically a warehouse-like building that hosts hundreds of computer servers that store, process and transmit big swaths of data. </p>
<p>Data centres are increasingly criticized for their carbon footprint. The majority of emissions result from <a href="https://doi.org/10.1109/HPCA51647.2021.00076">manufacturing the hardware</a> they use. Servers also run day and night, continuously <a href="https://doi.org/10.1038/d41586-018-06610-y">consuming energy</a> and <a href="https://doi.org/10.1038/492174a">emitting heat</a>. Backup generators guarantee uninterrupted data flow. </p>
<figure class="align-center ">
<img alt="electric connection grid at a data centre" src="https://images.theconversation.com/files/515081/original/file-20230314-16-3noh3b.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/515081/original/file-20230314-16-3noh3b.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=338&fit=crop&dpr=1 600w, https://images.theconversation.com/files/515081/original/file-20230314-16-3noh3b.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=338&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/515081/original/file-20230314-16-3noh3b.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=338&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/515081/original/file-20230314-16-3noh3b.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=424&fit=crop&dpr=1 754w, https://images.theconversation.com/files/515081/original/file-20230314-16-3noh3b.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=424&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/515081/original/file-20230314-16-3noh3b.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=424&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Servers in data centres run day and night, continuously consuming energy and emitting heat.</span>
<span class="attribution"><span class="source">(Shutterstock)</span></span>
</figcaption>
</figure>
<p>Temperature and humidity levels must be constantly <a href="https://www.akcp.com/blog/data-center-temperature-guidelines/">monitored and controlled</a> for the hardware to function efficiently and reliably. Data centres also have high <a href="https://dgtlinfra.com/data-center-water-usage/">water demands</a> for cooling purposes, so they are especially <a href="https://www.nbcnews.com/tech/internet/drought-stricken-communities-push-back-against-data-centers-n1271344">contentious in dry areas</a>. </p>
<p>To bring energy consumption and costs down, data centre operators are increasingly looking to locate their facilities in regions with a <a href="https://www.technologyreview.com/2019/06/18/134902/icelands-data-centers-are-booming-heres-why-thats-a-problem/">cold climate</a>, which often also provide access to <a href="https://www.energymonitor.ai/tech/energy-efficiency/canada-the-best-country-for-energy-efficient-data-centres/">low-priced hydropower</a> — both are part of <a href="https://www.qscale.com/campuses/sustainability">QScale’s sustainability strategy</a>.</p>
<p>In addition, the industry is now viewing <a href="https://www.datacenterdynamics.com/en/analysis/waste-heat-warms/">“waste heat” as a valuable resource</a> and opportunity to increase its sustainability score. Existing examples of heat recycling from data centres include heating <a href="https://www.reuters.com/business/sustainable-business/microsoft-data-centres-heat-finnish-homes-cutting-emissions-2022-03-17/">residential buildings</a> and <a href="https://www.networkworld.com/article/2277915/swimming-pool-heated-by-data-center-s-excess-heat.html">swimming pools</a>. Now, so-called “<a href="https://doi.org/10.1016/j.rineng.2019.100063">organic data centres</a>” propose to leverage waste heat for food production. </p>
<h2>Agricultural land re-zoned for data centres</h2>
<p>QScale’s Lévis data centre is an $867 million development, financed by both <a href="https://ici.radio-canada.ca/nouvelle/1808698/centre-traitement-donnes-haute-intensite-levis-qscale-investissement-867-millions">public and private capital</a>. The Québec provincial government acts as both investor and shareholder. </p>
<p>The government’s investment in QScale serves two strategic goals: supporting the province’s status as a <a href="https://www.investquebec.com/international/en/press-room/news/Quebec-A-global-hub-of-artificial-intelligence.html">hub for artificial intelligence</a> (which relies on data centre services and is especially energy intensive) and doubling the volume of <a href="https://www.quebec.ca/nouvelles/actualites/details/tout-le-quebec-sinvestit-quebec-veut-doubler-le-volume-de-culture-en-serre-dici-5-ans">greenhouse food production by 2025</a>.</p>
<p>For QScale, pairing the data centre with greenhouses is important to position itself in the public debate as <a href="https://www.journaldemontreal.com/2021/06/07/qscale-les-milliards-du-mariage-agriculture-techno">“greener” and locally owned</a> in opposition to the multinational competition. </p>
<p>For instance, <a href="https://ici.radio-canada.ca/nouvelle/1792024/google-centre-donnees-informatiques-beauharnois-terres-agricoles-quebec">Google’s new data centre</a> development in Beauharnois near Montréal will reportedly not include heat recycling and is also built on land originally zoned for agriculture, which is highly controversial.</p>
<p>When new buildings cover valuable agricultural land, they <a href="https://doi.org/10.3390/land11060840">seal soil</a> — a vital resource for long-term food sufficiency that is <a href="https://www.cbc.ca/news/canada/toronto/ont-farmland-loss-1.6493833">already shrinking</a> due to rezoning for urban sprawl. Soil sealing means that fertile land is covered by impermeable materials like concrete. </p>
<p>The Québec government’s intervention to rezone the land slated for Google’s data centre was <a href="https://ici.radio-canada.ca/nouvelle/1792024/google-centre-donnees-informatiques-beauharnois-terres-agricoles-quebec">heavily criticized</a> by Québec’s farmers’ union, the <em>Union des producteurs agricoles</em>. The union’s spokesperson pointed out that the cultivable <a href="https://www.equiterre.org/en/articles/news-dossier-agricultural-rezoning-in-quebec">agricultural area is only two per cent</a> of the province’s territory. </p>
<p>In QScale’s case, the city of Lévis purchased farmland located next to the data centre development. This land is slated to be re-sold to QScale or other parties to develop potential greenhouses. Through its envisioned heat recuperation for indoor agriculture, QScale aims to <a href="https://www.journaldemontreal.com/2021/06/07/qscale-les-milliards-du-mariage-agriculture-techno">contribute to local food autonomy</a>. Can this promise hold up?</p>
<h2>Are greenhouses green?</h2>
<p>Due to short growing seasons, Canada relies heavily on <a href="https://agriculture.canada.ca/en/sector/horticulture/reports/statistical-overview-canadian-fruit-industry-2021#a2.3">imported fruits and vegetables</a>, especially in the winter. This dependence became clear to the public when the COVID-19 pandemic <a href="https://doi.org/10.1080/03066150.2020.1823838">disrupted supply chains</a> and highlighted the fragility of the global food system. </p>
<p>Climate change and extreme weather events pose additional challenges, which was especially <a href="https://agriculture.canada.ca/en/sector/horticulture/reports/statistical-overview-canadian-fruit-industry-2021">evident in 2021</a> when a <a href="https://theconversation.com/what-is-a-heat-dome-an-atmospheric-scientist-explains-the-weather-phenomenon-baking-california-and-the-west-185569">heat dome</a> formed over British Columbia and <a href="https://theconversation.com/how-an-atmospheric-river-drenched-british-columbia-and-led-to-floods-and-mudslides-172021">devastating floods</a> followed later that year. </p>
<p>Taking crop production out of the fields and into indoor controlled-environment agriculture (CEA) could make the <a href="https://doi.org/10.3390/agronomy11061229">domestic food system more resilient</a> and ensure year-round access to fresh produce in Canada. Potential environmental benefits include reduced emissions from transportation and refrigeration, as well as <a href="https://doi.org/10.3390/atmos13081258">more efficient land and water use</a> and reduced reliance on agrochemical inputs. </p>
<p>However, CEA systems have high energy demands to control the <a href="https://theconversation.com/food-security-vertical-farming-sounds-fantastic-until-you-consider-its-energy-use-102657">temperature, humidity and lighting conditions</a> all year round. For example, leafy vegetable vertical farms with artificial lighting <a href="https://doi.org/10.1016/j.scitotenv.2021.150621">consume 100 times more</a> energy than those with natural sunlight. </p>
<p>Depending on the <a href="https://www.nytimes.com/2022/06/21/opinion/environment/climate-change-greenhouses-drought-indoor-farming.html">energy source</a> of the local grid, the greenhouse gas emissions from CEA can outweigh its benefits. The <a href="https://doi.org/10.1177/1178622121995819">variety of crops</a> produced is relatively small, meaning that it cannot fully cover the nutritional needs of a local population. </p>
<p>The economic sustainability of CEA is also <a href="https://doi.org/10.1007/978-3-030-34065-0_2">open to question</a>. It relies on <a href="https://www.fastcompany.com/90824702/vertical-farming-failing-profitable-appharvest-aerofarms-bowery">venture capital</a> investment that is currently drying up and a tech-start-up business model that may not be feasible for food production in the long run. </p>
<h2>Who will tend to the data centre-greenhouse crops?</h2>
<p>As it stands, agriculture in Canada and <a href="https://theconversation.com/australia-is-creating-an-underclass-of-exploited-farm-workers-unable-to-speak-up-177063">elsewhere</a> relies on the <a href="https://theconversation.com/the-cruel-trade-off-at-your-local-produce-aisle-90083">low-paid, precarious work</a> of seasonal migrants who are barred from unionizing and frequently face <a href="https://theconversation.com/migrant-farm-workers-vulnerable-to-sexual-violence-95839">abuse</a>. </p>
<figure class="align-center ">
<img alt="workers working in a greenhouse" src="https://images.theconversation.com/files/516289/original/file-20230320-24-igv467.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/516289/original/file-20230320-24-igv467.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=399&fit=crop&dpr=1 600w, https://images.theconversation.com/files/516289/original/file-20230320-24-igv467.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=399&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/516289/original/file-20230320-24-igv467.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=399&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/516289/original/file-20230320-24-igv467.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=502&fit=crop&dpr=1 754w, https://images.theconversation.com/files/516289/original/file-20230320-24-igv467.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=502&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/516289/original/file-20230320-24-igv467.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=502&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Governments must enforce labour standards, perform spontaneous inspections without prior notification of employers and ensure that workers know their rights.</span>
<span class="attribution"><span class="source">(Shutterstock)</span></span>
</figcaption>
</figure>
<p>Conditions in the greenhouse industry are <a href="https://thenarwhal.ca/covid-19-migrant-farmworkers/">not necessarily better</a>. In 2021, temporary workers at Serres Demers, Québec’s largest greenhouse operator and <a href="https://www.lapresse.ca/affaires/entreprises/2021-06-16/qscale-discute-avec-les-serres-demers/des-serres-pourraient-etre-chauffees-par-des-centres-de-donnees.php">potential partner for QScale</a>, denounced unsanitary, crowded and dilapidated <a href="https://ici.radio-canada.ca/recit-numerique/2458/serres-demers-hebergement-travailleurs-etrangers-tomates">housing conditions</a>. </p>
<p>While <a href="https://ici.radio-canada.ca/nouvelle/1822170/logements-travaillers-etrangers-temporaires-renovations-tomates">this situation has reportedly improved</a> since it made media headlines, labour struggles for farm workers in greenhouses and fields persist. </p>
<p><em>Illusion Emploi</em>, an advocacy organization for non-unionized workers in Québec, states that the problems at Serres Demers are <a href="https://www.ledevoir.com/opinion/libre-opinion/607482/libre-opinion-le-cas-des-serres-demers-n-est-pas-unique">representative of widespread labour issues</a> in the industry. The organization implores the government to take action by enforcing labour standards, performing spontaneous inspections without prior notification of employers and ensuring that workers know their rights. </p>
<h2>Complex implications</h2>
<p>The benefits of integrating digital infrastructure and agriculture are not as clear-cut as their promoters suggest. </p>
<p>While recycling heat from data centres and thereby easing energy demands of greenhouses is certainly better than letting it go to waste, the <a href="https://commonplace.knowledgefutures.org/pub/jpy7pbq0/release/1">complex implications</a> of these two newly merging industries must not be overlooked. </p>
<p>If the continuing expansion of digital infrastructures is legitimized by adding greenhouses into the mix, it could conceal other issues at stake including the significant environmental and social impacts of hardware manufacturing, land use and labour.</p>
<p class="fine-print"><em><span>Janna Frenzel receives funding from the Social Sciences and Humanities Research Council (SSHRC) of Canada and Concordia University.</span></em></p><p class="fine-print"><em><span>Sarah-Louise Ruder receives funding from the Social Sciences and Humanities Research Council (SSHRC) of Canada, the University of British Columbia's Public Scholars Initiative, and Future Skills Centre Canada.</span></em></p>
<p>While recuperating heat from data centres to ease greenhouse energy demands is better than letting it go to waste, we must not overlook the complex implications of these two newly merging industries.</p>
<p>Janna Frenzel, PhD candidate in Communication Studies, Concordia University</p>
<p>Sarah-Louise Ruder, PhD Candidate at the Institute for Resources, Environment and Sustainability, University of British Columbia</p>
<p>Licensed as Creative Commons – attribution, no derivatives.</p>
<h1>How to use free satellite data to monitor natural disasters and environmental changes</h1>
<p>Published 2023-03-14T12:24:44Z</p>
<figure><img src="https://images.theconversation.com/files/514756/original/file-20230310-14-ffq8d1.jpg?ixlib=rb-1.1.0&rect=851%2C251%2C1541%2C1099&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Over 8,000 satellites are orbiting Earth today, capturing images like this, of the Louisiana coast.</span> <span class="attribution"><a class="source" href="https://earthobservatory.nasa.gov/world-of-change/WaxLake">NASA Earth Observatory</a></span></figcaption></figure><p>If you want to track changes in the Amazon rainforest, see the full expanse of a hurricane or figure out where people need help after a disaster, it’s much easier to do with the view from a satellite orbiting a <a href="https://aerospace.csis.org/aerospace101/earth-orbit-101">few hundred miles above Earth</a>.</p>
<p>Traditionally, access to satellite data has been limited to researchers and professionals with expertise in remote sensing and image processing. However, the increasing availability of open-access data from government satellites such as <a href="https://landsat.gsfc.nasa.gov/">Landsat</a> and <a href="https://sentinels.copernicus.eu/">Sentinel</a>, and of free cloud-computing resources such as <a href="https://aws.amazon.com/earth/">Amazon Web Services</a>, <a href="https://earthengine.google.com/">Google Earth Engine</a> and <a href="https://planetarycomputer.microsoft.com/">Microsoft Planetary Computer</a>, has made it possible for just about anyone to gain insight into environmental changes underway. </p>
<p>I <a href="https://wetlands.io/">work with geospatial big data</a> as a professor. Here’s a quick tour of where you can find satellite images, plus some free, fairly simple tools that anyone can use to create time-lapse animations from satellite images.</p>
<p>For example, state and urban planners – or people considering a new home – can watch over time <a href="https://images.theconversation.com/files/508816/original/file-20230208-16-ktgkpl.gif">how rivers have moved</a>, how construction has crept into wildland areas or how <a href="https://images.theconversation.com/files/508818/original/file-20230208-15-lbcw9x.gif">a coastline has eroded</a>.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/508816/original/file-20230208-16-ktgkpl.gif?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="A squiggly river moves surprisingly quickly over time." src="https://images.theconversation.com/files/508816/original/file-20230208-16-ktgkpl.gif?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/508816/original/file-20230208-16-ktgkpl.gif?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=352&fit=crop&dpr=1 600w, https://images.theconversation.com/files/508816/original/file-20230208-16-ktgkpl.gif?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=352&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/508816/original/file-20230208-16-ktgkpl.gif?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=352&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/508816/original/file-20230208-16-ktgkpl.gif?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=442&fit=crop&dpr=1 754w, https://images.theconversation.com/files/508816/original/file-20230208-16-ktgkpl.gif?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=442&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/508816/original/file-20230208-16-ktgkpl.gif?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=442&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Landsat time-lapse animations show the river dynamics in Pucallpa, Peru.</span>
<span class="attribution"><span class="source">Qiusheng Wu, NASA Landsat</span></span>
</figcaption>
</figure>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/508818/original/file-20230208-15-lbcw9x.gif?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="Animation shows the shoreline shrinking." src="https://images.theconversation.com/files/508818/original/file-20230208-15-lbcw9x.gif?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/508818/original/file-20230208-15-lbcw9x.gif?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=349&fit=crop&dpr=1 600w, https://images.theconversation.com/files/508818/original/file-20230208-15-lbcw9x.gif?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=349&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/508818/original/file-20230208-15-lbcw9x.gif?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=349&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/508818/original/file-20230208-15-lbcw9x.gif?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=439&fit=crop&dpr=1 754w, https://images.theconversation.com/files/508818/original/file-20230208-15-lbcw9x.gif?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=439&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/508818/original/file-20230208-15-lbcw9x.gif?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=439&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">A Landsat time-lapse shows the shoreline retreat in the Parc Natural del Delta, Spain.</span>
<span class="attribution"><span class="source">Qiusheng Wu, NASA Landsat</span></span>
</figcaption>
</figure>
<p>Environmental groups can monitor deforestation, the effects of climate change on ecosystems, and how other human activities like irrigation are <a href="https://images.theconversation.com/files/508817/original/file-20230208-23-o026h9.gif">shrinking bodies of water</a> like <a href="https://earthobservatory.nasa.gov/world-of-change/AralSea">Central Asia’s Aral Sea</a>. And disaster managers, aid groups, scientists and anyone interested can monitor natural disasters such as <a href="https://images.theconversation.com/files/508821/original/file-20230208-16-151a1t.gif">volcanic eruptions</a> and <a href="https://images.theconversation.com/files/508822/original/file-20230208-14-3xtadg.gif">wildfires</a>.</p>
<figure class="align-center ">
<img alt="The lake, created by damming the river, has shrunk over time." src="https://images.theconversation.com/files/514741/original/file-20230310-142-kyqos5.gif?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/514741/original/file-20230310-142-kyqos5.gif?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=367&fit=crop&dpr=1 600w, https://images.theconversation.com/files/514741/original/file-20230310-142-kyqos5.gif?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=367&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/514741/original/file-20230310-142-kyqos5.gif?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=367&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/514741/original/file-20230310-142-kyqos5.gif?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=461&fit=crop&dpr=1 754w, https://images.theconversation.com/files/514741/original/file-20230310-142-kyqos5.gif?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=461&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/514741/original/file-20230310-142-kyqos5.gif?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=461&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">GOES images show the decline of the crucial Colorado River reservoir Lake Mead since the 1980s and the growth of neighboring Las Vegas.</span>
<span class="attribution"><span class="source">Qiusheng Wu, NOAA GOES</span></span>
</figcaption>
</figure>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/508821/original/file-20230208-16-151a1t.gif?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="A volcanic eruption bursts into view." src="https://images.theconversation.com/files/508821/original/file-20230208-16-151a1t.gif?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/508821/original/file-20230208-16-151a1t.gif?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=352&fit=crop&dpr=1 600w, https://images.theconversation.com/files/508821/original/file-20230208-16-151a1t.gif?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=352&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/508821/original/file-20230208-16-151a1t.gif?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=352&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/508821/original/file-20230208-16-151a1t.gif?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=442&fit=crop&dpr=1 754w, https://images.theconversation.com/files/508821/original/file-20230208-16-151a1t.gif?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=442&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/508821/original/file-20230208-16-151a1t.gif?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=442&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">A GOES satellite time-lapse shows the Hunga Tonga volcanic eruption on Jan. 15, 2022.</span>
<span class="attribution"><span class="source">Qiusheng Wu, NOAA GOES</span></span>
</figcaption>
</figure>
<h2>Putting Landsat and Sentinel to work</h2>
<p>There are over <a href="https://www.geospatialworld.net/prime/business-and-industry-trends/how-many-satellites-orbiting-earth">8,000 satellites orbiting the Earth</a> today. You can see a live map of them at <a href="https://www.keeptrack.space/">keeptrack.space</a>.</p>
<p>Some transmit and receive radio signals for communications. Others provide global positioning system (GPS) services for navigation. The ones we’re interested in are Earth observation satellites, which collect images of the Earth, day and night.</p>
<p><strong>Landsat:</strong> The longest-running Earth satellite mission, <a href="https://landsat.gsfc.nasa.gov/">Landsat</a>, has been collecting imagery of the Earth since 1972. The latest satellite in the series, <a href="https://www.usgs.gov/landsat-missions/landsat-9">Landsat 9</a>, was launched by NASA in September 2021.</p>
<p>In general, Landsat satellite data has a spatial resolution of about 100 feet (about 30 meters). If you think of pixels on a zoomed-in photo, each pixel covers an area of about 100 feet by 100 feet on the ground. Each Landsat satellite has a temporal resolution of 16 days, meaning it images the same location on Earth approximately once every 16 days. With both Landsat 8 and 9 in orbit, we get global coverage of the Earth <a href="https://www.mdpi.com/1424-8220/20/22/6631">once every eight days</a>. That shorter interval makes comparisons over time easier.</p>
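<p>The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the 10-kilometer square is an arbitrary example, and the revisit calculation assumes the satellites' orbits are evenly staggered, which is the case for Landsat 8 and 9.</p>

```python
import math

METERS_PER_FOOT = 0.3048  # exact, by definition of the international foot


def meters_to_feet(meters: float) -> float:
    """Convert a ground distance from meters to feet."""
    return meters / METERS_PER_FOOT


def pixels_to_cover(width_m: float, height_m: float, pixel_size_m: float) -> int:
    """Number of square pixels needed to tile a width x height rectangle."""
    cols = math.ceil(width_m / pixel_size_m)
    rows = math.ceil(height_m / pixel_size_m)
    return cols * rows


def combined_revisit_days(single_revisit_days: float, satellites: int) -> float:
    """Revisit interval for several identical satellites.

    Assumes the orbits are evenly staggered, so each extra satellite
    divides the waiting time between images equally.
    """
    return single_revisit_days / satellites


if __name__ == "__main__":
    print(f"30 m pixel edge in feet: {meters_to_feet(30):.1f}")
    print(f"Pixels in a 10 km x 10 km square: {pixels_to_cover(10_000, 10_000, 30):,}")
    print(f"Landsat 8 + 9 revisit in days: {combined_revisit_days(16, 2):g}")
```

<p>The same helpers work for any other sensor if you swap in its pixel size and revisit interval.</p>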
<p><a href="https://www.usgs.gov/landsat-missions/landsat-data-access">Landsat data</a> has been freely available to the public since 2008. During the <a href="https://en.wikipedia.org/wiki/2022_Pakistan_floods">Pakistan flood of 2022</a>, scientists used Landsat data and free cloud-computing resources to determine the flood extent and <a href="https://share.gishub.org/pakistan_floods/">estimate the total flooded area</a>.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/508723/original/file-20230207-31-kvunlf.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="Images show how the flood covered about a third of Pakistan." src="https://images.theconversation.com/files/508723/original/file-20230207-31-kvunlf.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/508723/original/file-20230207-31-kvunlf.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=390&fit=crop&dpr=1 600w, https://images.theconversation.com/files/508723/original/file-20230207-31-kvunlf.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=390&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/508723/original/file-20230207-31-kvunlf.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=390&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/508723/original/file-20230207-31-kvunlf.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=489&fit=crop&dpr=1 754w, https://images.theconversation.com/files/508723/original/file-20230207-31-kvunlf.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=489&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/508723/original/file-20230207-31-kvunlf.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=489&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Landsat satellite images show a side-by-side comparison of southern Pakistan in August 2021 (left, one year before the floods) and August 2022 (right).</span>
<span class="attribution"><span class="source">Qiusheng Wu, NASA Landsat</span></span>
</figcaption>
</figure>
<p><strong>Sentinel:</strong> <a href="https://sentinels.copernicus.eu/">Sentinel</a> Earth observation satellites were launched by the European Space Agency (ESA) as part of the <a href="https://www.copernicus.eu/en">Copernicus program</a>. Sentinel-2 satellites have been collecting optical imagery of the Earth since 2015 at a spatial resolution of 10 meters (33 feet) and a temporal resolution of 10 days.</p>
<p><strong>GOES:</strong> The images you’ll see most often in U.S. weather forecasting come from NOAA’s Geostationary Operational Environmental Satellites, or <a href="https://www.goes.noaa.gov/">GOES</a>. They orbit above the equator at the <a href="https://www.nesdis.noaa.gov/current-satellite-missions/currently-flying/geostationary-satellites">same speed Earth rotates</a>, so they can provide continuous monitoring of Earth’s atmosphere and surface, giving detailed information on weather, climate, and other environmental conditions. <a href="https://www.goes-r.gov/multimedia/dataAndImageryImagesGoes-16.html">GOES-16</a> and <a href="https://www.goes-r.gov/multimedia/dataAndImageryImagesGoes-17.html">GOES-17</a> can image the Earth at a spatial resolution of about 1.2 miles (2 kilometers) and a temporal resolution of five to 10 minutes.</p>
<figure class="align-center ">
<img alt="Animation showing swirling clouds off the coast and the long river of moisture headed for California." src="https://images.theconversation.com/files/514739/original/file-20230310-142-1ln3m3.gif?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/514739/original/file-20230310-142-1ln3m3.gif?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=353&fit=crop&dpr=1 600w, https://images.theconversation.com/files/514739/original/file-20230310-142-1ln3m3.gif?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=353&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/514739/original/file-20230310-142-1ln3m3.gif?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=353&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/514739/original/file-20230310-142-1ln3m3.gif?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=444&fit=crop&dpr=1 754w, https://images.theconversation.com/files/514739/original/file-20230310-142-1ln3m3.gif?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=444&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/514739/original/file-20230310-142-1ln3m3.gif?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=444&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">A GOES satellite time-lapse shows an atmospheric river arriving on the West Coast in 2021.</span>
<span class="attribution"><span class="source">Qiusheng Wu, GOES</span></span>
</figcaption>
</figure>
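<p>To see how these trade-offs play out for a time-lapse, the resolutions quoted above can be put side by side. The sketch below is a rough comparison, not an official specification: the <code>Sensor</code> class and <code>max_frames</code> function are hypothetical names, GOES is entered at the slower end of its five-to-10-minute range, and the frame counts are upper bounds that ignore clouds and data gaps.</p>

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Sensor:
    name: str
    pixel_size_m: float  # spatial resolution on the ground
    revisit: timedelta   # temporal resolution


# Values taken from the figures quoted in the text above.
SENSORS = [
    Sensor("Landsat 8 + 9", 30, timedelta(days=8)),
    Sensor("Sentinel-2", 10, timedelta(days=10)),
    Sensor("GOES-16/17", 2000, timedelta(minutes=10)),  # slower end of 5-10 min
]


def max_frames(sensor: Sensor, period: timedelta) -> int:
    """Upper bound on images of one spot during `period`.

    Real archives contain fewer usable frames because of cloud cover.
    """
    return period // sensor.revisit


if __name__ == "__main__":
    one_year = timedelta(days=365)
    for sensor in SENSORS:
        print(f"{sensor.name}: {sensor.pixel_size_m} m pixels, "
              f"up to {max_frames(sensor, one_year)} frames per year")
```

<p>The pattern is the usual one in remote sensing: sharper pixels come with fewer frames, so slow changes such as shoreline retreat suit Landsat or Sentinel, while fast events such as storms suit GOES.</p>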
<h2>How to create your own visualizations</h2>
<p>In the past, creating a Landsat time-lapse animation of a specific area required extensive data processing skills and several hours or even days of work. Today, free and user-friendly programs let anyone create animations with just a few clicks in a web browser.</p>
<p>For instance, I created an <a href="https://huggingface.co/spaces/giswqs/Streamlit">interactive web app</a> for my students that anyone can use to generate time-lapse animations quickly. The user zooms in on the map to find an area of interest, then draws a rectangle around it and saves the rectangle as a GeoJSON file, which contains the geographic coordinates of the chosen region. Next, the user uploads the GeoJSON file to the web app, chooses the satellite and the date range, and submits the request. The app then takes about 60 seconds to produce a time-lapse animation.</p>
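<p>For readers curious about what such a GeoJSON file contains, here is a minimal Python sketch that writes a rectangle by hand instead of drawing it on the map. The function name <code>bbox_to_geojson</code> and the file name <code>roi.geojson</code> are hypothetical, the coordinates are only a rough box around Pucallpa, Peru, and whether the web app accepts a file built this way is an assumption.</p>

```python
import json


def bbox_to_geojson(west: float, south: float, east: float, north: float) -> dict:
    """Build a GeoJSON FeatureCollection holding one rectangular polygon.

    Coordinates are in decimal degrees, longitude first, as GeoJSON requires.
    Boxes that cross the antimeridian are not handled in this sketch.
    """
    if not (west < east and south < north):
        raise ValueError("expected west < east and south < north")

    # Exterior ring, counterclockwise, closed by repeating the first point.
    ring = [
        [west, south],
        [east, south],
        [east, north],
        [west, north],
        [west, south],
    ]
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {},
                "geometry": {"type": "Polygon", "coordinates": [ring]},
            }
        ],
    }


if __name__ == "__main__":
    # Illustrative bounding box only; adjust to your own area of interest.
    roi = bbox_to_geojson(west=-74.70, south=-8.50, east=-74.40, north=-8.25)
    with open("roi.geojson", "w", encoding="utf-8") as f:
        json.dump(roi, f, indent=2)
```

<p>Drawing the rectangle on the map produces the same kind of file; the sketch simply makes the coordinates explicit.</p>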
<figure>
<iframe width="440" height="260" src="https://www.youtube.com/embed/VVRK_-dEjR4?wmode=transparent&start=0" frameborder="0" allowfullscreen=""></iframe>
<figcaption><span class="caption">How to create satellite time-lapse animations.</span></figcaption>
</figure>
<p>Several other tools also make it easy to create satellite animations. They include <a href="https://jdbcode.github.io/Snazzy-EE-TS-GIF/">Snazzy-EE-TS-GIF</a>, an Earth Engine app for creating Landsat animations, and <a href="https://planetarycomputer.microsoft.com/docs/overview/explorer/">Planetary Computer Explorer</a>, a tool for searching and visualizing satellite imagery interactively.</p>
<p class="fine-print"><em><span>Qiusheng Wu receives funding from NASA. He is an Amazon Visiting Academic and a Google Developer Expert (GDE) for Earth Engine. </span></em></p>
Time-lapse animations that once took days to create are now easy to build with publicly available satellite images and free online tools.
Qiusheng Wu, Assistant Professor of Geography and Sustainability, University of Tennessee
Licensed as Creative Commons – attribution, no derivatives.