<p>DNA data storage – The Conversation</p>
<h1>DNA ‘Lite-Brite’ is a promising way to archive data for decades or longer</h1>
<p>2021-05-10T12:31:27Z</p>
<figure><img src="https://images.theconversation.com/files/399089/original/file-20210505-13-w8uydd.jpg?ixlib=rb-1.1.0&rect=0%2C0%2C3072%2C2041&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">A simple two-dimensional grid can convey a lot of information – whether making pictures with Lite-Brite or storing data in DNA.</span> <span class="attribution"><a class="source" href="https://flickr.com/photos/52076395@N00/3801567473/">Justin Day/Flickr</a>, <a class="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA</a></span></figcaption></figure><p><em>The <a href="https://theconversation.com/us/topics/research-brief-83231">Research Brief</a> is a short take about interesting academic work.</em></p>
<h2>The big idea</h2>
<p>We and our colleagues have developed a way to store data using pegs and pegboards made out of DNA and to retrieve the data with a microscope – a molecular version of the <a href="https://shop.hasbro.com/en-us/product/lite-brite-ultimate-classic:A0579FDA-BDE1-4888-840A-1862576A318E">Lite-Brite</a> toy. Our prototype stores information in patterns using DNA strands spaced about 10 nanometers apart. Ten nanometers is more than a thousand times smaller than the diameter of a human hair and about 100 times smaller than the diameter of a bacterium.</p>
<p>We tested our <a href="https://doi.org/10.1038/s41467-021-22277-y">digital nucleic acid memory</a> (dNAM) by storing the statement “Data is in our DNA!\n.” We described the research in a paper published in the journal Nature Communications on April 22, 2021.</p>
<p>Previous methods for retrieving data stored in DNA require the DNA to be sequenced. Sequencing is the process of <a href="https://www.genome.gov/about-genomics/fact-sheets/DNA-Sequencing-Fact-Sheet">reading the genetic code of strands of DNA</a>. Though it is a powerful tool in medicine and biology, it wasn’t designed with DNA memory in mind.</p>
<p>Our approach uses a microscope to read the data optically. Because the DNA pegs are positioned closer than half the wavelength of visible light, we used <a href="https://www.sciencemag.org/features/2016/05/superresolution-microscopy">super-resolution microscopy</a>, which circumvents the <a href="https://courses.lumenlearning.com/physics/chapter/27-6-limits-of-resolution-the-rayleigh-criterion/">diffraction limit</a> of light. This provides a way to read the encoded data without sequencing the DNA.</p>
<figure class="align-center ">
<img alt="Three columns and three rows of dots against a dark background" src="https://images.theconversation.com/files/399090/original/file-20210505-15-1moc0w7.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/399090/original/file-20210505-15-1moc0w7.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=488&fit=crop&dpr=1 600w, https://images.theconversation.com/files/399090/original/file-20210505-15-1moc0w7.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=488&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/399090/original/file-20210505-15-1moc0w7.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=488&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/399090/original/file-20210505-15-1moc0w7.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=613&fit=crop&dpr=1 754w, https://images.theconversation.com/files/399090/original/file-20210505-15-1moc0w7.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=613&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/399090/original/file-20210505-15-1moc0w7.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=613&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Digital nucleic acid memory (dNAM) uses light-emitting DNA strands to read the data optically rather than requiring sequencing. The left column shows patterns designed for encoding data, the middle column shows optical readout of data stored in DNA using super-resolution microscopy, and the right column shows Atomic Force Microscope images of the DNA nanostructures. Each 6 x 8 pegboard is roughly 70 x 90 nanometers.</span>
<span class="attribution"><span class="source">Nucleic Acid Memory Institute at Boise State University</span>, <a class="license" href="http://creativecommons.org/licenses/by-nd/4.0/">CC BY-ND</a></span>
</figcaption>
</figure>
<p>The patterns of DNA strands – the pegs – light up when fluorescently labeled DNA strands bind to them. Because the fluorescent strands are short, they rapidly bind and unbind. This causes them to blink, making it easier to separate one peg from another and read the stored information. We use the fluorescent patterns of each pegboard as a code to store chunks of data.</p>
<p>The microscope can image hundreds of thousands of the DNA pegs in a single recording, and our error-correction algorithms ensure we recover all of the data. After accounting for the bits used by the algorithms, our prototype was able to read data at a density of 330 gigabits per square centimeter.</p>
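<p>As a rough check on that figure, the geometry given in the figure caption (a 6 x 8 pegboard measuring roughly 70 x 90 nanometers) can be turned into an upper bound on density. The sketch below assumes, for illustration only, that every peg position holds exactly one bit and that pegboards tile a surface with no gaps. The function name and both assumptions are illustrative rather than details taken from the paper.</p>

```python
NM_PER_CM = 1e7  # 1 centimeter = 10,000,000 nanometers


def raw_density_gbit_per_cm2(rows, cols, width_nm, height_nm):
    """Upper-bound areal density, assuming one bit per peg and gap-free tiling."""
    bits = rows * cols
    area_cm2 = (width_nm / NM_PER_CM) * (height_nm / NM_PER_CM)
    return bits / area_cm2 / 1e9  # bits per cm^2 -> gigabits per cm^2


# Pegboard geometry from the figure caption: 6 x 8 pegs, roughly 70 x 90 nm.
raw = raw_density_gbit_per_cm2(rows=6, cols=8, width_nm=70, height_nm=90)
reported = 330  # gigabits per square centimeter, as stated in the article

print(f"raw upper bound: {raw:.0f} Gbit/cm^2")
print(f"reported / raw:  {reported / raw:.2f}")
```

<p>Under these assumptions the raw bound comes out higher than the reported 330 gigabits per square centimeter, which is consistent with some bits being set aside for the error-correction algorithms.</p>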
<h2>Why it matters</h2>
<p>You’re not likely to have a DNA storage device in your phone or computer, at least not anytime soon. DNA data storage is promising for archival storage – storing large amounts of information for long periods of time. DNA can store a lot of information in a small space. It would be possible to <a href="https://doi.org/10.1038/nmat4594">store every tweet, email, photo, song, movie and book ever created</a> in a volume equivalent to a jewelry box. And data stored in DNA could last for centuries, given that the biomolecule has a <a href="https://www.nature.com/news/dna-has-a-521-year-half-life-1.11555">half-life of over 500 years</a>.</p>
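<p>To make the half-life figure concrete, the fraction of intact material after a given time follows simple exponential decay. The sketch below uses the 521-year half-life named in the linked report and assumes decay at a constant rate. Real-world survival depends heavily on storage conditions, so the numbers are illustrative only.</p>

```python
HALF_LIFE_YEARS = 521  # value taken from the linked report's title


def fraction_remaining(years, half_life=HALF_LIFE_YEARS):
    """Fraction of the original material left after `years` of constant-rate decay."""
    return 0.5 ** (years / half_life)


for years in (100, 521, 1000):
    print(f"after {years:>4} years: {fraction_remaining(years):.1%} remaining")
```

<p>By definition, exactly half of the material remains after one half-life. Shorter archival periods lose proportionally less.</p>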
<h2>What other research is being done</h2>
<p>Researchers have been <a href="https://www.scientificamerican.com/article/dna-data-storage-is-closer-than-you-think/">developing methods of storing data in DNA</a> for several decades. Those methods involve the design and synthesis of unique strings of information made from the DNA nucleotides adenine (A), thymine (T), cytosine (C) and guanine (G). This information is recovered by reading the strings using sequencing technology.</p>
<h2>What’s next</h2>
<p>From here, our goal is to increase the amount of data that we can store in dNAM, decrease the amount of time it takes to write and read the data, and use the technique to encrypt data.</p>
<p class="fine-print"><em><span>Will Hughes receives funding from the National Science Foundation, the Semiconductor Research Corporation, and the State of Idaho.</span></em></p><p class="fine-print"><em><span>George Dickinson receives funding from the National Science Foundation, the Semiconductor Research Corporation, and the State of Idaho.</span></em></p><p class="fine-print"><em><span>Luca Piantanida receives funding from the National Science Foundation, the Semiconductor Research Corporation, and the State of Idaho.</span></em></p>
<p>DNA has been storing vast amounts of biological information for billions of years. Researchers are working to harness DNA for archiving data. A new method uses light to simplify the process.</p>
<p>Will Hughes, Professor of Materials Science & Engineering, Boise State University; George David Dickinson, Post-Doctoral Research Scientist in Materials Science and Engineering, Boise State University; Luca Piantanida, Post-Doctoral Research Scientist in Materials Science and Engineering, Boise State University</p>
<p>Licensed as Creative Commons – attribution, no derivatives.</p>
<h1>A Jane Austen quote encoded in plastic molecules demonstrates the potential for a new kind of data storage</h1>
<p>2021-04-22T15:03:00Z</p>
<figure><img src="https://images.theconversation.com/files/396510/original/file-20210422-24-1266hgn.jpg?ixlib=rb-1.1.0&rect=194%2C126%2C6198%2C3201&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">
</span> <span class="attribution"><a class="source" href="https://www.shutterstock.com/image-illustration/data-transmission-channel-motion-digital-flow-777221539">Shutterstock/spainter_vfx</a></span></figcaption></figure><p>The words “if one scheme of happiness fails, human nature turns to another” were originally published in 1814 in Jane Austen’s Mansfield Park. At the time, the words were printed using revolutionary steam-powered printers that could roll through over a thousand sheets of paper an hour.</p>
<p>Since the early 2000s, it’s been possible to read all of <a href="https://theconversation.com/jane-austen-200-years-on-why-we-still-love-her-heroes-heroines-and-houses-80451">Jane Austen’s works</a> online, including Mansfield Park. But as of this year, the list of places her words are published has had a bizarre addition.</p>
<p>In <a href="http://dx.doi.org/10.1016/j.xcrp.2021.100393">a new study</a>, a team from the University of Texas at Austin has encoded a quote from Mansfield Park on a tiny plastic molecule. The researchers hope the study will help prove the viability of a new kind of technology for storing data.</p>
<p>Archiving has always been a problem. Even the most carefully stored and protected copies of Mansfield Park’s original print run are showing their age, with the ink fading and the paper crinkling. </p>
<p>We produce <a href="https://theconversation.com/digital-hoarders-weve-identified-four-types-which-are-you-153111">more data</a> than ever. <a href="https://techjury.net/blog/how-much-data-is-created-every-day/#gref">Current estimates</a> put it at 1.145 trillion megabytes of data a day – if someone attempted to download all of it using current internet speeds it would take almost two billion years.</p>
<p>But the vast data centres we currently use to store data – largely using magnetic tape – are not up to the job. Even though hardware and software are constantly evolving, the demand for faster processing power and <a href="https://theconversation.com/nevens-law-why-it-might-be-too-soon-for-a-moores-law-for-quantum-computers-120706">smaller components</a> means a lack of effective storage is creating a bottleneck, and the current system cannot keep up with demand. </p>
<p>The search is on for smaller, more stable and efficient alternatives to digital hard drives. Recent research interest has fallen on <a href="https://theconversation.com/the-libraries-of-the-future-will-be-made-of-dna-86274">DNA data storage</a> – the idea we could use the building blocks of life, the system nature spent millions of years evolving to encode the blueprint for our species, as a means of storing and reading our own history and knowledge. When one scheme of technology fails, human nature turns to another.</p>
<figure class="align-center ">
<img alt="Computer generated image of DNA and circuits." src="https://images.theconversation.com/files/396514/original/file-20210422-23-1rmtjyz.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/396514/original/file-20210422-23-1rmtjyz.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/396514/original/file-20210422-23-1rmtjyz.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/396514/original/file-20210422-23-1rmtjyz.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/396514/original/file-20210422-23-1rmtjyz.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/396514/original/file-20210422-23-1rmtjyz.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/396514/original/file-20210422-23-1rmtjyz.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Researchers are looking into DNA data storage.</span>
<span class="attribution"><a class="source" href="https://www.shutterstock.com/image-illustration/dna-circuits-board-concept-bioinformatics-data-1295361625">Shutterstock/CI Photos</a></span>
</figcaption>
</figure>
<p>As a molecule, DNA lasts a long time – 500,000 years if stored correctly – outstripping paper and ink’s potential lifetime by several orders of magnitude. But it must be kept sterile and needs careful handling. This can make storing information using DNA expensive.</p>
<p>But there’s another class of materials known to last even longer than DNA. These synthetic products, discovered a century ago, have stability, ease of manufacturing and storage potential that far outstrip DNA. These plastics, or more specifically, polymers, are long-chain molecules that can most easily be described as containing multiple repeat units – each known as a monomer. </p>
<p>Researchers calculate the four base pairs – pairs of DNA building blocks – can store 10¹⁹ bits of information per cubic metre. But when we use polymers, we have more than four building blocks to choose from. In fact, there are as many monomer choices as you can locate commercially, so there’s potential to increase the information density exponentially.</p>
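<p>One way to quantify the effect of a larger set of building blocks: a chain position that can hold any one of N distinct units carries at most log2(N) bits. The sketch below compares DNA’s four-letter alphabet with a 16-unit alphabet, the number of monomers the Texas team used. The function names are illustrative, and the calculation is an idealized count that ignores any units reserved for indexing or error checking.</p>

```python
import math


def bits_per_unit(alphabet_size):
    """Maximum information carried by one chain position, in bits."""
    return math.log2(alphabet_size)


def units_needed(n_bits, alphabet_size):
    """Smallest chain length that can represent n_bits with the given alphabet."""
    return math.ceil(n_bits / bits_per_unit(alphabet_size))


for name, size in (("DNA letters", 4), ("16 monomers", 16)):
    print(f"{name}: {bits_per_unit(size):.0f} bits per unit, "
          f"{units_needed(8, size)} units per 8-bit character")
```

<p>Under this idealized count, a 16-unit alphabet halves the chain length needed to hold the same data as a four-letter one.</p>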
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/the-libraries-of-the-future-will-be-made-of-dna-86274">The libraries of the future will be made of DNA</a>
</strong>
</em>
</p>
<hr>
<p>For their monomers, or building blocks, the team in Texas used sixteen different amino alcohols. Stitching these together, they created eighteen longer molecules, called oligomers, each made up of individual monomers. Within the longer molecules, combinations of monomers corresponded to specific letters, with cheaper monomers corresponding to more commonly used letters.</p>
<p>When read back, the molecules reveal Jane Austen’s quote from Mansfield Park.</p>
<blockquote>
<p>if one scheme of happiness fails, human nature turns to another; if the first calculation is wrong, we make a second better: we find comfort somewhere.</p>
</blockquote>
<p>The researchers chose the passage because they found it to be “uplifting in these trying times, and it is easily understood without the context in the book”.</p>
<p>The team certainly found “if the first calculation is wrong, we make a second better”. The first independent expert asked to validate their method could recover only 98.7% of the data. With some modifications to the reading process, they were able to decipher all 158 monomer sequences without errors.</p>
<figure class="align-center ">
<img alt="An image of a book showing a quote from Jane Austen's Mansfield Park written across 18 molecules." src="https://images.theconversation.com/files/396344/original/file-20210421-17-1s32nic.jpeg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/396344/original/file-20210421-17-1s32nic.jpeg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=465&fit=crop&dpr=1 600w, https://images.theconversation.com/files/396344/original/file-20210421-17-1s32nic.jpeg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=465&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/396344/original/file-20210421-17-1s32nic.jpeg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=465&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/396344/original/file-20210421-17-1s32nic.jpeg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=584&fit=crop&dpr=1 754w, https://images.theconversation.com/files/396344/original/file-20210421-17-1s32nic.jpeg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=584&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/396344/original/file-20210421-17-1s32nic.jpeg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=584&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">The authors chose the quote because it was ‘uplifting in these trying times’.</span>
<span class="attribution"><a class="source" href="https://www.eurekalert.org/multimedia/emb/262163.php?from=499783">Sarah Moor</a>, <span class="license">Author provided</span></span>
</figcaption>
</figure>
<h2>Plastic data storage</h2>
<p>Plastics may not be the most obvious choice for data storage, but on consideration, they are an extremely suitable material to use. </p>
<p>Since we began mass manufacturing plastics, we’ve traditionally stuck to either employing a single monomer type per product or simple combinations of one or two monomers. These have come to dominate our ways of life. </p>
<p>Plastics are stable under normal environmental conditions. While in many instances this is a disadvantage, like when they make their way into the environment, in some instances it does serve extremely useful functions. </p>
<p>Over the past 50 years, researchers have made remarkable strides in reducing the dispersity (the molecular variation – usually in either mass or shape) of synthetic polymers and improving our ability to control the sequence distribution of the monomers. </p>
<p>The new study from Texas has shown that by going beyond the confines of DNA, you can encode more complex information in a far shorter chain, thanks to the wider choice of monomers available. </p>
<p>Future use will probably depend on the commercial availability of monomers, such as which amino alcohols are readily accessible from renewable sources. But the potential is vast. Encoding short-chain polymers is not far removed from encoding DNA, and the process of reading the sequences is similar in both.</p>
<p>The team in Texas plans to look into bottlenecks regarding the scalability of this method, interrogating the speed and efficiency of the writing and reading processes.</p>
<p>Though the original paper text of Mansfield Park will inevitably fade in the coming centuries, one small fragment of it has been preserved on a polymer for perhaps centuries to come – as long as we have the equipment available to decode it. As the final fragment of the quote declares, “we find comfort somewhere”.</p>
<p class="fine-print"><em><span>Thomas Swift receives funding from the Royal Society of Chemistry, the Royal Society, Northern Powerhouse, European Regional Development Fund and other commercial research contracts.</span></em></p>
<p>Storing information using polymers could be more efficient than DNA data storage.</p>
<p>Thomas Swift, Lecturer in Polymer Chemistry, University of Bradford</p>
<p>Licensed as Creative Commons – attribution, no derivatives.</p>
<h1>New DNA test that reveals a child’s true age has promise, but ethical pitfalls</h1>
<p>2020-02-23T13:13:49Z</p>
<figure><img src="https://images.theconversation.com/files/314570/original/file-20200210-109943-1amwqvc.jpg?ixlib=rb-1.1.0&rect=71%2C8%2C1845%2C1057&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Epigenetic clocks are a fascinating new technology, but some potential applications are controversial.</span> <span class="attribution"><a class="source" href="https://pixabay.com/photos/fantasy-portrait-clock-time-2790666/">(Pixabay/Stefan Keller)</a>, <a class="license" href="http://creativecommons.org/licenses/by/4.0/">CC BY</a></span></figcaption></figure><p>Epigenetic clocks are a new type of biological test currently capturing the attention of the scientific community, private companies and governmental agencies because of their potential to <a href="https://www.wired.com/story/new-tests-use-epigenetics-to-guess-how-fast-youre-aging/">reveal an individual’s “true” age</a>.</p>
<p>Over the past two years, companies such as <a href="https://www.chronomics.com/">Chronomics</a> and <a href="https://www.mydnage.com/">MyDNage</a> have started to sell epigenetic age tests to the public online, and the life insurance company <a href="https://www.yousurance.com/">YouSurance</a> has announced that it will test the epigenetic age of its policy holders to assign them to risk groups. Forensic scientists are also contemplating how epigenetic clocks could help <a href="https://www.sciencedirect.com/science/article/pii/S0168952518300611">determine the age of suspected criminals</a>.</p>
<p>Recently, the <a href="https://cmmt.ubc.ca/kobor-lab/">Kobor Lab</a> developed the first <a href="https://www.pnas.org/content/early/2019/10/09/1820843116.short">pediatric epigenetic clock</a> designed specifically for testing the age of young people, with an eye towards its applications in research and medical settings. This test uses a small sample of cells collected cheaply and easily from a cheek swab, and can predict a child’s age to within approximately four months.</p>
<p>But pediatric epigenetic clocks are likely to have non-medical applications as well. They could soon be used in immigration cases to prove the age of undocumented <a href="https://www.nature.com/articles/d41586-018-06121-w">migrants seeking asylum as minors</a>. Other future uses can be imagined, such as for child labour and trafficking surveillance, or even for the identification of child combatants in armed conflicts. </p>
<p>As researchers in bioethics, sociology and medical genetics, we are interested in the potential benefits and risks of this fascinating yet controversial new technology for individuals and society.</p>
<h2>The science of epigenetics</h2>
<p>Epigenetic clocks emerge from the <a href="https://www.nature.com/articles/d41586-019-03877-7">field of epigenetics</a>, which examines how chemical marks can regulate gene expression and help us understand how aging and disease processes work.</p>
<p>Epigenetics is the study of small molecules that bind to DNA or to the proteins DNA wraps around, changing how genes are read. These small molecules don’t change the linear sequence of the DNA, but they can turn genes on or off by opening or closing the 3D structure of DNA.</p>
<figure class="align-center ">
<img alt="" src="https://images.theconversation.com/files/314562/original/file-20200210-109943-1t8x8yn.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/314562/original/file-20200210-109943-1t8x8yn.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=338&fit=crop&dpr=1 600w, https://images.theconversation.com/files/314562/original/file-20200210-109943-1t8x8yn.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=338&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/314562/original/file-20200210-109943-1t8x8yn.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=338&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/314562/original/file-20200210-109943-1t8x8yn.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=424&fit=crop&dpr=1 754w, https://images.theconversation.com/files/314562/original/file-20200210-109943-1t8x8yn.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=424&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/314562/original/file-20200210-109943-1t8x8yn.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=424&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Small molecules cannot change the linear structure of DNA, but they can turn genes on and off by opening or closing the 3D structure of DNA.</span>
<span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/blue-dna-structure-isolated-background-3d-1238405779">(Shutterstock)</a></span>
</figcaption>
</figure>
<p>If we think of genes as light bulbs, epigenetic marks can nudge the dimmer switch up or down, but they can’t change the colour of the light.</p>
<p>Some epigenetic marks can change in response to a <a href="https://www.niehs.nih.gov/research/supported/health/envepi/index.cfm">person’s environment</a> or <a href="https://dx.doi.org/10.18632%2Faging.101168">lifestyle</a>. Epigenetic tests may provide information about individuals that a genetic test alone can’t reveal — such as exposures to trauma, stress, diet or pollutants.</p>
<p>Other epigenetic marks change at a steady, predictable rate as a person develops, grows and ages. These marks have enabled the development of different epigenetic age tests. Also known as epigenetic clocks, these tests are poised to be the first epigenetic tests ready for use.</p>
<p>However, most epigenetic tests have not yet been scientifically validated to confirm their precision and accuracy in different sub-groups of the population, and the <a href="https://doi.org/10.3389/fgene.2018.00202">ethical, legal and social implications of their use are not well understood</a>.</p>
<h2>Lessons from DNA testing</h2>
<p>Like genetic tests, epigenetic tests may eventually be used in law enforcement and immigration settings, as well as in research and medical contexts. The lessons learned from DNA testing highlight the need for caution and responsible implementation.</p>
<p>Genetic research and testing now have many uses beyond detecting disease risks and tracing ancestry. DNA tests are common tools in <a href="https://cen.acs.org/articles/95/i37/Thirty-years-DNA-forensics-DNA.html">police investigations</a> to identify suspects and victims of crimes, and they are increasingly used by <a href="https://doi.org/10.1093/jlb/lsx012">immigration agencies</a> to prove genetic relationships in family reunification efforts.</p>
<p>In 2018, the identification of the suspected <a href="https://theconversation.com/how-cops-used-a-public-genealogy-database-in-the-golden-state-killer-case-95842">Golden State Killer</a> made it clear that biological information shared with public genetic genealogy databases could be <a href="https://www.theatlantic.com/science/archive/2019/10/genetic-genealogy-dna-database-criminal-investigations/599005/">mined by law enforcement agencies</a>. This case <a href="https://www.nytimes.com/2018/04/27/health/dna-privacy-golden-state-killer-genealogy.html">raised public and legal concerns about the privacy of genetic information</a>, and the uses of DNA stored by private companies and in government databases. </p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/dna-database-sold-to-help-law-enforcement-crack-cold-cases-128674">DNA database sold to help law-enforcement crack cold cases</a>
</strong>
</em>
</p>
<hr>
<p>Due to the capacity of epigenetic tests to expose sensitive information about an individual’s developing environment, social conditions and life choices, the implementation of tests like the pediatric clock requires close attention to issues related to privacy, surveillance and basic human rights.</p>
<h2>Risks to basic human rights</h2>
<p>In an era of rising <a href="https://theconversation.com/citizens-in-the-west-should-care-about-discriminatory-immigration-policies-110312">xenophobic and protectionist immigration policies</a> across the globe, the benefits of gaining biological data should be critically considered against the risks to basic human rights inherent in the process of collecting another layer of information from a vulnerable population.</p>
<p>When genetic testing was proposed as a solution for family reunification for the thousands of children separated from their parents by U.S. Immigration and Customs Enforcement (ICE) raids and deportations, ethicists and advocacy groups <a href="https://www.vice.com/en_ca/article/evjwje/privacy-rights-group-sues-dhs-over-coercive-dna-tests-at-the-border">raised significant issues</a>, including the lack of informed consent and concerns about the long-term storage of DNA in either private databases or those previously used only for those accused of crimes.</p>
<p>The <a href="https://www.researchwithrutgers.com/en/publications/dna-testing-for-family-reunification-and-the-limits-of-biological">use of genetic tests to prove the biological relationship between family members seeking to re-unite in a country has also been criticized</a> for being ethically problematic for children in non-genetic families, and having potentially devastating consequences for members of genetic families if DNA test errors occur. These situations could impede the reunification of children with their primary caregivers.</p>
<p>Problems may also arise if epigenetic clocks are used in immigration cases before we fully understand and address their ethical, legal and social consequences.</p>
<p>For example, migrants who are minors may have been exposed to highly stressful experiences, malnutrition or medical conditions. Such exposures <a href="https://doi.org/10.1186/s13059-019-1824-y">can affect the results of epigenetic clock tests</a> which were developed based on the DNA of healthy children in developed countries. This makes their use in efforts to identify biological age problematic for both technical and ethical reasons.</p>
<h2>Responsible use of epigenetic clocks</h2>
<p>To date, there have been attempts, but no official reports, of a police force or immigration agency successfully using an epigenetic clock test to resolve a challenging criminal case or asylum claim.</p>
<p>However, it has come to our attention that researchers have been approached by governmental agencies interested in using the pediatric epigenetic clock in particular, and by migrants searching for ways to prove the age of their undocumented children in order to be granted access to legal privileges reserved only for minors.</p>
<p>The promises of epigenetics that <a href="https://doi.org/10.1093/eep/dvz019">circulate widely in public discourse</a> include the potential to control one’s genetic predisposition — such as disease risk — through lifestyle choices. With this type of attention, individuals in the general public may in fact be among the first interested in using these tests. Consumers gaining access to epigenetic tests online, and those seeking to use them to inform legal and policy decisions, should be aware of their current scientific limitations, as well as of <a href="https://doi.org/10.1038/s41576-020-0215-2">rising privacy and non-discrimination concerns</a>.</p>
<p><a href="https://doi.org/10.1038/s41597-019-0310-4">Standards of practice</a>, <a href="https://doi.org/10.1186/s13073-019-0646-6">ethical guidelines</a> and <a href="https://academic.oup.com/eep/article/5/3/dvz018/5571210">regulations</a> are critically needed to ensure the responsible use of epigenetic tests. Most urgently, there is a need to protect children and their caregivers from premature or socially inadmissible uses of pediatric epigenetic clock tests, so that the tests’ promise is realized with children’s best interests in mind.</p>
<p><em>This is an updated version of a story originally published on Feb. 23, 2020. It clarifies the use of genealogy data in the Golden State Killer case.</em></p>
<p class="fine-print"><em><span>Charles Dupras receives funding from the Canadian Institutes of Health Research (CIHR) and the Office of the Privacy Commissioner of Canada (OPC). The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding organizations.</span></em></p><p class="fine-print"><em><span>Martine Lappé receives funding from the National Human Genome Research Institute of the National Institutes of Health. Research informing this publication was supported by NIH Award Number R00HG009154: "Behavioral Epigenetics in Children: Exploring the Social and Ethical Implications of Translation." The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.</span></em></p><p class="fine-print"><em><span>Michael S. Kobor receives funding from the Canadian Institutes of Health Research (CIHR), Natural Sciences and Engineering Research Council of Canada (NSERC), Genome Canada, National Institutes of Health (NIH), National Science Foundation (NSF), Peter Wall Institute for Advanced Studies (PWIAS), Canadian Institute For Advanced Research (CIFAR), Networks of Centres of Excellence (NCE), and the R. Howard Webster Foundation. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding organizations.
</span></em></p>
<p>Pediatric epigenetic clocks have the potential to accurately assess biological age. However, possible applications in law enforcement and immigration raise ethical issues.</p>
<p>Charles Dupras, Postdoctoral Fellow, Center of Genomics and Policy, McGill University<br>
Martine Lappé, Assistant Professor of Sociology and Science, Technology, and Society, California Polytechnic State University<br>
Michael S. Kobor, Canada Research Chair in Social Epigenetics and Professor, UBC Department of Medical Genetics, University of British Columbia</p>
<p>Licensed as Creative Commons – attribution, no derivatives.</p>
<hr>
<p>tag:theconversation.com,2011:article/110755<br>
2019-02-05T13:41:01Z<br>
2019-02-05T13:41:01Z</p>
<p><strong>Personal DNA tests might help research – but they put your data at risk</strong></p>
<figure><img src="https://images.theconversation.com/files/256605/original/file-20190131-108338-1q2hixh.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">
</span> <span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/close-businessman-holding-glowing-dna-helix-683382997?src=X3fDTVJfgabc-LLmvLaQgQ-1-53">Eetu Mustonen/Shutterstock</a></span></figcaption></figure><p>Your DNA has become a valuable commodity. Companies such as 23andMe may charge you for an analysis of your genetic profile, but they make their real money from <a href="https://www.theguardian.com/commentisfree/2018/aug/10/dna-ancestry-tests-cheap-data-price-companies-23andme">selling that data</a> on to other companies.</p>
<p>Now healthcare providers are following suit by encouraging patients to take genetic tests that will create databases ostensibly for medical research. Britain’s National Health Service (NHS) <a href="https://theconversation.com/nhs-plan-to-sell-genome-sequencing-to-healthy-people-is-premature-110619">recently announced</a> that it was launching such a scheme in an attempt to build a database of anonymised genetic data for researchers.</p>
<p>But <a href="https://www.irishtimes.com/news/health/hospital-investigates-release-of-dna-samples-to-research-firm-1.3773529">recent reports</a> that Our Lady’s Children’s Hospital, Crumlin in Dublin – Ireland’s largest children’s hospital – allegedly shared patient DNA data with a private firm without appropriate consent highlight the potential risk that comes with giving up your genetic records. Your DNA contains sensitive information that can be used to make important personal decisions about you and your family members. When you hand over these details to a large database – whoever is building it – you are ultimately risking it being used in ways you can’t foresee and which aren’t always to your benefit.</p>
<p>The first questions are where your data will end up and who will have access to it. The NHS is attempting to keep control of the genetic data it gathers by sharing it with researchers at its own company, <a href="https://www.genomicsengland.co.uk/">Genomics England</a>. But there has been no indication of what purposes the data can be used for, or what limits will be placed on its use or transfer to other research centres or companies. In the past, Genomics England <a href="https://theconversation.com/google-may-get-access-to-genomic-patient-data-heres-why-we-should-be-concerned-80417">met with Google</a> to discuss how the tech firm might help analyse genetic data gathered under a previous scheme, the <a href="https://www.genomicsengland.co.uk/about-genomics-england/the-100000-genomes-project/">100,000 Genomes Project</a>. </p>
<p>A spokesperson for Genomics England told The Conversation that there was “no formal contractual relationship between Genomics England and Google”. However, the spokesperson said: “We have a mutual interest in secure data storage and we have meetings from time to time. As part of our mandate to stimulate the UK genomics industry, we are in touch with Google Ventures. They invest in life sciences companies which may be interested in working with us.”</p>
<p>The recent Irish example of data transfer apparently without appropriate consent also reminds us that agreements and rules over who can access data can be broken. In January 2019, <a href="https://www.thetimes.co.uk/edition/ireland/crumlin-hospital-sent-dna-off-without-consent-mm5crwng0">an investigation was launched</a> into the alleged supply of 1,500 DNA samples from the Crumlin children’s hospital to Genomics Medicine Ireland (GMI) without proper authorisation from patients. </p>
<p>If these allegations are true, they would represent a breach of European data protection law, which requires explicit consent for the processing of DNA data. What is perhaps <a href="https://ieeexplore.ieee.org/document/8470173">more of a problem</a> is that even when people are told what will happen with their data, they may not understand those uses or their <a href="https://philpapers.org/rec/SCHTCO-98">potential consequences</a>.</p>
<p>Initiatives such as the NHS project are justified by claims that they offer an efficient way to <a href="https://theconversation.com/why-the-100-000-genomes-project-will-focus-on-rare-diseases-36155">diagnose rare</a> or undiscovered illnesses, speeding up treatment and improving patient outcomes. More broadly, proponents argue, <a href="https://theconversation.com/how-big-data-is-being-mobilised-in-the-fight-against-leukaemia-74281">sharing DNA data</a> can allow researchers to spot patterns that would otherwise go unidentified, increasing scientific understanding and aiding in the development of treatments.</p>
<p>But having your DNA sequenced isn’t just a way of finding out if you are at risk of a disease or making an altruistic contribution to an abstract research project. DNA data exposes our most inherent characteristics, revealing ethnic or racial groupings, as well as outlining current and future health issues. Some people have even tried to link <a href="https://www.technologyreview.com/s/610339/dna-tests-for-iq-are-coming-but-it-might-not-be-smart-to-take-one/">DNA tests to intelligence</a>.</p>
<figure class="align-center ">
<img alt="" src="https://images.theconversation.com/files/256608/original/file-20190131-112314-1ylmkh4.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/256608/original/file-20190131-112314-1ylmkh4.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/256608/original/file-20190131-112314-1ylmkh4.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/256608/original/file-20190131-112314-1ylmkh4.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/256608/original/file-20190131-112314-1ylmkh4.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/256608/original/file-20190131-112314-1ylmkh4.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/256608/original/file-20190131-112314-1ylmkh4.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">DNA files can easily be transferred.</span>
<span class="attribution"><a class="source" href="https://www.shutterstock.com/image-photo/genetic-engineer-working-analysis-dna-software-509467522?src=FqZaNnPBF2eO5RdpUEstRA-1-28">Angellodeco/Shutterstock</a></span>
</figcaption>
</figure>
<p>Concerns about linking individuals to the characteristics revealed by their DNA are usually countered by claims that the data is anonymised. But both <a href="https://www.telegraph.co.uk/news/health/news/10656893/Hospital-records-of-all-NHS-patients-sold-to-insurers.html">practical experience</a> and <a href="http://science.sciencemag.org/content/347/6221/536.full#ref-26">academic work</a> have shown that anonymised data can often be <a href="https://www.nature.com/articles/srep01376">reassociated with</a> the people it was collected from.</p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/your-nhs-data-is-completely-anonymous-until-it-isnt-22924">Your NHS data is completely anonymous – until it isn't</a>
</strong>
</em>
</p>
<hr>
<p>So sharing your genetic information could expose you to potential discrimination if it ends up with the wrong people or is used for the wrong purposes. Being offered different health insurance coverage and at different prices is the most obvious risk. But depending on who buys the data, pharmaceutical companies, employers and even government authorities could access your DNA and <a href="https://theconversation.com/four-ways-your-google-searches-and-social-media-affect-your-opportunities-in-life-96809">make decisions</a> based on it.</p>
<p>Democratic governments can’t typically gather DNA evidence without the permission of a judge or via another legal procedure. But in the case of the “<a href="http://www.sciencemag.org/news/2018/10/we-will-find-you-dna-search-used-nab-golden-state-killer-can-home-about-60-white">Golden State Killer</a>”, US law enforcement agencies used DNA data from a public genealogy database to obtain evidence they wouldn’t otherwise have been able to collect. This raises concerns about the willingness of governments to use genetic records originally made to explore people’s ancestry for a very different purpose.</p>
<h2>Giving away family secrets</h2>
<p>The Golden State Killer case is all the more important because it highlights the most fundamental issue with DNA-sharing initiatives. When you share your DNA, you’re also sharing data about your entire family, who haven’t necessarily consented. The Golden State Killer didn’t get a DNA test but one of his relatives did. When enough people share their DNA, the genetic profile of entire communities becomes available.</p>
<p>A <a href="http://www.sciencemag.org/news/2018/10/we-will-find-you-dna-search-used-nab-golden-state-killer-can-home-about-60-white">study of the database</a> that was used to catch the killer estimated that it contained the profiles of just 0.5% of the US population, yet that was enough to include a relative (third cousin or closer) of 60% of white Americans. With 2% of the population, that figure would increase to 90%.</p>
<p>GMI currently <a href="https://www.genengnews.com/insights/using-powered-cohorts-to-speed-drug-discovery-and-development/">plans to build</a> the world’s largest whole-genome database of some 400,000 participants – roughly a tenth of Ireland’s population – from a presence in all the country’s major hospitals. This would likely give the firm information on almost every family group in Ireland and a huge proportion of the Irish diaspora (<a href="https://www.dfa.ie/media/dfa/alldfawebsitemedia/newspress/publications/ministersbrief-june2017/1--Global-Irish-in-Numbers.pdf">estimated at 70m</a>), enabling it to identify the most private characteristics of a global population.</p>
<p>This shows how, when some people allow their DNA data to be shared, it could expose both them and their families to risk and erode the rights of everyone else, meaning we all have a stake in how genetic records are shared. Organisations must be required to be clearer about who will use the DNA data they collect, and for what, to prevent the risk of misuse.</p><img src="https://counter.theconversation.com/content/110755/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Roisin Costello receives funding from The Irish Research Council. </span></em></p>
<p>When you share your genetic data – even with the NHS – you don’t know where it will end up, or how it will be used.</p>
<p>Roisin Costello, PhD Candidate, School of Law, Trinity College Dublin</p>
<p>Licensed as Creative Commons – attribution, no derivatives.</p>
<hr>
<p>tag:theconversation.com,2011:article/86274<br>
2018-01-05T11:57:29Z<br>
2018-01-05T11:57:29Z</p>
<p><strong>The libraries of the future will be made of DNA</strong></p>
<figure><img src="https://images.theconversation.com/files/197638/original/file-20171204-22977-17hjfs8.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">
</span> <span class="attribution"><span class="source">Jezper/Shutterstock.com</span></span></figcaption></figure><p>There are <a href="http://www.internetlivestats.com/twitter-statistics/">6,000 tweets</a> sent a second. In the time you have read this sentence, 42,000 tweets will have been sent. At an average of <a href="http://www.independent.co.uk/life-style/gadgets-and-tech/news/twitter-character-limit-update-tweets-expanded-140-280-english-japanese-app-a7968961.html">34 characters per tweet</a> that’s 1,428,000 characters.</p>
<p><a href="http://worldwidewebsize.com/">Worldwidewebsize</a> publishes a daily estimate of the size of the internet. On the day of writing, it amounted to 4.59 billion pages and a billion websites. This is the “indexed” internet, and doesn’t include the “dark web” or private databases. </p>
<p>The size of the web is measured in two ways. The first is “content” – storage capacity was <a href="https://www.livescience.com/54094-how-big-is-the-internet.html">estimated</a> in 2014 as 10<sup>24</sup> bytes, or a million <a href="https://en.wikipedia.org/wiki/Exabyte">exabytes</a>. The second is “traffic”, measured in <a href="https://en.wikipedia.org/wiki/Zettabyte">zettabytes</a>. Global traffic recently <a href="https://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.html">passed</a> one zettabyte, the content of 250 billion DVDs. </p>
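<p>The unit conversions in the paragraph above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes decimal (SI) prefixes, so one exabyte is 10<sup>18</sup> bytes and one zettabyte is 10<sup>21</sup> bytes. The function names are illustrative, and the DVD capacity is simply what the article's own figures imply rather than a number from a separate source.</p>

```python
# Decimal (SI) prefixes are assumed here; binary prefixes would give
# slightly different numbers.
EXABYTE = 10**18    # bytes
ZETTABYTE = 10**21  # bytes


def bytes_to_exabytes(n_bytes: int) -> float:
    """Convert a byte count to exabytes."""
    return n_bytes / EXABYTE


def implied_dvd_capacity_gb(total_bytes: int, dvd_count: int) -> float:
    """Gigabytes per DVD implied by spreading total_bytes over dvd_count discs."""
    return total_bytes / dvd_count / 10**9


if __name__ == "__main__":
    # 10^24 bytes expressed in exabytes: one million.
    print(bytes_to_exabytes(10**24))
    # One zettabyte spread over 250 billion DVDs: 4 GB per disc.
    print(implied_dvd_capacity_gb(ZETTABYTE, 250 * 10**9))
```

<p>Under these assumptions the two figures are consistent: 10<sup>24</sup> bytes is a million exabytes, and 250 billion DVDs per zettabyte corresponds to roughly 4 GB per disc.</p>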
<p>More conventionally, the <a href="https://www.theguardian.com/books/2014/oct/22/uk-publishes-more-books-per-capita-million-report">UK published</a> 184,000 books in 2013 – globally, the largest number <a href="https://en.wikipedia.org/wiki/Books_published_per_country_per_year#cite_note-publishingtechnology.com-2">per inhabitant</a>. Add the increasing ways of measuring a human being in terms of data – DNA sequencing, online family trees, genetic coding, bank accounts, online information of all kinds – or the amount of scientific data being produced and read <a href="https://skatelescope.org/news/raeng-grant-to-engage-with-ska-engineering/">around the world</a> and the amount of information in the world is staggering. Even the amount of storage most people need for photos and documents has grown hugely in the past few years.</p>
<p>As a species, we are producing information at a <a href="https://www.youtube.com/watch?v=iIKPjOuwqHo">massive rate</a>. The “reading” of the mass of data has led to new predictive models for <a href="https://www.amazon.co.uk/Big-Data-Revolution-Transform-Think/dp/1848547927">social interaction</a>. Businesses and governments are scrambling to make use of this data as human beings seem ever more readable, manageable and – possibly – controllable through the comprehension and manipulation of information.</p>
<p>But just how might all this information be stored? At present, we have physical libraries, and physical archives, and bookshelves. The internet itself is “stored” on hard-disk servers around the world, using enormous amounts of power to keep them cool. Online infrastructure is expensive, energy hungry, and vulnerable; its longevity is <a href="http://uk.businessinsider.com/facebook-fires-four-more-shots-into-the-server-market-2017-3?r=US&IR=T">also limited</a> – see <a href="https://en.wikipedia.org/wiki/Live_Free_or_Die_Hard#Plot">Die Hard 4.0</a> for a dramatisation of this.</p>
<figure class="align-center ">
<img alt="" src="https://images.theconversation.com/files/197639/original/file-20171204-22962-16lpmyk.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/197639/original/file-20171204-22962-16lpmyk.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=420&fit=crop&dpr=1 600w, https://images.theconversation.com/files/197639/original/file-20171204-22962-16lpmyk.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=420&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/197639/original/file-20171204-22962-16lpmyk.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=420&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/197639/original/file-20171204-22962-16lpmyk.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=528&fit=crop&dpr=1 754w, https://images.theconversation.com/files/197639/original/file-20171204-22962-16lpmyk.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=528&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/197639/original/file-20171204-22962-16lpmyk.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=528&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Data centres such as this one may soon be a thing of the past.</span>
<span class="attribution"><span class="source">Gorodenkoff/Shutterstock.com</span></span>
</figcaption>
</figure>
<h2>Libraries of the future</h2>
<p>The future of information storage may sound dull, but it is a crucial issue for anyone interested in the way that societies remember. A good example is family history, where public archives, such as census records and tax information, are increasingly accessed online. Millions of users around the world use subscription sites such as Ancestry or Findmypast to access this public information and to create their family trees using online software. This proliferation of information raises ethical issues about access (public records being used by private companies to make a profit) and about how this data is stored, managed and used.</p>
<p>We all have a stake in the way that libraries and archives might work in the future, how they might be configured, and what might be stored – and why. Do we really need to store every tweet ever sent? Making any kind of choice over what to store – what to collect, commemorate, archive – provokes a complex discussion. Technologies for accessing – “reading” – information need to be somehow futureproofed, or we will end up with huge amounts of information that cannot be used.</p>
<p>So: what to do? There are wide-ranging discussions at present, from what information to store (including various <a href="https://ntrs.nasa.gov/search.jsp?R=20170004513">biobanks</a> full of <a href="http://www.croptrust.org/main/content/svalbard-global-seed-vault">biological specimens</a>), to how to store it, to where to store it (the Arctic, <a href="http://www.sciencedirect.com/science/article/pii/S009457651630100X">various locations in space</a>, under water). Most of these discussions are occurring within scientific communities; some <a href="https://dspace.mit.edu/handle/1721.1/110132">technology companies</a> are involved. Those who have spent years thinking about memory, commemoration and archiving – historians and librarians – are often on the fringes of the discussion.</p>
<figure class="align-center ">
<img alt="" src="https://images.theconversation.com/files/197640/original/file-20171204-22996-1yxss3e.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/197640/original/file-20171204-22996-1yxss3e.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=420&fit=crop&dpr=1 600w, https://images.theconversation.com/files/197640/original/file-20171204-22996-1yxss3e.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=420&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/197640/original/file-20171204-22996-1yxss3e.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=420&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/197640/original/file-20171204-22996-1yxss3e.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=528&fit=crop&dpr=1 754w, https://images.theconversation.com/files/197640/original/file-20171204-22996-1yxss3e.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=528&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/197640/original/file-20171204-22996-1yxss3e.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=528&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Stored information, old style.</span>
<span class="attribution"><span class="source">By kurbanov/Shutterstock.com</span></span>
</figcaption>
</figure>
<h2>Nanocrystals and DNA</h2>
<p>Various organisations are exploring physical ways of storing humanity’s information. Physical storage on nickel disks (read by microscope) or as laser-written barcodes on silica glass has been suggested. Highly experimental – and at present energy-hungry – <a href="https://www.nature.com/articles/natrevmats201670">nanotechnology</a> looks to write information at the near-molecular level (although the use of the word “write” is very much out of date here). Nanotechnological storage would be “read” through sophisticated microscopy and is sometimes the “effect” of chemical change or quite complicated processes, such as nanocrystals converting radiation (infra-red) into something “visible”. Some of the more baroque storage models range from a flash data memory vault on the moon to <a href="http://www.timecapsuletomars.com/">private companies sending digital content</a> to Mars, to <a href="http://www.keo.org/uk/pages/faq.html#q3">satellites orbiting the earth</a>.</p>
<p>But most of the activity at present seems to be biological. Various scientists have begun to explore the possibility of using DNA to store <a href="http://www.nature.com/nature/journal/v494/n7435/abs/nature11875.html?foxtrotcallback=true">information</a>, an approach called Nucleic Acid Memory (NAM). </p>
<p>This would involve the data being “translated” into the letters G, A, T and C, which stand for the four bases of DNA. DNA strands would then be created which could be translated back into the “original” by being sequenced. Researchers recently stored archival-quality versions of music by <a href="https://pitchfork.com/news/miles-davis-tutu-is-one-of-the-first-songs-to-be-encoded-in-dna/">Miles Davis and Deep Purple</a> and also of <a href="http://www.bbc.co.uk/news/av/science-environment-40585302/movie-encoded-into-the-dna-of-bacteria">a short GIF</a> in DNA form. </p>
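<p>To make the idea of “translating” data into DNA letters concrete, here is a minimal sketch that maps every two bits of a file to one base. The particular mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions made for illustration. This is not the encoding used in the projects mentioned above, which typically add error correction and avoid sequences that are hard to synthesise or read.</p>

```python
# Hypothetical 2-bits-per-base mapping: 00->A, 01->C, 10->G, 11->T.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, four bases per byte."""
    letters = []
    for byte in data:
        # Walk through the byte two bits at a time, most significant first.
        for shift in (6, 4, 2, 0):
            letters.append(BASES[(byte >> shift) & 0b11])
    return "".join(letters)


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError for any letter other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"DNA")
    print(strand)          # CACACATGCAAC
    print(decode(strand))  # b'DNA'
```

<p>In this toy scheme each byte becomes four bases, so the three-character text “DNA” becomes a 12-letter strand, and “sequencing” is simply the reverse lookup.</p>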
<p>DNA is durable and increasingly easy to produce and read. It will keep for thousands of years in the right storage conditions. DNA might be stored anywhere that is dark, dry and cold, and would arguably not take up a great deal of room.</p>
<p>Much of this technology is in its infancy, but developments in nanotechnology and DNA sequencing suggest that we will be seeing the applied results of experimentation and development within years. Wider questions arise about the ethics of collection and to what extent these processes will become mainstream. Print, and to a certain extent digital, have become common and reasonably <a href="https://books.google.co.uk/books/about/The_Printing_Press_as_an_Agent_of_Change.html?id=0-FThHK2DNMC&redir_esc=y">democratic</a> ways of transmitting and storing information. It remains to be seen whether future storage and writing will be as easy to access, and who will be in control of humanity’s information and memory in the coming decades and centuries.</p><img src="https://counter.theconversation.com/content/86274/count.gif" alt="The Conversation" width="1" height="1" />
<p class="fine-print"><em><span>Jerome de Groot receives funding from AHRC. </span></em></p>
<p>Technologies for accessing information need to be somehow future-proofed.</p>
<p>Jerome de Groot, Senior Lecturer, University of Manchester</p>
<p>Licensed as Creative Commons – attribution, no derivatives.</p>
<hr>
<p>tag:theconversation.com,2011:article/88198<br>
2017-12-10T23:05:02Z<br>
2017-12-10T23:05:02Z</p>
<p><strong>You’ve got your DNA kit: Now what can you do with it?</strong></p>
<figure><img src="https://images.theconversation.com/files/198431/original/file-20171210-27674-unj6wl.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">A scientist works with DNA samples in a New Orleans laboratory in 2011.</span> <span class="attribution"><span class="source"> (AP Photo/Gerald Herbert)</span></span></figcaption></figure><p>Differences among people, such as eye colour or hair colour, come from slight variations in our genetic code. As technology advances, it’s getting easier to unlock the secrets in our DNA to gain new insights into who we are and to apply that knowledge to dramatically change our lives and society.</p>
<p>This has led many to get personal reports on their own genetic code in exchange for payment and saliva samples. Excitement over these reports recently jumped after <a href="http://www.oprah.com/gift/oprahs-favorite-things-2017-full-list-dna-test-ancestry-personal-genetic-service?editors_pick_id=71355">Oprah Winfrey recommended the DNA test by 23andMe on her annual favourite things list</a>.</p>
<p>But the applications of making DNA information more accessible stretch far beyond satisfying our curiosity about who we are and what our genes might say about us. </p>
<p>The availability of genetic data can potentially be tapped to treat medical conditions, leading to personalized health care and wellness regimens, with larger implications for personal, cultural, social and economic change. For example, companies such as Newtopia provide customers with <a href="http://www.goodhousekeeping.com/institute/a23581/newtopia-a-diet-based-on-your-dna/">weight-loss plans that are tailored to one’s own DNA</a>.</p>
<p>As researchers trained in economics, we study the impact of how genetic and environmental factors influence the development of human capital measures such as education and health. As we learn more about our DNA, the possibilities that arise for policy and the economy as a whole are as numerous as our individual genomes are varied. </p>
<h2>DNA data can pose public risk</h2>
<p>Beyond private companies, the rapidly declining costs of both gene-sequencing and the technology to store genomic data have the potential to soon transform health-care delivery and policy. </p>
<p><a href="https://wol.iza.org/articles/what-is-the-role-for-molecular-genetic-data-in-public-policy/long">Our recent research considers the potential value from incorporating genetic data in the design of public policy</a> and <a href="https://link.springer.com/epdf/10.1186/s40173-017-0080-6?author_access_token=tokabk3A5sGAY9DDBwhlcW_BpE1tBhCbnbw3BuzI2RPxRCGr4ipav-alb6J3IvVA4EO0ta2k5g7yH1LrAwVB8rGq4ZAzBAu2B3WRSAmD5FG5bfMZsrSFzsV5pE6ZvgEdT4-nvwMYMHxmgD48yrHGTA==">social science research</a>, as well as the risks. </p>
<p>Decisions about genetic policies involve complex issues about ethics, costs, benefits and individual and societal interests. </p>
<p>Legislation is needed to prevent insurance companies and employers from using the results from genetic tests when making decisions. Canada was the last member of the G7 to introduce protections with the <a href="http://www.parl.ca/DocumentViewer/en/42-1/bill/S-201/royal-assent">Act to Prohibit and Prevent Genetic Discrimination</a> (formerly Bill S-201) this year — nine years after the United States passed similar legislation.</p>
<p>Since genetic factors may explain individual differences in socioeconomic outcomes, a growing number of social science data sets now involve biological-specimen collection activities that permit measuring genetic factors. Analyses of this data can extend and expand our knowledge on virtually every health condition — and on socioeconomic traits that have a genetic basis.</p>
<h2>Environment also plays a role</h2>
<p>However, genetic factors are only part of the story, and other variables that are well-studied by social scientists, such as environment and lifestyle, also come into play. For example, an emerging body of evidence now indicates that genetic associations with <a href="http://www.pnas.org/content/112/2/354">obesity may vary due to different prevailing environmental factors</a> like occupation and even urban design.</p>
<p>These differences in environments, lifestyles and genetic factors have important implications in areas ranging from health behaviours such as obesity and cigarette smoking to skill development and other socio-economic outcomes. Therefore the idea of a one-size-fits-all policy for any health, education or socioeconomic outcome is flawed. </p>
<p>Adopting one-size-fits-all policies assumes that the same process can produce a health or socioeconomic outcome for all individuals. However, if substantial genetic variations change the way these outcomes develop, opportunities emerge to create more effective treatments and policies.</p>
<p>Within the health-care realm, understanding the genetic basis of specific medical conditions is valuable since it offers the potential to improve treatment decisions.</p>
<p>With this new knowledge, we could replace current health and medical practices and develop new ones to target personalized policies and treatments more efficiently for different individuals.</p>
<h2>Heredity expands impact</h2>
<p>The intersection of genetics and public policy stretches beyond the health-care sector. <a href="http://psych.colorado.edu/%7Ecarey/hgss/hgssapplets/heritability/heritability.intro.html">Heritability</a> plays a role in nearly every socio-economic and education outcome. Heredity ensures that policies that consider the role of genetics will have immediate and long-term implications.</p>
<p>The quality of evidence on the role of genetic factors in socioeconomic traits has increased sharply over the last decade.</p>
<p>With newer molecular DNA data available to empirical researchers, the flood of research findings linking specific genetic factors with individual health and socioeconomic outcomes will only continue to grow. </p>
<p>Yet it remains essential to ensure that these findings are interpreted correctly. Much of the evidence reflects only simple associations between individual genetic factors and socioeconomic outcomes, not causal relationships. And the impact of most genetic factors is often very small in magnitude.</p>
<h2>Small effects, big outcomes</h2>
<p>Nonetheless, there is often value in these findings. For example, <a href="http://brcatool.stanford.edu/">a calculator developed by the Stanford Cancer Institute</a> provides individuals with information on how their chances of survival change in response to different preventive measures taken at different ages.</p>
<p>The calculation is based on several specific differences in genetic markers, and helps educate individuals on the trade-offs they face when choosing among possible treatments.</p>
<p>More generally, the speed at which molecular genetic data can be effectively integrated within policy design is directly tied to improving our understanding of how genetic markers operate. </p>
<p>For example, if genetic screening can reliably predict complex learning disorders, the advantages would be huge. Even if a disorder is a function of many genes — each with very small effects — researchers can calculate a single aggregate summary score.</p>
<figure class="align-center ">
<img alt="" src="https://images.theconversation.com/files/198433/original/file-20171210-27686-jmn12s.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/198433/original/file-20171210-27686-jmn12s.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/198433/original/file-20171210-27686-jmn12s.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/198433/original/file-20171210-27686-jmn12s.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/198433/original/file-20171210-27686-jmn12s.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/198433/original/file-20171210-27686-jmn12s.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/198433/original/file-20171210-27686-jmn12s.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Using genetic testing to determine whether a child has a learning disorder could help parents make the right decisions for their children.</span>
<span class="attribution"><span class="source">(Shutterstock)</span></span>
</figcaption>
</figure>
<p>The summary score would measure an individual’s risk for a specific disorder or trait, which, in many situations, may take psychologists years to diagnose.</p>
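<p>As a rough illustration of how such a summary score might be computed, the sketch below assumes the common approach of a weighted sum: each marker's count of risk alleles (0, 1 or 2) is multiplied by an estimated effect size, and the products are added up. The marker names and weights are hypothetical and are not taken from any study.</p>

```python
def summary_score(genotype, weights):
    """Aggregate many small genetic effects into one number.

    genotype: dict mapping marker name -> risk-allele count (0, 1 or 2)
    weights:  dict mapping marker name -> estimated effect size
    Markers missing from the genotype are treated as a count of 0.
    """
    return sum(weight * genotype.get(marker, 0)
               for marker, weight in weights.items())


# Hypothetical effect sizes for three markers (illustration only).
weights = {"marker_1": 0.02, "marker_2": -0.01, "marker_3": 0.03}

# Hypothetical individual: two risk alleles at marker_1, one at marker_2.
genotype = {"marker_1": 2, "marker_2": 1, "marker_3": 0}

score = summary_score(genotype, weights)
print(f"summary score: {score:.3f}")
```

<p>Real scores combine far more markers, and the weights themselves are estimates, which is one reason predictive accuracy depends on how well the underlying markers are understood.</p>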
<p>Armed with knowledge of whether their child is at an elevated risk for a learning disorder or other conditions, for example, parents will be able to make different investments in their child years before receiving a formal diagnosis. </p>
<h2>Change the conversation fast</h2>
<p>These investments may additionally affect how the underlying genes manifest themselves and therefore reduce the risk for future poor outcomes. As knowledge advances, the predictive accuracy of these summary scores will increase.</p>
<p>All of this reinforces the need for policies that consider not only the benefits, but also the potential costs, of this newly available genetic data source.</p>
<p>Whether Canadians will fully realize the significant potential benefits from incorporating genetic data in health and social policy design will depend on how fast policies that ensure appropriate safeguards are developed.</p>
<p>If Canada hopes to capitalize on the great potential of DNA data to improve the lives of Canadians, policymakers and stakeholders must determine how to maximize the benefits while minimizing the harm.</p>
<p>Just as it should have regarding the genetic discrimination law, Canada must take quicker action in the future to ensure its citizens benefit from the explosion of DNA data.</p>
<p class="fine-print"><em><span>Steven Lehrer receives funding from SSHRC. </span></em></p><p class="fine-print"><em><span>Weili Ding does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>
The rapid growth of genetic testing and data-gathering could revolutionize health and medicine if governments work to protect people against privacy and societal risks.
Steven Lehrer, Associate Professor of Economics, Queen's University, Ontario
Weili Ding, Associate professor, Queen's University, Ontario
Licensed as Creative Commons – attribution, no derivatives.
tag:theconversation.com,2011:article/78226
2017-07-28T03:08:36Z
2017-07-28T03:08:36Z
Storing data in DNA brings nature into the digital universe
<figure><img src="https://images.theconversation.com/files/179866/original/file-20170726-28585-x4xan9.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">The next frontier of data storage: DNA.</span> <span class="attribution"><a class="source" href="https://www.shutterstock.com/image-illustration/dna-molecules-binary-code-3d-render-255618778">ymgerman/Shutterstock.com</a></span></figcaption></figure>
<p>Humanity is producing data at an unimaginable rate, to the point that storage technologies can’t keep up. Every five years, the amount of data we’re producing increases <a href="https://www.emc.com/leadership/digital-universe/index.htm">10-fold</a>, including photos and videos. Not all of it needs to be stored, but manufacturers of data storage aren’t making hard drives and flash chips fast enough to <a href="http://www.seagate.com/our-story/data-age-2025/">hold what we do want to keep</a>. Since we’re not going to stop taking pictures and recording movies, we need to develop new ways to save them.</p>
<p>Over millennia, nature has evolved an incredible information storage medium – DNA. It evolved to store genetic information, blueprints for building proteins, but DNA can be used for many more purposes than just that. DNA is also much denser than modern storage media: The data on hundreds of thousands of DVDs could fit inside a <a href="https://dx.doi.org/10.1038/nature11875">matchbox-size package of DNA</a>. DNA is also much more durable – <a href="http://dx.doi.org/10.1002/anie.201411378">lasting thousands of years</a> – than today’s hard drives, which may last <a href="https://www.extremetech.com/computing/170748-how-long-do-hard-drives-actually-live-for">years or decades</a>. And while hard drive formats and connection standards become obsolete, DNA never will, at least so long as there’s life.</p>
<p>The idea of storing digital data in DNA is <a href="https://en.wikipedia.org/wiki/DNA_digital_data_storage">several decades old</a>, but recent work from <a href="http://dx.doi.org/10.1126/science.1226355">Harvard</a> and the <a href="https://dx.doi.org/10.1038/nature11875">European Bioinformatics Institute</a> showed that progress in modern DNA manipulation methods could make it both possible and practical today. Many research groups, including those at <a href="http://dx.doi.org/10.1002/anie.201411378">ETH Zurich</a>, the <a href="https://dx.doi.org/10.1038/srep14138">University of Illinois at Urbana-Champaign</a> and <a href="http://dx.doi.org/10.1126/science.aaj2038">Columbia University</a>, are working on this problem. Our <a href="http://misl.cs.washington.edu/">own group</a> at the University of Washington and Microsoft <a href="https://www.washington.edu/news/2016/07/07/uw-microsoft-researchers-break-record-for-dna-data-storage/">holds the world record</a> for the amount of data successfully stored in and retrieved from DNA – 200 megabytes.</p>
<h2>Preparing bits to become atoms</h2>
<p>Traditional media like hard drives, thumb drives or DVDs store digital data as 0s and 1s by changing the <a href="https://www.extremetech.com/computing/88078-how-a-hard-drive-works">magnetic</a>, <a href="http://computer.howstuffworks.com/flash-memory.htm">electrical</a> or <a href="https://www.pcmag.com/article2/0,2817,1820962,00.asp">optical properties</a> of a material.</p>
<p>To store data in DNA, the concept is the same, but the process is different. DNA molecules are long sequences of smaller molecules, called nucleotides – adenine, cytosine, thymine and guanine, usually designated as A, C, T and G. Rather than creating sequences of 0s and 1s, as in electronic media, DNA storage uses sequences of the nucleotides.</p>
<p>There are several ways to do this, but the general idea is to assign digital data patterns to DNA nucleotides. For instance, 00 could be equivalent to A, 01 to C, 10 to T and 11 to G. To store a picture, for example, we start with its encoding as a digital file, like a JPEG. That file is, in essence, a long string of 0s and 1s. Let’s say the first eight bits of the file are 01111000; we break them into pairs – 01 11 10 00 – which correspond to C-G-T-A. That’s the order in which we join the nucleotides to form a DNA strand. </p>
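<p>The mapping described above can be sketched in a few lines of Python. This is only an illustration of the example assignment (00 to A, 01 to C, 10 to T, 11 to G); the function names are ours, and real encoding schemes are more elaborate.</p>

```python
# Example assignment from the text: two bits per nucleotide.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "T", "11": "G"}


def bits_to_dna(bits: str) -> str:
    """Translate a string of 0s and 1s into nucleotides, two bits at a time."""
    if len(bits) % 2 != 0:
        raise ValueError("need an even number of bits")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def bytes_to_dna(data: bytes) -> str:
    """Write each byte as eight bits, then map the bits to nucleotides."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return bits_to_dna(bits)


# The eight bits from the text: 01 11 10 00 -> C G T A
print(bits_to_dna("01111000"))
```

<p>Because every nucleotide carries two bits, each byte of a file becomes four nucleotides under this assignment.</p>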
<p>Digital computer files can be quite large – <a href="https://softwareengineering.stackexchange.com/questions/332069/what-is-a-realistic-real-world-maximum-size-for-a-sqlite-database">even terabytes in size for large databases</a>. But individual DNA strands have to be much shorter – holding only about 20 bytes each. That’s because the longer a DNA strand is, the harder it is to build chemically. </p>
<p>So we need to break the data into smaller chunks, and add to each an indicator of where in the sequence it falls. When it’s time to read the DNA-stored information, that indicator will ensure all the chunks of data stay in their proper order.</p>
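<p>A minimal sketch of this chunking step follows. It assumes 20 bytes of data per strand, as mentioned above, plus a two-byte position indicator placed in front of each chunk; the indicator size and the function name are illustrative assumptions, not a description of any particular system.</p>

```python
def split_into_chunks(data: bytes, payload_size: int = 20, index_bytes: int = 2):
    """Split data into short chunks, each prefixed with its position.

    payload_size: data bytes per strand (about 20, per the text).
    index_bytes:  size of the position indicator (an assumption here;
                  two bytes can number at most 65,536 chunks).
    """
    chunks = []
    for position, start in enumerate(range(0, len(data), payload_size)):
        index = position.to_bytes(index_bytes, "big")  # where this chunk belongs
        chunks.append(index + data[start:start + payload_size])
    return chunks


# 50 bytes of sample data become three indexed chunks (20 + 20 + 10 bytes).
chunks = split_into_chunks(bytes(range(50)))
for chunk in chunks:
    print(chunk[:2].hex(), len(chunk) - 2, "data bytes")
```

<p>Each indexed chunk would then be translated into nucleotides and made as its own strand.</p>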
<p>Now we have a plan for how to store the data. Next we have to actually do it.</p>
<h2>Storing the data</h2>
<p>Once we have determined what order the letters should go in, the DNA sequences are manufactured letter by letter with chemical reactions. These reactions are driven by equipment that takes in bottles of A’s, C’s, G’s and T’s and mixes them in a liquid solution with other chemicals to control the reactions that specify the order of the letters in the physical DNA strands.</p>
<p>This process brings us another benefit of DNA storage: backup copies. Rather than making one strand at a time, the chemical reactions make many identical strands at once, before going on to make many copies of the next strand in the series.</p>
<p>Once the DNA strands are created, we need to protect them against damage from <a href="http://dx.doi.org/10.1002/anie.201411378">humidity and light</a>. So we dry them out and put them in a container that keeps them cold and blocks water and light. </p>
<p>But stored data are useful only if we can retrieve them later.</p>
<h2>Reading the data back</h2>
<p>To read the data back out of storage, we use a sequencing machine exactly like those used for analysis of <a href="https://en.wikipedia.org/wiki/DNA_sequencing">genomic DNA in cells</a>. This identifies the molecules, generating a letter sequence per molecule, which we then decode into a binary sequence of 0s and 1s in order. This process can destroy the DNA as it is read – but that’s where those backup copies come into play: There are many copies of each sequence.</p>
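<p>The decoding step can be sketched as follows, under simplifying assumptions: each read is error-free, uses the example mapping from earlier (A for 00, C for 01, T for 10, G for 11), and begins with a hypothetical two-byte position indicator (eight nucleotides). Real systems also need error correction, which this sketch leaves out.</p>

```python
BASE_TO_BITS = {"A": "00", "C": "01", "T": "10", "G": "11"}
INDEX_BYTES = 2  # assumed size of the position indicator


def dna_to_bytes(strand: str) -> bytes:
    """Translate nucleotides back into bytes (four nucleotides per byte)."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def decode_reads(reads):
    """Rebuild the original data from reads that arrive in any order."""
    chunks = {}
    for strand in reads:
        raw = dna_to_bytes(strand)
        position = int.from_bytes(raw[:INDEX_BYTES], "big")
        chunks[position] = raw[INDEX_BYTES:]  # duplicate copies simply overwrite
    return b"".join(chunks[p] for p in sorted(chunks))


# Three reads: chunk 1 appears twice (backup copies), chunk 0 once.
reads = ["AAAAAAACCGTC", "AAAAAAAACGTA", "AAAAAAACCGTC"]
print(decode_reads(reads))
```

<p>The position indicator is what lets the decoder restore the original order even though the sequencer returns molecules in no particular order.</p>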
<p>And if the backup copies get depleted, it is easy to make duplicate copies to refill the storage – just as nature <a href="https://en.wikipedia.org/wiki/Polymerase_chain_reaction">copies DNA all the time</a>.</p>
<p>At the moment, most DNA retrieval systems require reading all of the information stored in a particular container, even if we want only a small amount of it. This is like reading an entire hard drive’s worth of information just to find one email message. We have developed techniques – based on <a href="http://www.jstor.org/stable/1700278">well-studied biochemistry methods</a> – that let us <a href="https://doi.org/10.1101/114553">identify and read</a> only the <a href="http://dx.doi.org/10.1145/2872362.2872397">specific pieces of information</a> a user needs to retrieve from DNA storage.</p>
<h2>Remaining challenges</h2>
<p>At present, DNA storage is experimental. Before it becomes commonplace, it needs to be completely automated, and the processes of both building DNA and reading it must be improved. They are both prone to error and relatively slow. For example, today’s DNA synthesis lets us write a few <a href="https://synbiobeta.com/time-new-dna-synthesis-sequencing-cost-curves-rob-carlson/">hundred bytes per second</a>; a modern hard drive can write <a href="https://www.lifewire.com/what-are-read-and-write-speeds-2640236">hundreds of millions of bytes per second</a>. An average iPhone photo would take several hours to store in DNA, though it takes less than a second to save on the phone or transfer to a computer. </p>
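<p>The comparison above can be checked with simple arithmetic. The figures below are assumptions chosen to match the text: a photo of about 3 megabytes, DNA synthesis at 250 bytes per second ("a few hundred") and a hard drive at 200 million bytes per second.</p>

```python
def write_time_seconds(size_bytes: float, bytes_per_second: float) -> float:
    """Time needed to write a file at a constant rate."""
    return size_bytes / bytes_per_second


PHOTO_BYTES = 3_000_000          # assumed size of a typical phone photo
DNA_BYTES_PER_SEC = 250          # assumed: "a few hundred bytes per second"
DISK_BYTES_PER_SEC = 200_000_000 # assumed: "hundreds of millions" per second

dna_hours = write_time_seconds(PHOTO_BYTES, DNA_BYTES_PER_SEC) / 3600
disk_seconds = write_time_seconds(PHOTO_BYTES, DISK_BYTES_PER_SEC)

print(f"DNA synthesis: about {dna_hours:.1f} hours")
print(f"Hard drive:    about {disk_seconds:.3f} seconds")
```

<p>Under these assumptions the gap is roughly a factor of a million, which is why write speed is one of the main obstacles to practical use.</p>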
<p>These are significant challenges, but we are optimistic because all the relevant technologies are improving rapidly. Further, DNA data storage doesn’t need the perfect accuracy that biology requires, so researchers are likely to find even cheaper and faster ways to store information in nature’s oldest data storage system.</p>
<p class="fine-print"><em><span>Luis Ceze works for University of Washington and consults for Microsoft Research. He receives funding from Microsoft, NSF and DARPA.</span></em></p><p class="fine-print"><em><span>Karin Strauss works for Microsoft Research and is an affiliate faculty member at University of Washington. She is also a member of the Institute of Electrical and Electronics Engineers (IEEE), a member of the Association for Computing Machinery (ACM), and an Executive Committee member of the ACM's Special Interest Group on Computer Architecture (SIGARCH). </span></em></p>
Researchers who hold the world record for storing and retrieving data in DNA explain how the building blocks of life can be used to hold digital information as well.
Luis Ceze, Associate Professor of Computer Science and Engineering, University of Washington
Karin Strauss, Researcher in Computer Architecture, Microsoft Research; Affiliate Associate Professor of Computer Science and Engineering, University of Washington
Licensed as Creative Commons – attribution, no derivatives.