Users’ genetic information was accessed during a hacker attack on 23andMe user accounts. (Shutterstock)

The 23andMe data breach reveals the vulnerabilities of our interconnected data

On Oct. 6, news broke that 23andMe, the genomics company that collects genetic material from thousands of people for ancestry and genetic predisposition tests, had suffered a massive data breach.

But as it turns out, the company’s servers were not hacked. Rather, hackers targeted hundreds of individual user accounts, reportedly those secured with reused passwords. After gaining access to those accounts, the hackers could leverage 23andMe’s “DNA relatives matches” feature to pull information about thousands of other users.

This data breach challenges how we think about privacy, data security and corporate accountability in the information economy.

Hackers targeted user passwords to access 23andMe’s user data.

Shared information

Genetic information databases have a notable feature: anyone’s DNA data also reveals information about others who share part of their genetic code with them. When someone sends a sample to 23andMe, the company gains genetic information about both that person and their relatives, even if those relatives didn’t send a sample or consent to any data collection. Their data is inevitably intertwined.

This isn’t just a characteristic of genetic data. Most data is about more than one person because data often describes shared features between people.

The ramifications of overlooking how personal data affects others extend to the entire information economy. Every individual choice about personal data has spillover effects on others. People are exposed to consequences — ranging from financial loss to discrimination — stemming from data practices that depend not only on information about themselves, but also on information about others.

User data-collection agreements can lead to indirect harm to third parties. For example, the negative impacts of the Cambridge Analytica scandal extended far beyond those whose data the company collected.

This predicament underscores the collective impact of individual data decisions.

Data analytics

Algorithms powered by artificial intelligence draw inferences by analyzing the relationships between data points. AI algorithms rely on databases containing information about multiple people to learn things about a particular person or a particular group.

Companies draw conclusions about people by analyzing data collected from others, making probabilistic assessments based on personal characteristics and relationships. Companies add information about people to their datasets daily. And the more people a dataset like the one built by 23andMe includes, the less someone’s choice to stay out of it matters.
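To see how such a probabilistic assessment can work, consider a deliberately simplified sketch in Python. It uses a single hypothetical genetic variant and basic Mendelian inheritance to show that an analyst can assign probabilities to the genotype of someone who never sent in a sample, using only their relatives’ data. This is an illustration of the general principle, not a description of 23andMe’s actual methods.

```python
from itertools import product

def child_genotype_distribution(parent1: str, parent2: str) -> dict[str, float]:
    """Probability of each genotype a child could inherit at one biallelic
    site, given both parents' genotypes (e.g. "Aa" and "Aa")."""
    counts: dict[str, int] = {}
    # Each parent passes on one of their two alleles with equal probability.
    for allele1, allele2 in product(parent1, parent2):
        genotype = "".join(sorted(allele1 + allele2))
        counts[genotype] = counts.get(genotype, 0) + 1
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

# A person who never sent a sample can still be assessed probabilistically
# if relatives did. "a" is a hypothetical risk variant; both parents are carriers.
print(child_genotype_distribution("Aa", "Aa"))
# -> {'AA': 0.25, 'Aa': 0.5, 'aa': 0.25}
```

In this toy case, the non-participating child can be assigned a 25 per cent probability of carrying two copies of the variant, purely from information other people chose to share.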

AI-powered algorithms analyze user information along with its connections and relationships to other people’s data. (Shutterstock)

Similarly, every time a user agrees to the collection, processing or sharing of personal information, that decision also affects others who share similarities with the user. These collective assessments make data processing profitable, such as through marketing, data sales and business decisions based on consumer behaviour.

Equity issues

The interconnected nature of data isn’t a coincidence — it’s at the core of how businesses operate in the information economy. This also creates equity issues.

In the 23andMe case, hackers are offering the assembled genetic information for sale in lists that include thousands of people. One list reportedly identifies people with Ashkenazi Jewish ancestry.

Individuals on that list now face an increased risk of discrimination or harassment, as the leaked data includes names and locations. Other information held by the company would let hackers build similar lists of people with a propensity for Type 2 diabetes, Parkinson’s disease or dementia (all of which 23andMe measures), exposing them to other harms, from raised insurance premiums to employment discrimination.

Data’s collective risks

We often fail to acknowledge the interconnected nature of data because we’re fixated on each individual. As a consequence, companies can exploit one person’s agreement to legitimize data practices involving others. Companies’ legal obligations to obtain individual agreements for data collection fail to recognize broader interests beyond those of the person who agreed.

We need privacy laws attuned to how the information economy works. Providing consent on behalf of others, as 23andMe users did when they clicked “I agree,” would be illegitimate under any meaningful notion of consent. To contain group data harms like those this hack produced, we need substantive rules about what companies can and can’t do.

Prohibitions on indiscriminate data collection and risky data uses would avoid leaving unsuspecting individuals as collateral damage. Because corporate data practices can affect everyone, companies’ safety obligations should extend to everyone too.

This is a corrected version of a story originally published on Oct. 22, 2023. The earlier story incorrectly said that 23andMe was owned by Google, that the data breach resulted from weak rather than reused passwords, and that it affected the public rather than users of 23andMe’s services who had opted into the “DNA relatives matches” feature.
