Statistics Canada recently released its 2016 census data, which depicts a Canada that is more diverse than ever before. Today, nearly a quarter of Canadians belong to a “visible minority” and 1.7 million Canadians are Indigenous.
Having only recently returned to this country after finishing my PhD at the University of Oxford, where I studied genome sequencing and its impact on clinical practice, I have been constantly reminded of our diversity. And yet I know that this diversity is still largely lacking from our genomic data.
Today, genomics is quickly becoming integrated into our health-care system. It is providing new targeted treatments for cancer, creating personalized drug regimens, and uncovering diagnoses for rare diseases that were previously genetic mysteries. However, the lack of diversity in existing genomic data limits the ability of ethnic minorities — including Indigenous Canadians — to benefit from these advances in health care. And this perpetuates the very inequalities that caused the problem in the first place.
White Canadians, incomplete databases
Our genomes are recipe books for who we are. Individual variation in our genes — like changes in individual recipes — can provide extremely useful health information. This includes information on our risk of developing serious diseases, ranging from cancers to fatal heart conditions.
Today, when a health-care provider suspects that someone may have one of these genetic illnesses, they are able to read the patient’s genome sequence from cover to cover. Due to decreasing costs, use of this technology has skyrocketed. Genome-based testing is helping provide diagnoses that were previously unimaginable, from new genetic causes of early-onset epilepsy and birth defects affecting the heart to new drivers of cancer.
An important step in the analysis process is determining how often a genetic “variant” occurs in healthy people. If a variant is too common in the wider population, it is unlikely to be harmful. Although we don’t have genome data for everyone, we do have extremely large databases of genome data from healthy individuals. We can use these as stand-ins for the wider population, to determine “normal” variation and which variants may cause problems.
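To make the frequency check concrete, here is a minimal sketch in Python. The function names, the counts and the one per cent cutoff are all assumptions chosen for illustration; real laboratories pick a cutoff based on how common and how severe the disease in question is.

```python
# Illustrative sketch only: the names, counts and 1% cutoff are assumptions,
# not values taken from any real database or clinical guideline.

def allele_frequency(allele_count: int, allele_number: int) -> float:
    """Fraction of sampled chromosomes that carry the variant."""
    if allele_number <= 0:
        raise ValueError("allele_number must be positive")
    return allele_count / allele_number


def too_common_to_be_harmful(allele_count: int, allele_number: int,
                             max_frequency: float = 0.01) -> bool:
    """True if the variant is seen too often in healthy people to be a
    plausible cause of a rare genetic disease."""
    return allele_frequency(allele_count, allele_number) > max_frequency


# A variant seen on 3 of 200,000 chromosomes stays a candidate.
rare_variant_filtered = too_common_to_be_harmful(3, 200_000)
# A variant seen on 5,000 of 200,000 chromosomes (2.5%) is set aside.
common_variant_filtered = too_common_to_be_harmful(5_000, 200_000)

print(rare_variant_filtered, common_variant_filtered)
```

The check is only as good as the reference it is run against: the answer depends entirely on whose chromosomes were counted.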
Unfortunately, while growing in size, these databases are imperfect. There is normal genetic variation between human groups of different geographical origin. DNA-based ancestry tests, like 23andMe, use these differences to help tell you where you are from.
Genomic databases, however, mostly contain data from individuals of European descent. As a result, the data reflects not the wider population, but the wider white population. This is a product of historical biases and inequalities, and has real consequences for patients from groups who aren’t well represented.
Misdiagnoses among African Americans
Hypertrophic cardiomyopathy is an inherited heart condition that can cause sudden cardiac death. Cases often make headlines when professional athletes die suddenly, like NBA-bound rising star Hank Gathers’ tragic death in 1990.
A study published last year, however, showed that some African Americans were being misdiagnosed with the disease. This was not a biological problem, but a social one. Individuals of African descent were under-represented in the databases used for comparison. Variants that health-care providers thought caused the disease, and were regularly used to provide patients with positive diagnoses, were actually found to be common among healthy African Americans.
Instead of causing disease, they were simply ethnic differences with no harmful effect whatsoever. For individuals of African descent, these variants were “normal.” Patients who didn’t actually have the condition were sent home with diagnoses of hypertrophic cardiomyopathy.
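The arithmetic behind this kind of error can be sketched in a few lines of Python. The counts below are invented purely for illustration and are not figures from the study; the group labels, the function names and the one per cent cutoff are likewise assumptions.

```python
# Invented counts for illustration only; these are NOT real database figures.
reference = {
    "european": {"carriers": 2, "chromosomes": 180_000},
    "african": {"carriers": 60, "chromosomes": 2_000},
}

CUTOFF = 0.01  # assumed threshold above which a variant is treated as benign


def pooled_frequency(ref: dict) -> float:
    """Frequency when every group is lumped into a single population."""
    carriers = sum(group["carriers"] for group in ref.values())
    chromosomes = sum(group["chromosomes"] for group in ref.values())
    return carriers / chromosomes


def highest_group_frequency(ref: dict) -> float:
    """Highest frequency observed within any single ancestry group."""
    return max(group["carriers"] / group["chromosomes"]
               for group in ref.values())


pooled = pooled_frequency(reference)          # 62 / 182,000
per_group = highest_group_frequency(reference)  # 60 / 2,000

print(f"pooled: {pooled:.5f}, looks rare: {pooled < CUTOFF}")
print(f"highest group: {per_group:.5f}, common in one group: {per_group > CUTOFF}")
```

With these made-up numbers, the variant looks rare when everyone is pooled, because the small African sample is swamped by the European one. Looked at within the African group alone, the same variant is common. If that group were missing from the reference altogether, there would be nothing to reveal the difference.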
In the sports industry, to avoid tragic deaths like Gathers’, athletes are often screened for these heart conditions. Many, such as Gathers and King McClure, have been diagnosed correctly. But experts have said that the careers of some future sports stars may have been wrongfully cut short on the basis of these harmless gene variants.
Uncertainty for those of Asian descent
This problem is not specific to this heart condition, but true of all genetic diseases diagnosed in the same way. The problem is also not specific to African Americans — the same is possible for all ethnic groups missing from these databases.
A recent study showed that individuals of Asian descent were much more likely than white individuals to receive uncertain results from genome sequencing. In those cases, their health-care providers could not give them a diagnosis through genome sequencing, as they more often could for white patients.
There is also a greater chance that these patients will receive unnecessary preventative treatment as a result, including drugs and even surgeries.
Indigenous peoples left behind
In recent years, with the increased use of genome sequencing, these databases have grown enormously in size. Through international collaboration, the Genome Aggregation Database now has sequence data for over 100,000 individuals.
This has dramatically improved our ability to analyse very rare variants, helping us to better diagnose the diseases they cause. The growth has included a push to increase representation of some groups: today individuals of African and Asian descent are much better represented than they were ten years ago.
Representation from other groups, however, is still lacking.
Notably, in Canada, genome sequence data from Indigenous peoples is almost entirely missing. This is particularly problematic, because individuals from these communities often present with genetic conditions unique to their communities. As a consequence, they are often extremely difficult to diagnose at the genetic level.
Without a genetic diagnosis, it is nearly impossible to predict who else in a family is at risk. This hinders our ability to prevent, manage and treat genetic diseases in these populations.
Disrupting the equilibrium
There is a way forward. Recruitment of individuals from under-represented groups, even in small numbers, can dramatically improve our ability to diagnose genetic disease. Collecting more diverse data will allow us to broaden our definition of “normal” variation beyond that of white Europeans.
To do so, it is essential that we acknowledge the biases of the past. We must work to understand and address the barriers (both systemic and cultural) to enrolling individuals from under-represented ethnic groups. Only by doing so can we collect the data needed to provide individuals in these groups with the same level of care as the rest of the population.
It is essential that we disrupt this equilibrium before it’s too late. We must all work together to diversify genomic databases so that everyone can benefit equally from the genetic revolution.