
Big data can improve health but first we need to build the foundations

“What if we, as government, got out of the way and gave consumers full access to their own personalised health data and full control over how they choose to use it?” Health Minister Sussan Ley asked in her recent speech to the National Press Club.

Ley sketched out a new health landscape populated by consumers who shared their personal e-health records with app developers, dietitians and retailers in return for products and services tailored to their particular health needs.

“The great digital health revolution,” the minister concluded, “lies literally in the palms of consumers, rather than government.”

On one level this rings true. There have never been more ways to monitor our personal health and well-being, and to share and compare our findings. We can track our activity, diet, exercise, emotions and sleeping habits on our mobiles, Fitbits, Apple Watches and apps. We can even have our genomes sequenced.

And with Ley’s announcement we may now start to see a real upswing in people accessing and using their own medical record data.

But the availability of data is just the starting point – we then need to make sense of the data.


Conventionally, insights from health data have come from research studies that test hypotheses by systematically collecting and analysing data. Findings are published in scientific journals and pooled to determine the “bottom line” on any given health topic, using a process called “evidence synthesis”. This information is then used to create the guidelines and policies that shape health-care practices.

For instance, when pharmaceutical companies develop a new drug, they conduct a set of research studies and estimate the benefits and harms of the drug by combining data from these studies. They submit a summary of these findings – together with costs and comparisons with alternative drugs – to the government for marketing approval and reimbursement through the Pharmaceutical Benefits Scheme.

But the side effects of drugs are not always apparent at the time of marketing approval. This is because initial studies are often relatively small, with short follow-up times, selected (“clean”) study populations and a modest set of outcomes.

Systems that monitor drug effects in large and diverse populations over time can therefore add a lot to our understanding of a drug’s real effects. The Sentinel system in the US and CNODES in Canada, for instance, use electronic medical record data from millions of people to better understand benefits and harms of drugs.

An even clearer understanding of a drug’s effect might come from incorporating genetic data. US President Barack Obama’s Precision Medicine Initiative seeks to identify new ways to treat cancer and other conditions by analysing the genomes and electronic medical record data of a million people.

We can also capture additional data from social platforms where people contribute their own experiences with illnesses and treatments, such as Patients Like Me and Iodine.


The greatest value is often generated when different types of data are combined. Genomic data can be paired with medical record data, for example, or clinic data with Google search terms to track a fast-moving flu epidemic.

But current systems are not up to the task of combining and making sense of our increasingly rich and diverse data – from genomes to Facebook profiles. As my colleagues and I argue in the current issue of Nature, consumers and health professionals can’t make the most of the abundance of health data until we build systems to efficiently and reliably convert diverse data into knowledge.

Scientists need to work out:

  • why, when and how to combine different types of data
  • how each data source’s strengths and weaknesses can be taken into account
  • which technical systems can capture the required metadata (data about data).

Bringing big and diverse data together will require new methods and systems to be built by collaborations between computer scientists, health researchers, experts in evidence synthesis and others.

Importantly, these systems must be built in a way that enables consumer trust. In Ley’s “great digital health revolution”, new products and services will ingest our personal health data and suggest to us and our doctors what treatment we should take.

These suggestions will be delivered to us by “decision support” systems, which will inevitably be proprietary, complex and dynamic, such as those deployed within IBM Watson. How will we know whether to trust their recommendations?

Data may be the new oil of the 21st century, but it is not the engines, factories or transport systems. Their digital equivalents are being built now, and how we – individuals, corporations, governments and societies – build these systems, products and services will have far-reaching consequences, in health as in other sectors.
