Over the past few years, there has been a growing awareness that many experimentally established “facts” don’t seem to hold up to repeated investigation.
This was highlighted in a 2010 article in the New Yorker entitled The Truth Wears Off, and since then there have been many popular press accounts of different aspects of science’s current reproducibility crisis.
Articles in recent days have discussed how the majority of scientists might be more interested in funding and fame than “truth” and are becoming increasingly reluctant to share unpublished details of their work.
So why exactly is science in such a crisis - and where do we start fixing it?
What caused the reproducibility crisis?
Behind these (somewhat arbitrary) triggers, however, are the same underlying causes: a combination of mechanised reporting of statistical results and publication bias towards “statistically significant” results.
So is the crisis a result of scientific fraud?
Not really. Well, maybe a bit. The number of known cases of outright fraud is very low. But what we might consider softer fraud — or “undisclosed flexibility” in data collection — is well documented and appears to be very widespread.
There can be little doubt that the “publish or perish” research environment fuels this fire. Funding bodies and academic journals that value “novelty” over replication deserve blame too.
While no-one knows the true level of undetected scientific fraud, the best way to deal with this problem is to increase the number of replication studies.
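To see how “undisclosed flexibility” in data collection can combine with a bias towards “statistically significant” results, consider the following minimal Python sketch. It is an illustration under stated assumptions, not an analysis from any of the studies discussed here: every simulated study measures pure noise (there is no real effect), significance is judged with a simple two-sided z-test at the 5% level, and the “flexible” researcher checks the data after every 10 observations and stops as soon as the result looks significant. The function names, sample sizes and checking schedule are all hypothetical choices made for the example.

```python
import math
import random


def is_significant(sample):
    """Two-sided z-test of mean == 0 at the 5% level, assuming known unit variance."""
    z = sum(sample) / math.sqrt(len(sample))
    return abs(z) > 1.96


def run_study(rng, max_n=100, look_every=None):
    """Simulate one study of pure noise; return True if it reports a 'significant' effect.

    look_every=None tests once, at max_n observations. Otherwise the data are
    tested after every `look_every` observations and collection stops at the
    first significant result (the undisclosed flexibility being illustrated).
    """
    sample = []
    for i in range(1, max_n + 1):
        sample.append(rng.gauss(0.0, 1.0))  # no true effect: mean 0, sd 1
        if look_every is not None and i % look_every == 0 and is_significant(sample):
            return True
    return is_significant(sample)


def false_positive_rate(n_studies=2000, look_every=None, seed=1):
    """Fraction of no-effect studies that nonetheless report significance."""
    rng = random.Random(seed)
    hits = sum(run_study(rng, look_every=look_every) for _ in range(n_studies))
    return hits / n_studies


fixed_rate = false_positive_rate(look_every=None)
peeking_rate = false_positive_rate(look_every=10)
print(f"fixed sample size: {fixed_rate:.3f}")
print(f"peeking every 10:  {peeking_rate:.3f}")
```

With a fixed sample size, roughly 5% of these no-effect studies come out “significant”, as the test is designed to allow. With repeated checking and early stopping, the rate is noticeably higher, even though no data have been fabricated. If journals then publish mainly the significant results, the published record over-represents exactly these false positives.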
How do we fix it?
In biomedicine, there’s the Reproducibility Initiative. It’s backed by the Science Exchange, the journal PLOS ONE, Figshare, and Mendeley. It will initially be accepting 40 to 50 studies for replication with the results of the studies to be published in PLOS ONE.
There are also various other proposals, such as:
- a “reproducibility index” for journals, similar to an impact factor
- changes to the regulations of funding bodies
- random audits.
The proposals and initiatives mentioned above draw attention to improving methodological protocols, and require a more thoughtful approach to statistical reporting practices.
We might broadly consider these to be issues of researcher integrity. But instruction in research ethics alone is unlikely to be sufficient. Enabling others to replicate studies published across all areas of science will also require changes in the way scientists prepare, submit and peer review journal articles, as well as changes in how science is funded.
This points to a new way of doing science, which can loosely be called “open science”. This could include new practices such as open peer review and open notebook science, and there are already platforms being developed to support these approaches.
Publishing computer source code and supporting data sets with academic articles will be an important change in making research more reproducible. This is a pressing issue with the increasing use of large data sets, computer simulation and sophisticated statistical analysis across many areas of science.
Although some fields of science have developed further in this direction than others, there has recently been a proliferation of services to support scientists publishing data and source code. These include Figshare, [RunMyCode](http://www.runmycode.org/), Dryad and the Dataverse Network.
In addition, there is currently a push to give researchers a greater incentive to publish their data by making scientific datasets citable contributions to the scholarly record, with associated journals such as GigaScience and Earth System Science Data.
While opportunities to share raw data associated with a journal publication are growing, currently only around 9% of articles do so.
Before we assume this is a moral failing on the part of the authors of these articles, we should consider that there are many practical hurdles involved. In many areas of science, researchers are not trained in data curation, version control of source code or other methodologies required for research to be replicable.
Meeting the challenge
Data sharing and other procedures outlined here can be time-consuming, and currently provide little academic reward. Instruction in these skills will eventually need to become part of mainstream science education.
Methodology and statistics courses are one obvious place for them to find a home. The ethics of the reproducibility and open science movements are hard to dispute, but success will depend on how well we rise to meet associated practical and pedagogic challenges.