Scientific data should be shared: an open letter to the ARC

Science should be conducted outside its walled garden, in full view of the public. lovelihood

Science (real science, not the summaries in popular books and the media) is needlessly closed to the outside world. Worse, it is closed within itself, with every lab its own silo, and little sharing of data or materials.

Top researchers work frantically behind the scenes, but only occasionally do they share their results, by describing selected experiments in publications. This way of working may have been sensible in the paper past, but as the digital age advances it should not be the way we work now.

Recently one of us (MT) directed a fully open project in which all data and ideas were shared online. People completely unknown to the existing team came to the project and helped.

Some were employed by corporate laboratories but nevertheless provided significant inputs free of charge. This should not be surprising, as it is in our nature to help others solve problems. It is also in our nature to compete, and the open nature of the project yielded competition that helped achieve a common end. On this playing field, the greatest expertise could win out.

In an ongoing project involving open source drug discovery, a scientific dispute was settled by a discussion between an undergraduate and a professor in public view – the very best kind of educational practice.

Elsewhere in the world, a Dutch schoolteacher discovered a mysterious astronomical object by participating in the open GalaxyZoo project and was included as an author on the resulting publication.

More recently a high school student was able to devise an algorithm for detecting pancreatic cancer by reading free online scientific journals.

Our desire to see science thrive outside its walled garden and in full view of the public motivated us to write an open letter to the new CEO of the Australian Research Council (ARC).

That letter is included in full below. If readers agree with the letter, we invite them to add their signature to it by clicking on the link at the bottom of the letter. If a more cooperative, open research culture were actively encouraged by funding agencies, we believe science would be more efficient and powerful.

The way the ARC evaluates scientists inadvertently fuels the secretive competition we scientists can get caught up in. We focus on competing for publications in prestigious journals. But the ARC can change its policies to lead us towards openness.

The ARC could make data sharing a condition of the grants it gives us.

An open letter to Professor Aidan Byrne, CEO, the Australian Research Council

Dear Professor Byrne,

We know you have floated plans to broaden access to scientific publications supported by the ARC. We enthusiastically applaud such moves, as we believe that citizens and businesses should be able to access the science that their tax dollars pay for. We also believe the ARC should go beyond opening access to journal articles and take steps to increase the sharing of scientific data.

As science is currently practised, many research publications are more like press releases or executive summaries than detailed records of what was done and found. They are descriptions of the primary outcome the authors wish to reveal, often with insufficient supporting data for readers to validate the conclusions themselves.

As a result, science is facing a reproducibility crisis. Recent efforts to replicate published findings have yielded a low rate of success (Begley & Ellis, 2012; Prinz, Schlange & Asadullah, 2011). To remedy this, published reports need to give more details of what was actually done and found. We think these details should usually include the raw data.

Not only would such open science facilitate replication, but it would also result in further discoveries. For example, many more publications have arisen from people re-analysing Hubble Space Telescope data than from the scientists who originally acquired the data.

The cost of conducting the Human Genome Project is acknowledged to be dwarfed by its economic benefit to scientific productivity because the data are publicly available. For a physically more isolated country like Australia to participate in cutting-edge international science, the free sharing of research data is particularly important.

In the future, data sharing will result in discoveries beyond the capability of any one lab, as data can be mined by computer algorithms to seek trends and patterns that reveal hidden truths. In a recent report entitled Science as an Open Enterprise, the Royal Society wrote that: “Governments should recognise the potential of open data … to enhance the excellence of the science base. They should develop policies for opening up scientific data that complement policies for open government data, and support development of the software tools and skilled personnel that are vital to the success of both.”

It may be only through open science, with massively collaborative efforts, that urgent problems of the world can be solved. Open science can reduce wasteful duplication of efforts (because it is clear what everyone is doing) while maintaining the highest quality work (because everything is continually peer-reviewed in a competitive environment).

We’d like to close with an invitation. We are two members of a group organising a three-day conference in Auckland in early 2013 examining how open research fits the Australian/New Zealand context. We cordially invite you to attend this meeting, and would be delighted if you were able to deliver a keynote outlining your vision for the future practice of science.

Yours sincerely,

Matthew Todd

Alex O. Holcombe

These opinions and those of the other signatories do not necessarily reflect the positions of their institutions.

Academics at ANU, Harvard, Yale, and other universities have joined in signing our letter, together with people from outside the university sector.

See the names of the other signatories here and we hope you will consider adding your name to the bottom of the letter.

Join the conversation

36 Comments

  1. Craig Savage

    Professor of Theoretical Physics at Australian National University

    Awesome! I think one of the lessons of history is that openness = productivity and creativity.

    If you think research could be more open, how about education? This really goes on behind closed doors.

  2. Charles Driver

    Alex, Matthew - I am very much for open science, in an abstract, idealistic sense. I don't feel like I am aware enough of the practical realities that such an alteration to the system would be likely to encourage though. I imagine there could be some negatives, at least in the short term, or if other countries did not act similarly? A better understanding of this would do more to convince me to put my name down!

    1. Alex O. Holcombe

      Associate Professor, School of Psychology, and Australian Research Council Future Fellow at University of Sydney

      In reply to Charles Driver

      I think that greater openness of basic research is where science is going, and Australia being one of the first movers would actually be a boon, yielding an increase in collaborations and impact. However, it will depend on the field, and exceptions must be made for certain topics. In my view the largest short-term downside is the greater amount of work required by some of us to prepare data for sharing. So as the Royal Society wrote, research funders should support development of the software tools that will make sharing research data very easy, such as web-accessible electronic laboratory notebooks.

    2. John Canning

      Professor at University of Sydney

      In reply to Alex O. Holcombe

      Great thought-provoking article - it will require a sort of revolution in the field to be able to move away by the implications of open data repositories. As an editor of existing journals it’s obvious there is so much rubbish being put out there and one of the major challenges is finding enough reviewers willing to pass judgment on papers these days. The major journals in fact undermine peer review, and science, completely by selecting articles on what the editors think is of contemporary and high…

    3. Craig Motbey

      PhD candidate in Psychopharmacology at University of Sydney

      In reply to Alex O. Holcombe

      There may be other issues with animal research, however. Fully public access to animal study notes could result in groups like PETA using out-of-context images in their advertising campaigns and, worse than that, allow the psychotic nutters of groups like NIO to extend their campaigns of harassment and violence.

      The current closed-access scientific publishing system is a parasitic dinosaur that actively retards scientific progress, and the sooner it dies the better. It is entirely within our technical…

  3. Christopher Stevenson

    Associate Professor of Epidemiology at Deakin University

    In principle I fully support the idea that data arising from publicly funded research (eg via ARC or NHMRC grants) should be made freely available. However, we need to be careful how we implement this. I work in epidemiology and public health. The major data sources for our work include population surveys and clinical studies which include highly sensitive personal data provided to us on the basis of strict confidentiality. The release of raw data from these studies would be a violation of the study subjects’ trust. There are ways of making data less identifiable, but these also reduce the useability of the data for detailed analysis. Also, it is already difficult to recruit people into clinical trials and population health studies. If people knew that their personal data would be made generally available it would be even more difficult to find people willing to take part in such studies.

    1. Alex O. Holcombe

      Associate Professor, School of Psychology, and Australian Research Council Future Fellow at University of Sydney

      In reply to Christopher Stevenson

      Agreed. To handle such cases, I have heard about the idea (perhaps from Dan Simons?) of escrow accounts for data. With such a system, ethically sensitive data would be held by a third party who would share it with authorised scientists. In the current system, some journals such as those published by PLoS require that (de-identified) data be shared with other researchers, as a condition of publication. However, follow-up studies have revealed that the authors of those published studies, when contacted, rarely share the data even though they are required to. We need some way to guarantee compliance, and a third party (escrow) is one idea. Of course, for many studies such as those on rats, on electronics, and the environment, these ethical issues fortunately do not arise.

    2. Chris Booker

      Research scientist

      In reply to Christopher Stevenson

      These problems of identifiable information are not insurmountable. I would say that submission of raw data should definitely be mandatory upon publication. Obviously for human-subjects research this gets trickier with the need to maintain anonymity, but I don't think we should baulk at the idea and not pursue it.

  4. Chris Booker

    Research scientist

    This is fantastic, I'm all for it. I really believe we need a complete overhaul of the way science is published - the current models are simply no longer working (or never were, it's not clear which given, ironically, there's no research on it)

    1. John Canning

      Professor at University of Sydney

      In reply to Chris Booker

      In this open letter to the ARC Director, our colleagues have called for consideration similar to that given to open publications also to be given to “open science” – a term used to describe the supposed sharing of data from the bench worldwide. To justify their call, they provided some interesting and good examples of where some apparently useful results were obtained and demonstrate the potential for even citizens to participate. Certainly this latter aspect, where it is done with a different motivation…

    2. trevor prowse

      retired farmer

      In reply to Chris Booker

      Where science involves a later stage of patent rights, I cannot see scientists freely giving out their notes. Where it involves public funding alone, then the question arises as to at what stage information is released. I have visited agricultural departments, universities, agricultural research stations, the CSIRO all around Australia. My experience in most cases: wonderful cooperation. In some cases great progress was made. I think in nearly all contacts, I did not report the results, which probably gave scientists more freedom to discuss their work. In plant breeding the technology involves buying parts of the means to conduct that research. Public research has been reduced because of the cost, so joint research has been the only way public research could continue. This complicates the availability to report or divulge the methods and results. In plant breeding, the sharing of information between scientists has been productive.

  5. Richard Helmer

    Research Engineer

    in principle this is what happens and is indeed what i personally practice most within my limits...the time scale is perhaps currently long and level of filtering high... is this good/bad...not sure?... digital communications are drastically increasing the system speed, facilitating new collaboration, and so addressing some aspects of the productivity of the system as a whole... in general, once reliable/accepted methods of measurement are achieved 'science' [sic method] starts to become algorithmic…

  6. Christopher Stevenson

    Associate Professor of Epidemiology at Deakin University

    Another issue to consider is who would fund the maintenance and distribution of research data. Preparing data for open access and maintaining the resultant database are not trivial or cost-free exercises. How would we fund them? On the one hand, you could make it a user-pays exercise, but then data would only be available to researchers with the spare cash to purchase them. On the other hand, we could insist that ARC/NHMRC properly fund data distribution with each grant they fund. However, as we know, both ARC and NHMRC do not have anywhere near enough money to fund all the worthwhile projects submitted to them. So how many fewer funded research projects would we be willing to have in order to pay for data to be made freely available?

    1. John Canning

      Professor at University of Sydney

      In reply to Christopher Stevenson

      Yes indeed. Open access is upwards of 2K these days in science and 5K in Nature - and growing. Assuming most good researchers published between 3 and 10 papers a year, that is a considerable cost that is simply not covered by existing funding agencies. Repositories of data will no doubt be looking at similar levels.

    2. Alex O. Holcombe

      Associate Professor, School of Psychology, and Australian Research Council Future Fellow at University of Sydney

      In reply to John Canning

      Christopher: fortunately the Australian government has already been providing, through channels separate from the ARC, considerable funding for data infrastructure. I hope someone familiar with those initiatives and resources can comment on that. For my part, most recently we posted our data and code in the university web-accessible erepository, with no fuss. Although certainly the cost is not trivial, that infrastructure is needed anyway for disseminating other things. John: it's true that most…

    3. John Canning

      Professor at University of Sydney

      In reply to Alex O. Holcombe

      That is true - in our field the closest one is perhaps arXiv which is a repository that is in fact free (and the quality is surprisingly good overall - I suspect in part because it's not formally recognized by anyone and has no impact factor so it hasn't yet been swamped by the masses!). Depending on your community it is taken seriously and cited or it is ignored. It also has periodic debates on how to sustain itself and is currently supported by one institution so it will need to eventually charge I suspect. It is also not ranked making it vulnerable perhaps in terms of ongoing support. I totally agree the costs should not be as much as they are but for the major impact journals they largely are - not sure about PLoS and where its funding comes from but it's a very interesting case indeed especially if it's doing well on a sustainable basis.

  7. Kim Flintoff

    Much of this can already be enabled via the ANDS and NECTAR. The cloud services support for data sharing is already in place. Open Access data should be seen as the future of research, as should crowdsourced expertise and analysis...

    1. Matthew Todd

      School of Chemistry at University of Sydney

      In reply to Kim Flintoff

      Indeed, we're trying to work these resources into what we're doing. NECTAR in particular could be a very useful place for online electronic lab notebooks.

  8. rory robertson

    rory robertson is a Friend of The Conversation.

    former fattie

    Good on you Alex and Matthew, for pushing towards a more transparent and collaborative scientific process. I agree. But it would be good if your University of Sydney could get the basics right first. Like up versus down, the difference between valid and invalid indicators, and the need to correct published errors. In particular, my concern is that two of the University of Sydney's highest-profile scientists have published eye-popping errors - driving a spectacularly false and somewhat dangerous…

    1. John Canning

      Professor at University of Sydney

      In reply to rory robertson

      Hi Rory, interesting controversy you've raised - would open science make it worse or better? The main criticism I have of graphs 3 and 4 is that you don't mention they are averages for two or three particular years only. Solely focusing on these, it's always a bit risky to go beyond anything more than a speculative statement about correlation (up or down) based on graphs that have only two or three data points! Perhaps all the years in between it's been oscillating wildly in keeping with solar oscillations driving an increased thirst for sugar products and everyone has got it wrong?? How do you want to pay me the 40K???

    2. rory robertson

      rory robertson is a Friend of The Conversation.

      former fattie

      In reply to John Canning

      Thanks for your response, Professor Canning. I'm not sure if "open science" would have made this situation worse or better. What I do know for sure, however, is that simple scrutiny of basic facts shredded the credibility of the highest-profile paper ever written by the University of Sydney's highest-profile nutrition scientists. The Australian Paradox paper, in my opinion, is an academic disgrace.

      Professor Canning, I absolutely agree with you that "...it's always a bit risky to go beyond…

    3. John Canning

      Professor at University of Sydney

      In reply to rory robertson

      Hi Rory, I am not a nutrition expert to comment on the article - in general statistical correlations can be a touchy subject when topics are somewhat subjective and you need to understand these. I am certainly not an expert on how nutrition research is undertaken and I am sure you know more about the dangers of correlations having worked at the RBA! I merely made reference from a hard science perspective to the use of the plots since you offered 40K to anyone who could point out an error in your own case! Saying that the data trends upwards is to me as equally invalid as saying it trends downwards - so I reckon you owe me 40K!

    4. rory robertson

      rory robertson is a Friend of The Conversation.

      former fattie

      In reply to John Canning

      G'day again, John. This Australian Paradox dispute is not about nutrition or science, so there's no need to be sheepish about not being an "expert". This is a dispute about simple empirical facts. It is a dispute about what those five sets of data - some valid, some invalid - reveal; those five sets of data sitting there unacknowledged by the authors in the authors' own published charts.

      John, if you were inclined to say, "The quality of the available data is rather poor so we cannot conclude…

    5. Sean Lamb

      Science Denier

      In reply to rory robertson

      Mr Robertson, I also read your contributions on retractionwatch, so I will respond to both here.

      You may be right about sugar consumption; you probably are, for all I know or care. But the scientific literature is packed with papers that reach the wrong conclusions; being wrong is generally not grounds for retraction or anything else. It is an invitation for someone to submit a better study.
      One researcher estimated that 80% of medical studies were wrong (presumably not including his own).
      http…

    6. rory robertson

      rory robertson is a Friend of The Conversation.

      former fattie

      In reply to Sean Lamb

      Sean, I'm amazed that you, like many others, have defended the disputed paper on the basis that heaps of published papers are sloppy with false conclusions. If all the "scientists" behind Australian Paradox had done was write a sloppy academic paper and publish it in an obscure and little-respected pay-as-you-publish E-journal, Sean, I wouldn't have noticed.

      But Australian Paradox is not some obscure factually incorrect study that would not have seen the light of day but for me. In fact, the Heart…

    7. Sean Lamb

      Science Denier

      In reply to rory robertson

      Mr Robertson, you make me feel like the adult who finally tells his daughter there is no Santa Claus.

      I don't defend the paper at all. I simply state that being wrong is not grounds for not being published. In fact it is important that papers that are wrong are published, in order that better papers get published.

      It's the old Hegelian dialectic of thesis, antithesis, synthesis.
      Universities across Australia are full of people publishing dodgy research, why get excited about this particular one?

    8. Matthew Todd

      School of Chemistry at University of Sydney

      In reply to Sean Lamb

      I feel I need to call time on this discussion gents. I don't see an end in sight here, but I also think we're straying from the issue of the value of open data. Might you be able to continue on a site dedicated to this particular case?

  9. Sean Lamb

    Science Denier

    Being against open science is a little bit like being against kittens. I do my best to disagree with every proposition thrown up on The Conversation; this is my greatest challenge yet.

    Cooperation is good, but so is competition - and like sausages, sometimes the processes that end in a published paper are best not seen if you wish to fully trust the results.
    Having a quick look through the malaria page, to me it does not seem to be the true open science that the letter suggests http://malaria

    1. Matthew Todd

      School of Chemistry at University of Sydney

      In reply to Sean Lamb

      Sean - just to answer a couple of things:

      1) Either you were looking at recent entries, or maybe you didn't click on the titles of the experiments in the lab books? The idea is that everything is shared - all the raw data. e.g. http://www.ourexperiment.org/racemic_pzq or http://malaria.ourexperiment.org/tcmdc_ap/3787. If the data are not there, that's an error. There are no paper versions. Summaries are published separately on The Synaptic Leap or Google+. Project summary is updated on a wiki. But the publicly available ELN is the primary data source.

      2) There are two types of competition. One where you know what other people are doing, and one where you don't. I see the first as real competition and the second as a kind of federated monopoly. Open science encourages the first, open innovation is the second. A good example of the power of open competition is the Matlab contests http://www.mathworks.com.au/matlabcentral/contest/contests/3

    2. Sean Lamb

      Science Denier

      In reply to Matthew Todd

      Apologies - my mistake due to my crappy wireless connection, the files simply weren't loading. At work they looked fine. Open science is going to need good bandwidth.

      I think there is a lot of work needed on the actual format. It is not my field so it is difficult for me to judge, but some of the images of the blots or membranes just don't seem to have the same immediacy as pasting them into a lab book. But possibly that needs feedback from the users as to whether there is any tradeoff in…

    3. Katrina Badiola

      Student, University of Sydney

      In reply to Sean Lamb

      Sean - just on the ELNs,

      My very first lab notebook was in electronic form (http://www.ourexperiment.org/racemic_pzq/ starting around November 2011). I've recently been required to keep a traditional lab notebook as opposed to the ELN. Ease of use of a physical lab notebook? I'm finding keeping a paper book much more tedious than using an ELN for some of the reasons below:

      1) Paper has no search functionality.

      2) I personally find it easier to type than to write.

      3) Linking of experiments…

    4. Sean Lamb

      Science Denier

      In reply to Katrina Badiola

      "I'd use an ELN over a traditional lab notebook any day"

      Ah well, then you should have no trouble in persuading most of your fellow students at the University of Sydney to take them up.

      Let us know how you get on.

    5. Chris Booker

      Research scientist

      In reply to Katrina Badiola

      I'm with you there, Katrina - for all the reasons you cite, plus the fact that an electronic file can be backed up. I have known people who have lost their physical lab notebook and been left without records for months of experiments! It's crazy that in this day and age physical books are still more-or-less standard, when there's so much time and money tied up in what's in them. I don't think it's being very responsible as a scientist to use a system of record-keeping which is lacking such safeguards.

  10. Matthew Todd

    School of Chemistry at University of Sydney

    A general point of clarification, in response to some of the comments above which are focussing on open science (a process) rather than the central theme of open data (the theme of the article).

    Alex and I to some extent see the debate about open access as having reached a tipping point (ignoring for a moment how it should be funded, and what flavour it might be). The point we're advocating in the article above is that encouraging the deposition of open data is perhaps the more valuable long-term goal. Open data does not require open science (where data are released prior to publication, interactively). We used open science projects as one example. The more general point is the value of mandating open data associated with standard publications, i.e. that upon publication, the data associated with a project are made freely available - so that the data are a) comprehensive, b) digital and c) free to mine by computer.
