<h1>‘You only assess what you care about’: a new report looks at how we assess research in Australia</h1><figure><img src="https://images.theconversation.com/files/559197/original/file-20231113-17-hzq74f.jpg?ixlib=rb-1.1.0&rect=23%2C35%2C7880%2C5214&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="attribution"><a class="source" href="https://www.pexels.com/photo/photo-of-female-engineer-looking-through-wires-3862623/">ThisIsEngineering/Pexels</a>, <a class="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA</a></span></figcaption></figure><p>Research plays a pivotal role in society. Through research, we gain new understandings, test theories and make discoveries. </p>
<p>It also has a huge economic value. In 2021, the <a href="https://www.csiro.au/en/work-with-us/services/consultancy-strategic-advice-services/CSIRO-futures/Innovation-Business-Growth/Quantifying-Australias-returns-to-innovation">CSIRO found</a> every A$1 of research and development investment in Australia creates an average of $3.50 in economy-wide benefits. </p>
<p>But how do we know if individual research projects being conducted in Australia are good quality? How is research recognised? The key way this happens is through “research assessment”. </p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/tumult-and-transformation-the-story-of-australian-universities-over-the-past-30-years-215536">Tumult and transformation: the story of Australian universities over the past 30 years</a>
</strong>
</em>
</p>
<hr>
<h2>What is research assessment?</h2>
<p>Research assessment is not a centralised or necessarily formal process. It can involve various processes and measures to evaluate the performance of individual researchers and research institutions. This includes assessing the quality, excellence and impact of various outputs. </p>
<p>Research assessment can be qualitative or quantitative. Measures can include publications in journals and the number of times the research is cited, success in gaining grants for further research, commercialisation, media engagement, impact on decision-making or public policy, prizes, and invitations to speak at conferences. </p>
<p>If research assessment is working fairly and effectively, it should achieve several things: helping to develop researchers’ careers, making sure innovative research is not avoided in favour of short-term gains, and giving funders and the community confidence that research provides value for money and adds to the public good. </p>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/we-solve-problems-in-30-days-through-research-sprints-other-academics-can-do-this-too-204373">We solve problems in 30 days through 'research sprints': other academics can do this too</a>
</strong>
</em>
</p>
<hr>
<h2>Our project</h2>
<p>Our new project aimed to provide a better understanding of how research assessment affects research in Australia. </p>
<p>In a <a href="https://acola.org/wp-content/uploads/2023/11/ACOLA_ResearchAssessment_FINAL.pdf">report released today</a>, we present findings from a survey of more than 1,000 Australian researchers and more than 50 research organisations. </p>
<p>This included universities, research institutes, industry bodies, government and not-for-profit organisations. The majority of researchers (74%) were in academic roles. Across those research sectors, we also conducted 11 roundtables involving around 120 people and 25 intensive interviews to understand the issues.</p>
<p>This work was commissioned by Chief Scientist Cathy Foley and conducted by the Australian Council of Learned Academies (involving the academies of science, medical science, engineering and technological sciences, social sciences and humanities). </p>
<p>It also comes as the <a href="https://www.education.gov.au/australian-universities-accord">Universities Accord review</a> examines how research is funded and approached within higher education. </p>
<figure class="align-center ">
<img alt="A young man searches the shelves of a library." src="https://images.theconversation.com/files/559200/original/file-20231114-21-9oaq62.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/559200/original/file-20231114-21-9oaq62.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/559200/original/file-20231114-21-9oaq62.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/559200/original/file-20231114-21-9oaq62.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/559200/original/file-20231114-21-9oaq62.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/559200/original/file-20231114-21-9oaq62.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/559200/original/file-20231114-21-9oaq62.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Research assessment should help to develop researchers’ careers.</span>
<span class="attribution"><a class="source" href="https://www.pexels.com/photo/male-student-searching-at-book-shelves-6549376/">Tima Miroshnichenko/Pexels</a>, <a class="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA</a></span>
</figcaption>
</figure>
<h2>What we found</h2>
<p>We found some difficulties with the current approach to research assessment. </p>
<p>We heard there is a tendency among some researchers to “play it safe” by doing research they believe will score well. We also heard how the assessment process can unintentionally exclude or devalue particular forms of knowledge, particularly in the humanities and the social sciences, where outputs can be less easily quantified or less immediately visible.</p>
<p>As one interviewee said: </p>
<blockquote>
<p>What is assessed and how it is assessed are an indication of what the
organisation values. You only assess what you care about. Values and
culture drive assessment.</p>
</blockquote>
<p>Our roundtables told us senior staff and supervisors are often seen to reinforce the culture of “publish or perish”, with the number of articles being valued more highly than the quality. </p>
<p>We heard early and mid-career researchers and people from underrepresented backgrounds can have difficulties trying to “play the game” to advance their careers. For example, early-career researchers are often expected to produce work that benefits their larger team, at a cost to their own capacity for promotion. </p>
<p>As one interviewee noted: </p>
<blockquote>
<p>Metrics are essential for defining value and comparative difference, but
Australia requires a modern and fair framework for assessing our current
and next generation of researchers.</p>
</blockquote>
<h2>Survey results</h2>
<p>Our survey found a high level of dissatisfaction with the state of research assessment. This included: </p>
<ul>
<li><p>73% of respondents agreed assessment processes are not consistently or
equitably applied across disciplines, in particular between the humanities and the sciences </p></li>
<li><p>67% said there are not enough opportunities to provide input into research assessment practices</p></li>
<li><p>70% said assessments took up unreasonable time and effort. </p></li>
</ul>
<hr>
<p>
<em>
<strong>
Read more:
<a href="https://theconversation.com/fieldwork-can-be-challenging-for-female-scientists-here-are-5-ways-to-make-it-better-214215">Fieldwork can be challenging for female scientists. Here are 5 ways to make it better</a>
</strong>
</em>
</p>
<hr>
<h2>The way forward</h2>
<p>In our survey, we asked “What is one specific change you would
recommend to improve current research assessment processes?”.</p>
<p>Respondents wanted to see a shift towards quality over quantity. This means not just a focus on publishing as many papers as possible, but supporting research that may take longer for its value and benefits to emerge. </p>
<p>They wanted interdisciplinary research to be promoted and rewarded, because many of the complex problems of our world – from climate change to domestic violence to housing affordability – require multiple disciplines to be involved in finding solutions. In the same vein, they also wanted collaboration and team work to be rewarded more clearly and transparently. </p>
<p>They wanted less bias towards STEM (science, technology, engineering and maths) research and more promotion of diversity and of early-career researchers. This included better understanding of their personal and cultural situation, more focused career development and better managed teamwork.</p>
<p>To achieve all of this, and more, we will also need to understand that no single measure can assess all research or researchers. So, several tools will be needed, including quantitative indicators as well as qualitative measures and peer review.</p>
<hr>
<p><em>Ana Deletic, Louisa Jorm, Duncan Ivison, Robyn Owens, Jill Blackmore, Adrian Barnett, Kate Thomann, Caroline Hughes, Andrew Peele, Guy Boggs and Raffaella Demichelis were all part of the expert working group supporting this work.</em></p>
<p class="fine-print"><em><span>Kevin McConkey has previously received funding from the Australian Research Council. He is the current chair of the Policy Committee of the Academy of the Social Sciences in Australia. He is the chair of the Expert Working Group of the Australian Council of Learned Academies, which prepared the report referred to in this article.</span></em></p>
<p class="fine-print"><em>Kevin McConkey, Emeritus Professor, UNSW Sydney. Licensed as Creative Commons – attribution, no derivatives.</em></p>
<h1>Five things to consider when designing a policy to measure research impact</h1><figure><img src="https://images.theconversation.com/files/152344/original/image-20170111-29019-qdkzu6.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">What’s the best way to measure research impact?</span> <span class="attribution"><span class="source">from www.shutterstock.com</span></span></figcaption></figure><p>This year will see the Australian government <a href="http://www.arc.gov.au/news-media/media-releases/2017-pilot-test-impact-business-engagement-researchers">pilot</a> new ways to measure the impact of university research. </p>
<p>As recommended by the <a href="https://theconversation.com/watt-report-suggests-financial-incentives-for-measuring-research-impact-51815">Watt Review</a>, the Engagement and Impact Assessment will encourage universities to ensure academic research produces wider economic and social benefits. </p>
<p>This fits into the <a href="https://theconversation.com/will-the-national-innovation-and-science-agenda-deliver-australia-a-world-class-national-innovation-system-52081">National Innovation and Science Agenda</a>, in which taxpayer funds are targeted at research that will have a beneficial future impact on society. </p>
<p>Education Minister <a href="https://ministers.education.gov.au/birmingham/2017-pilot-test-impact-business-engagement-researchers">Simon Birmingham said</a> the pilots will test</p>
<blockquote>
<p>“how to measure the value of research against things that mean something, rather than only allocating funding to researchers who spend their time trying to get published in journals”. </p>
</blockquote>
<p>This move to measure the non-academic impact of research introduces many new challenges that were not previously relevant when evaluation focused solely on academic merit. <a href="http://dx.doi.org/10.1080/1360080X.2016.1254429">New research</a> highlights some of the key issues that need to be addressed when deciding how to measure impact.</p>
<h2>1. What should be the object of measurement?</h2>
<p>Research impact evaluations need to trace a connection between academic research and “real world” impact beyond the university campus. These connections are enormously diverse and specific to a given context. They are therefore best captured through case studies. </p>
<p>When analysing a case study the main issues are: what counts as impact, and what evidence is needed to prove it? When considering this, Australian policymakers can use recent <a href="http://dx.doi.org/10.1080/10564934.2016.1237703">European examples</a> as a benchmark. </p>
<p>For instance, in the UK’s <a href="http://www.ref.ac.uk/">Research Excellence Framework</a> (REF) – which assesses the quality of academic research – the only impacts that can be counted are those directly flowing from academic research submitted to the same REF exercise. </p>
<p>To confirm the impact, the beneficiaries of research (such as policymakers and practitioners) are required to provide written evidence. This creates a narrow definition of impact, because impacts that cannot be verified, or that do not flow from submitted research outputs, do not count. </p>
<p>This has been a cause of frustration for some UK researchers, but the high threshold does ensure the impacts are genuine and flow from high quality research.</p>
<h2>2. What should be the timeframe?</h2>
<p>There are unpredictable time lapses between academic work being undertaken and it having impact. Some research may be quickly absorbed and applied, whereas other impacts, particularly those from basic research, can take decades to emerge. </p>
<p>For example, <a href="http://journals.sagepub.com/doi/full/10.1258/jrsm.2011.110180">a study looking at time lags in health research</a> found the time lag from research to practice to be on average 17 years. It should be noted, though, that time lapses vary considerably by discipline.</p>
<p>Only in hindsight can the value of some research be fully appreciated. Research impact assessment exercises therefore need to be set to a particular timeframe. </p>
<p>Here, policymakers can learn from previous trials such as one conducted by <a href="https://www.go8.edu.au/sites/default/files/docs/eia_trial_guidelines_final_mrb.pdf">Australian Technology Network and Group of Eight in 2012</a>. This exercise allowed impacts related to research that occurred during the previous 15 years.</p>
<h2>3. Who should be the assessors?</h2>
<p>It is a long established convention that academic excellence is decided by academic peers. Evaluations of research are typically undertaken by panels of academics. </p>
<p>However, if these evaluations are extended to include non-academic impact, the views of end-users of research – people outside academia – may also need to be included in the evaluation of academic research. </p>
<p>In the 2014 UK REF, over 250 “<a href="http://www.ref.ac.uk/about/users/">research users</a>” (individuals from the private, public or charitable sectors) were recruited to take part in the evaluation process. However, their involvement was restricted to assessing the impact component of the exercise. </p>
<p>This option is an effective compromise between maintaining the principle of academic peer review of research quality while also including end-users in the assessment of impact.</p>
<h2>4. What about controversial impacts?</h2>
<p>In many instances the impact of academic research on the wider world is a positive one. But some impacts are controversial – such as fracking, genetically modified crops, nanotechnologies in food, and stem cell research – and these need to be carefully considered. </p>
<p>Such research may have considerable impact, but in ways that make it difficult to establish a consensus on how scientific progress impacts “the public good”. Research such as this can trigger societal tensions and ethical questions. </p>
<p>This means that impact evaluation also needs to consider non-economic factors, such as quality of life, environmental change and public health – even though it is difficult to place dollar values on these things.</p>
<h2>5. When should impact evaluation occur?</h2>
<p>Impact evaluation can occur at various stages in the research process. For example, a funder may invite research proposals where the submissions are assessed based on their potential to produce an impact in the future. </p>
<p>An example of this is the European Research Council <a href="https://erc.europa.eu/proof-concept">Proof of Concept Grants</a>, where researchers who have already completed an ERC grant can bid for follow-on funding to turn their new knowledge into impacts.</p>
<p>Alternatively, impacts flowing from research can be assessed in a retrospective evaluation. This approach identifies impacts where they already exist and rewards the universities that have achieved them. </p>
<p>An example of this is the <a href="http://www.vsnu.nl/en_GB/sep-eng.html">Standard Evaluation Protocol</a> (SEP) used in the Netherlands, which assesses both the quality of research and its societal relevance.</p>
<p>A novel feature of the proposed Australian system is the assessment of both <a href="https://theconversation.com/when-measuring-research-we-must-remember-that-engagement-and-impact-are-not-the-same-thing-56745">engagement and impact</a>, as two distinctive things. This means there isn’t one international example to simply replicate. </p>
<p>Although Australia can learn from some aspects of evaluation in other countries, the Engagement and Impact Assessment pilot is a necessary stage to trial the proposed model as a whole. </p>
<p>The pilot – which will test the suitability of a wide range of indicators and methods of assessment for both research engagement and impact – means the assessment can be refined before a planned national rollout in 2018.</p>
<p class="fine-print"><em><span>Andrew Gunn receives funding from Worldwide Universities Network, the British Council (administering the Newton Fund), the UK Higher Education Academy, the United Kingdom Political Studies Association, the New Zealand Political Studies Association and the UK Quality Assurance Agency. Andrew Gunn concurrently holds visiting academic positions internationally.</span></em></p>
<p class="fine-print"><em><span>Michael Mintrom does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>
<p class="fine-print"><em>Andrew Gunn, Researcher in Higher Education Policy, University of Leeds; Michael Mintrom, Professor of Public Sector Management, Monash University. Licensed as Creative Commons – attribution, no derivatives.</em></p>
<h1>Stern review says little about how REF has affected teaching</h1><figure><img src="https://images.theconversation.com/files/140173/original/image-20161003-20239-1nb5ain.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Ruthless competition among academics could leave many scarred and deflated.</span> <span class="attribution"><span class="source">shutterstock/ArtFamily</span></span></figcaption></figure><p>In the super-inflated market for star footballers, there is one thing a striker cannot do: move his winning goals to his new club. Not so in the almost equally inflated market for “star academics”. Here, researchers can transfer the credit for their publications to a new employer, at least for the purposes of the <a href="http://www.ref.ac.uk/">Research Excellence Framework</a>, the process of evaluating and ranking all university departments in the UK. </p>
<p>This is just one of the system’s absurdities that the long-awaited <a href="https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/541338/ind-16-9-ref-stern-review.pdf">review by Lord Nicholas Stern</a> wants to put right. The review, which was published over the summer, made a few well-aimed recommendations to correct the most obvious anomalies of the Research Excellence Framework, or REF. The REF is a five- or six-yearly evaluation of the research quality of each and every university department in the UK by panels of experts made up mostly of academics themselves. A great deal hangs on these evaluations, including institutional reputations, individual academics’ careers, student recruitment and opportunities for further research funding. </p>
<p>Stern proposed that universities stop deciding which and how many of their research staff to submit for evaluation. Being left out of the REF has long been a source of <a href="https://theconversation.com/stress-put-on-academics-by-the-ref-recognised-in-stern-review-63237">fear and anxiety among academics</a>: exclusion is tantamount to a statement that their research does not make the cut, and that including them would drag down their department’s ranking. For those left out, the implied penalty has been increased teaching and administrative loads. </p>
<p>The system has also led to <a href="http://www.yiannisgabriel.com/2014/12/ref2014-results-hideous-pecking-order.html?q=ref">grotesque gaming</a> in which universities that include a larger proportion of their staff in the evaluation suffer in the rankings. Declaring only a minority of star researchers generally raises the position of a department in the rankings, which take no notice of those left out of the REF. </p>
<p>Including every academic in the REF would substantially increase the costs of the exercise. And the REF’s <a href="https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/541338/ind-16-9-ref-stern-review.pdf">cost is an obvious concern to Stern</a>, who estimated that the last exercise (2014) cost £246m to evaluate 52,000 academics out of a possible 145,000. At an average of nearly £5,000 to evaluate each submitted researcher – paid by the taxpayer – the REF is not cheap. But Stern wishes to leave this cost unchanged even though evaluating every eligible academic would actually increase the costs. </p>
<p>To forestall this, he proposes reducing the number of publications per faculty member to an average of two instead of the current fixed four. This is one of the report’s most significant and under-reported ideas – and as yet, its implications are unclear. Consider, for example, a department with one prolific researcher who submits six publications in prestigious journals and six less distinguished colleagues who submit one publication each. Would such a department be evaluated on the same basis as a department of six academics each submitting two publications? No one knows. </p>
<h2>Marginal improvements</h2>
<p>The review’s attempt to leave the cost of the REF unchanged is typical of its timidity. It offers some recommendations for marginal improvements rather than a considered assessment of the overall effects of the REF. Though this timidity is surprising – given the boldness of Stern’s earlier <a href="http://www.wwf.se/source.php/1169157/Stern%20Report_Exec%20Summary.pdf">report on the economics of climate change</a> – it is not a total surprise, given that the membership of his steering group included many of the great and the good in higher education but no union, professional or student representation. Such representation might have led to a better recognition of the hidden costs, especially the time taken away from teaching and pastoral care of students and the emotional costs of academics’ single-minded obsession with publications. </p>
<figure class="align-center ">
<img alt="" src="https://images.theconversation.com/files/140180/original/image-20161003-20196-1hj1pjv.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/140180/original/image-20161003-20196-1hj1pjv.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/140180/original/image-20161003-20196-1hj1pjv.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/140180/original/image-20161003-20196-1hj1pjv.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/140180/original/image-20161003-20196-1hj1pjv.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/140180/original/image-20161003-20196-1hj1pjv.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/140180/original/image-20161003-20196-1hj1pjv.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Another day in the lab.</span>
<span class="attribution"><span class="source">Shutterstock</span></span>
</figcaption>
</figure>
<p>What the review refuses to notice is the extent to which the REF has turned academic research from a vocation to pursue knowledge and scholarship into a tyrannical game of “hits” in “top journals”. This has contributed to a massive growth in the numbers of research journals, with about 250 new ones starting every year. The number of published articles has also <a href="http://ocs.library.utoronto.ca/index.php/Elpub/2008/paper/view/689/0">ballooned to over a million a year</a>. Yet most of them languish unread <a href="http://www.sciencedirect.com/science/article/pii/S1751157714000959">and uncited</a>.</p>
<h2>Ruthless competition</h2>
<p>This overproduction of research papers – most of them meaningful only to tiny academic tribes – is costly well beyond the costs of the REF. It reduces teaching to a second-class activity compared to “research”. It prevents academics from dedicating more time and care to their students – and it stops them from reading works which have something original and meaningful to say. </p>
<p>It also increases the costs of higher education which are ultimately paid by taxpayers, students and their parents. It promotes ruthless competition among academics which leaves many of them <a href="http://www.yiannisgabriel.com/2015/02/the-times-higher-education-best.html">scarred and deflated</a>. On top of that, it fosters the growth of adjunct staff on low salaries and precarious work conditions. It exacerbates inequalities between star researchers and their “ordinary” peers – <a href="https://theconversation.com/the-summer-when-working-in-a-british-university-lost-its-global-appeal-63431">depressing the earnings of the latter</a> in order to boost those of the former. And it breeds endless self-promotion and hype among universities and their departments.</p>
<figure class="align-center ">
<img alt="" src="https://images.theconversation.com/files/140177/original/image-20161003-20223-wrl50.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/140177/original/image-20161003-20223-wrl50.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=400&fit=crop&dpr=1 600w, https://images.theconversation.com/files/140177/original/image-20161003-20223-wrl50.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=400&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/140177/original/image-20161003-20223-wrl50.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=400&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/140177/original/image-20161003-20223-wrl50.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=503&fit=crop&dpr=1 754w, https://images.theconversation.com/files/140177/original/image-20161003-20223-wrl50.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=503&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/140177/original/image-20161003-20223-wrl50.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=503&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px">
<figcaption>
<span class="caption">Academics have less time to care for their students.</span>
<span class="attribution"><span class="source">Shutterstock</span></span>
</figcaption>
</figure>
<p>On the burning issue of how the REF has affected teaching, pastoral care of students, academic citizenship and general scholarship, Stern’s review remains largely silent, happy with some pious platitudes on the need to harmonise the REF with the forthcoming <a href="http://www.hefce.ac.uk/lt/tef/">Teaching Excellence Framework</a>, destined to become its equally dysfunctional twin. And in doing so, Stern colludes with a state of affairs, exacerbated by the REF, that harbours grave troubles for institutions, students, graduates and society at large.</p>
<p>Higher education may not engender the same dangers as climate change, but the current system is unsustainable. It is an expensive and wasteful system with few winners and many losers. It is a system that deters many bright graduates from pursuing academic careers. It is a costly system that puts the survival of numerous university departments and potentially entire institutions at risk. And it must be said that the quality of teaching offered to students is at best uneven – as is the value of their qualifications. </p>
<p>Ultimately, all of this calls for a major rethink of how university research is conducted and rewarded. Unfortunately, rather than the wholesale overhaul required, Stern seems satisfied with merely tinkering at the edges.</p>
<p class="fine-print"><em><span>Yiannis Gabriel works for the University of Bath and the University of Lund.</span></em></p>
<p class="fine-print"><em>Yiannis Gabriel, Professor of Organizational Theory, University of Bath. Licensed as Creative Commons – attribution, no derivatives.</em></p>
<h1>Why isn’t science better? Look at career incentives</h1><figure><img src="https://images.theconversation.com/files/138450/original/image-20160920-11131-1alomb3.jpg?ixlib=rb-1.1.0&rect=49%2C65%2C5289%2C3660&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Experiment design affects the quality of the results.</span> <span class="attribution"><a class="source" href="https://www.flickr.com/photos/iaea_imagebank/8147632150">IAEA Seibersdorf Historical Images</a>, <a class="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA</a></span></figcaption></figure><p>There are often substantial gaps between the idealized and actual versions of those people whose work involves providing a social good. Government officials are supposed to work for their constituents. Journalists are supposed to provide unbiased reporting and penetrating analysis. And scientists are supposed to relentlessly probe the fabric of reality with the most rigorous and skeptical of methods. </p>
<p>All too often, however, what should be just isn’t so. In a number of scientific fields, <a href="https://www.washingtonpost.com/news/speaking-of-science/wp/2015/08/28/no-sciences-reproducibility-problem-is-not-limited-to-psychology/">published findings turn out not to replicate</a>, or to have smaller effects than, what was initially purported. Plenty of science does replicate – meaning the experiments turn out the same way when you repeat them – but the amount that doesn’t is too much for comfort.</p>
<p>Much of science is about identifying relationships between variables. For example, how might certain genes increase the risk of acquiring certain diseases, or how might certain parenting styles influence children’s emotional development? To our disappointment, there are no tests that allow us to perfectly sort true associations from spurious ones. Sometimes we get it wrong, even with the most rigorous methods.</p>
<p>But there are also ways in which scientists increase their chances of getting it wrong. Running studies with small samples, mining data for correlations and forming hypotheses to fit an experiment’s results after the fact are <a href="http://fivethirtyeight.com/features/science-isnt-broken/">just some of the ways</a> to <a href="http://doi.org/10.1038/526182a">increase the number of false discoveries</a>. </p>
<p>It’s not like we don’t know how to do better. Scientists who study scientific methods have known about <a href="http://doi.org/10.1086/288135">feasible remedies for decades</a>. Unfortunately, their advice often falls on deaf ears. Why? Why aren’t scientific methods better than they are? In a word: incentives. But perhaps not in the way you think. </p>
<h2>Incentives for ‘good’ behavior</h2>
<p>In the 1970s, <a href="https://en.wikipedia.org/wiki/Campbell%27s_law">psychologists</a> and <a href="https://en.wikipedia.org/wiki/Goodhart%27s_law">economists</a> began to point out the danger in relying on quantitative measures for social decision-making. For example, when public schools are evaluated by students’ performance on standardized tests, teachers respond by teaching “to the test” – at the expense of broader material more important for critical thinking. In turn, the test serves largely as a measure of how well the school can prepare students for the test.</p>
<p>We can see this principle – often summarized as “when a measure becomes a target, it ceases to be a good measure” – playing out in the realm of research. Science is a competitive enterprise. There are <a href="http://doi.org/10.1038/520144a">far more credentialed scholars and researchers</a> than there are university professorships or comparably prestigious research positions. Once someone acquires a research position, there is additional competition for tenure, grant funding, and support and placement for graduate students. Due to this competition for resources, scientists must be evaluated and compared. How do you tell if someone is a good scientist?</p>
<p>An oft-used metric is the number of publications one has in peer-reviewed journals, as well as the status of those journals (along with related metrics, such as the <a href="https://en.wikipedia.org/wiki/H-index"><em>h</em>-index</a>, which combines how much a researcher publishes with how often that work is cited by others). Metrics like these make it straightforward to compare researchers whose work may otherwise be quite different. Unfortunately, this also makes these numbers susceptible to exploitation. </p>
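<p>To make the metric concrete: a researcher has an <em>h</em>-index of <em>h</em> if <em>h</em> of their papers have each been cited at least <em>h</em> times. Here is a minimal sketch in Python (the function name and sample citation counts are illustrative, not drawn from any real researcher):</p>
<pre><code class="language-python">
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)  # most-cited papers first
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        # h grows as long as the paper in position `rank` has at least
        # `rank` citations; once one falls short, no larger h is possible.
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Example: five papers cited 10, 8, 5, 4 and 3 times give an h-index of 4.
print(h_index([10, 8, 5, 4, 3]))  # prints 4
</code></pre>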
<p>If scientists are motivated to publish often and in high-impact journals, we might expect them to actively try to game the system. And certainly, some do – as seen in recent high-profile cases of scientific fraud (including in <a href="https://en.wikipedia.org/wiki/Sch%C3%B6n_scandal">physics</a>, <a href="http://www.nytimes.com/2013/04/28/magazine/diederik-stapels-audacious-academic-fraud.html">social psychology</a> and <a href="http://onlinelibrary.wiley.com/doi/10.1111/bcp.12992/full">clinical pharmacology</a>). If malicious fraud is the prime concern, then perhaps the solution is simply heightened vigilance.</p>
<p>However, most scientists are, I believe, genuinely interested in learning about the world, and honest. The problem with incentives is they can shape cultural norms without any intention on the part of individuals. </p>
<h2>Cultural evolution of scientific practices</h2>
<figure class="align-right zoomable">
<a href="https://images.theconversation.com/files/138454/original/image-20160920-11090-684nc6.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/138454/original/image-20160920-11090-684nc6.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=237&fit=clip" srcset="https://images.theconversation.com/files/138454/original/image-20160920-11090-684nc6.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=784&fit=crop&dpr=1 600w, https://images.theconversation.com/files/138454/original/image-20160920-11090-684nc6.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=784&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/138454/original/image-20160920-11090-684nc6.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=784&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/138454/original/image-20160920-11090-684nc6.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=986&fit=crop&dpr=1 754w, https://images.theconversation.com/files/138454/original/image-20160920-11090-684nc6.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=986&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/138454/original/image-20160920-11090-684nc6.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=986&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Scientists work within a culture of research.</span>
<span class="attribution"><a class="source" href="https://www.flickr.com/photos/iaea_imagebank/8199500456">IAEA</a>, <a class="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA</a></span>
</figcaption>
</figure>
<p>In a <a href="http://rsos.royalsocietypublishing.org/lookup/doi/10.1098/rsos.160384">recent paper</a>, anthropologist <a href="http://xcelab.net/rm/">Richard McElreath</a> and I considered the incentives in science through the lens of <a href="http://www.oxfordbibliographies.com/view/document/obo-9780199766567/obo-9780199766567-0038.xml">cultural evolution</a>, an emerging field that draws on ideas and models from evolutionary biology, epidemiology, psychology and the social sciences to understand cultural organization and change.</p>
<p>In our analysis, we assumed that methods associated with greater success in academic careers will, all else equal, tend to spread. The spread of more successful methods requires no conscious evaluation of how scientists do or do not “game the system.” </p>
<p>Recall that publications, particularly in high-impact journals, are the currency used to evaluate decisions related to hiring, promotions and funding. Studies that show large and surprising associations tend to be favored for publication in top journals, while small, unsurprising or complicated results are more difficult to publish.</p>
<p>But <a href="http://dx.doi.org/10.1371/journal.pmed.0020124">most hypotheses are probably wrong</a>, and performing rigorous tests of novel hypotheses (as well as coming up with good hypotheses in the first place) takes time and effort. Methods that boost false positives (incorrectly identifying a relationship where none exists) and overestimate effect sizes will, on average, allow their users to publish more often. In other words, when novel results are incentivized, methods that produce them – by whatever means – at the fastest pace will become implicitly or explicitly encouraged.</p>
<p>Over time, those shoddy methods will become associated with success, and they will tend to spread. The argument can extend beyond norms of questionable research practices to norms of misunderstanding, if those misunderstandings lead to success. For example, despite over a century of common usage, the <em>p</em>-value, a standard measure of statistical significance, is still <a href="http://dx.doi.org/10.1080/00031305.2016.1154108">widely misunderstood</a>.</p>
<p>The cultural evolution of shoddy science in response to publication incentives requires no conscious strategizing, cheating or loafing on the part of individual researchers. There will always be researchers committed to rigorous methods and scientific integrity. But as long as institutional incentives reward positive, novel results at the expense of rigor, the rate of bad science, on average, will increase. </p>
<h2>Simulating scientists and their incentives</h2>
<p>There is ample evidence suggesting that publication incentives have been negatively shaping scientific research for decades. The frequency of the words <a href="http://dx.doi.org/10.1136/bmj.h6467">“innovative,” “groundbreaking” and “novel”</a> in biomedical abstracts increased by 2,500 percent or more over the past 40 years. Moreover, researchers often <a href="http://dx.doi.org/10.1126/science.1255484">don’t report when hypotheses fail to generate positive results</a>, lest reporting such failures hinder publication.</p>
<figure class="align-left zoomable">
<a href="https://images.theconversation.com/files/138455/original/image-20160920-11127-ntmb9h.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/138455/original/image-20160920-11127-ntmb9h.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=237&fit=clip" srcset="https://images.theconversation.com/files/138455/original/image-20160920-11127-ntmb9h.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=736&fit=crop&dpr=1 600w, https://images.theconversation.com/files/138455/original/image-20160920-11127-ntmb9h.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=736&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/138455/original/image-20160920-11127-ntmb9h.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=736&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/138455/original/image-20160920-11127-ntmb9h.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=925&fit=crop&dpr=1 754w, https://images.theconversation.com/files/138455/original/image-20160920-11127-ntmb9h.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=925&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/138455/original/image-20160920-11127-ntmb9h.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=925&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">There doesn’t need to be anything nefarious going on for scientists to stick with the suboptimal methods that help them get ahead.</span>
<span class="attribution"><a class="source" href="https://www.flickr.com/photos/iaea_imagebank/8198415199">IAEA</a>, <a class="license" href="http://creativecommons.org/licenses/by-sa/4.0/">CC BY-SA</a></span>
</figcaption>
</figure>
<p>We reviewed <a href="http://www.statisticsdonewrong.com/power.html">statistical power</a> in the social and behavioral science literature. Statistical power is a quantitative measurement of a research design’s ability to identify a true association when present. The simplest way to increase statistical power is to increase one’s sample size – which also lengthens the time needed to collect data. Beginning in the 1960s, there have been <a href="http://datacolada.org/wp-content/uploads/2013/10/3416-Sedlmeier-Gigerenzer-Psych-Bull-1989-Do-studies-of-statistical-power-have-an-effect-on-the-power-of-studies.pdf">repeated outcries that statistical power is far too low</a>. Nevertheless, we found that statistical power, on average, <a href="http://rsos.royalsocietypublishing.org/lookup/doi/10.1098/rsos.160384">has not increased</a>.</p>
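<p>To make “statistical power” concrete, here is a minimal simulation sketch in Python (the effect size, significance threshold and sample sizes are illustrative assumptions, not figures from our review): it estimates how often a two-sample t-test detects a true effect of a given size, and shows power rising with sample size.</p>
<pre><code class="language-python">
import numpy as np
from scipy import stats

def estimated_power(n, effect_size, alpha=0.05, n_sims=5000, seed=0):
    """Estimate the power of a two-sample t-test by simulation: the share
    of simulated experiments, each with a true effect of `effect_size`
    standard deviations, whose p-value falls below alpha."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_sims):
        control = rng.normal(0.0, 1.0, n)           # group with no effect
        treated = rng.normal(effect_size, 1.0, n)   # group with a true effect
        _, p = stats.ttest_ind(treated, control)
        if alpha > p:  # the test correctly flagged the true effect
            hits += 1
    return hits / n_sims

# Power rises with sample size: a 0.5-standard-deviation effect is found
# far more reliably with 100 subjects per group than with 20.
for n in (20, 50, 100):
    print(n, round(estimated_power(n, 0.5), 2))
</code></pre>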
<p>The evidence is suggestive, but it is not conclusive. To more systematically demonstrate the logic of our argument, we built a computer model in which a population of research labs studied hypotheses, only some of which were true, and attempted to publish their results.</p>
<p>As part of our analysis, we assumed that each lab exerted a characteristic level of “effort.” Increasing effort lowered the rate of false positives, and also lengthened the time between results. As in reality, we assumed that novel positive results were easier to publish than negative results. All of our simulated labs were totally honest: they never cheated. However, labs that published more were more likely to have their methods “reproduced” in new labs – just as they would be in reality as students and postdocs leave successful labs where they trained and set up their own labs. We then allowed the population to evolve.</p>
<p>The result: Over time, effort decreased to its minimum value, and the rate of false discoveries skyrocketed. </p>
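<p>A toy model in Python conveys the flavor of this dynamic (this is an illustrative sketch written for this explanation, not the published simulation code; the payoff function and all parameter values are assumptions): labs that exert less effort produce more publishable positives, and selection on publication count drives effort toward its floor.</p>
<pre><code class="language-python">
import random

random.seed(1)
N_LABS, GENERATIONS = 100, 200
BASE_RATE = 0.1  # fraction of tested hypotheses that are actually true

# Each lab has an "effort" level in (0, 1]: higher effort means fewer
# false positives but also fewer experiments per generation.
labs = [random.uniform(0.1, 1.0) for _ in range(N_LABS)]

def publications(effort):
    """Expected publishable positive results per generation. Low effort
    means more attempts and a higher false-positive rate per attempt."""
    attempts = 1.0 / effort
    true_pos = 0.8 * effort           # rigorous labs find real effects
    false_pos = 0.5 * (1.0 - effort)  # sloppy labs "find" spurious ones
    return attempts * (BASE_RATE * true_pos + (1 - BASE_RATE) * false_pos)

for _ in range(GENERATIONS):
    # Labs that publish more are more likely to have their methods copied
    # into new labs (with slight mutation), as trainees set up on their own.
    weights = [publications(e) for e in labs]
    parents = random.choices(labs, weights=weights, k=N_LABS)
    labs = [min(1.0, max(0.05, e + random.gauss(0, 0.02))) for e in parents]

print(f"mean effort after selection: {sum(labs) / len(labs):.2f}")  # near the floor
</code></pre>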
<p>And replication – while a crucial tool for generating robust scientific theories – isn’t going to be science’s savior. Our simulations indicate that more replication won’t stem the evolution of bad science.</p>
<h2>Taking on the system</h2>
<p>The bottom-line message from all this is that it’s not sufficient to impose high ethical standards (assuming that were possible), nor to make sure all scientists are informed about best practices (though spreading awareness is certainly one of our goals). A culture of bad science can evolve as a result of institutional incentives that prioritize simple quantitative metrics as measures of success. </p>
<p>There are indications that the situation is improving. Journals, organizations, and universities are increasingly emphasizing <a href="http://www.psychologicalscience.org/index.php/replication">replication</a>, <a href="https://royalsociety.org/journals/ethics-policies/data-sharing-mining/">open data</a>, <a href="http://blogs.plos.org/everyone/2015/02/25/positively-negative-new-plos-one-collection-focusing-negative-null-inconclusive-results/">the publication of negative results</a> and more <a href="https://www.idrc.ca/sites/default/files/sp/Documents%20EN/Research-Quality-Plus-A-Holistic-Approach-to-Evaluating-Research.pdf">holistic evaluations</a>. Internet applications such as <a href="https://twitter.com/lakens/status/774953862012755968">Twitter</a> and <a href="https://www.youtube.com/watch?v=WFv2vS8ESkk&list=PLDcUM9US4XdMdZOhJWJJD4mDBMnbTWw_z">YouTube</a> allow education about best practices to propagate widely, along with spreading norms of holism and integrity. </p>
<p>There are also signs that the old ways are far from dead. For example, one regularly hears researchers discussed in terms of how much or where they publish. The good news is that as long as there are smart, interesting people doing science, there will always be some good science. And from where I sit, there is still quite a bit of it.</p>
<p class="fine-print"><em><span>Paul Smaldino does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>
<p class="fine-print"><em>Paul Smaldino, Assistant Professor of Cognitive and Information Sciences, University of California, Merced. Licensed as Creative Commons – attribution, no derivatives.</em></p>
<h1>Half of biomedical research studies don’t stand up to scrutiny – and what we need to do about that</h1><figure><img src="https://images.theconversation.com/files/89833/original/image-20150727-7646-x6278p.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">How much of the research in these journals could be reproduced?</span> <span class="attribution"><a class="source" href="https://www.flickr.com/photos/yeaki/6961051384">Tobias von der Haar</a>, <a class="license" href="http://creativecommons.org/licenses/by/4.0/">CC BY</a></span></figcaption></figure><p>What if I told you that half of the studies published in scientific journals today – the ones upon which news coverage of medical advances is often based – won’t hold up under scrutiny? You might say I had gone mad. No one would ever tolerate that kind of waste in a field as important – and expensive, to the tune of roughly <a href="http://officeofbudget.od.nih.gov/pdfs/FY15/FY2015_Overview.pdf">US$30 billion in federal spending per year</a> – as biomedical research, right? After all, this is the crucial work that hunts for explanations for diseases so they can better be treated or even cured.</p>
<p>Wrong. The rate of what is referred to as “irreproducible research” – more on what that means in a moment – exceeds 50%, <a href="http://dx.doi.org/10.1371/journal.pbio.1002165">according to a recent paper</a>. Some <a href="http://dx.doi.org/10.1371/journal.pmed.0020124">estimates are even higher</a>. In one analysis, just <a href="http://dx.doi.org/10.1038/483531a">11% of preclinical cancer research studies could be confirmed</a>. That means that an awful lot of “promising” results aren’t very promising at all, and that a lot of researchers who could be solving critical problems based on previously published work end up just spinning their wheels.</p>
<p>So what gives? And how can we fix this problem?</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/89835/original/image-20150727-7662-j5cbjp.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/89835/original/image-20150727-7662-j5cbjp.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/89835/original/image-20150727-7662-j5cbjp.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=377&fit=crop&dpr=1 600w, https://images.theconversation.com/files/89835/original/image-20150727-7662-j5cbjp.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=377&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/89835/original/image-20150727-7662-j5cbjp.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=377&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/89835/original/image-20150727-7662-j5cbjp.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=473&fit=crop&dpr=1 754w, https://images.theconversation.com/files/89835/original/image-20150727-7662-j5cbjp.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=473&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/89835/original/image-20150727-7662-j5cbjp.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=473&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Hmmm, I didn’t expect those results….</span>
<span class="attribution"><a class="source" href="http://www.shutterstock.com/pic-64872892/stock-photo-chemistry-recipient-with-ink-color-inside.html">Test tubes image via www.shutterstock.com</a></span>
</figcaption>
</figure>
<h2>What worms tell us about reproducibility</h2>
<p>Although definitions of reproducibility and replication vary somewhat, for a study to be reproducible, another researcher needs to be able to replicate it, meaning use the same data and analysis to come to the same conclusions. There are lots of reasons why a study may not pass the replication test, from flat-out errors to a failure to adequately describe the methodology used. A researcher may, for example, have forgotten a step in the process when writing up the methodology, counted data in the wrong category, or written the wrong code for their statistics program.</p>
<p><a href="https://theconversation.com/clearing-the-air-why-more-retractions-are-good-for-science-6008">Faking results</a> is another reason, but it’s not nearly as common as others. Out-and-out fraud like that, or suspected fraud, is the reason for a bit <a href="http://dx.doi.org/10.1073/pnas.1212247109">fewer than half of the 400-plus retractions per year</a>. But there are something like two million papers published annually, so the vast majority of studies containing irreproducible data are never retracted. And most scientists would agree that they shouldn’t be; after all, most science is overturned one way or another over time. Retraction should be reserved for the most severe cases. That doesn’t mean irreproducible papers shouldn’t be somehow marked, though.</p>
<figure class="align-right zoomable">
<a href="https://images.theconversation.com/files/89838/original/image-20150727-7646-utfyxu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/89838/original/image-20150727-7646-utfyxu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=237&fit=clip" srcset="https://images.theconversation.com/files/89838/original/image-20150727-7646-utfyxu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=450&fit=crop&dpr=1 600w, https://images.theconversation.com/files/89838/original/image-20150727-7646-utfyxu.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=450&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/89838/original/image-20150727-7646-utfyxu.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=450&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/89838/original/image-20150727-7646-utfyxu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=566&fit=crop&dpr=1 754w, https://images.theconversation.com/files/89838/original/image-20150727-7646-utfyxu.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=566&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/89838/original/image-20150727-7646-utfyxu.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=566&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">A girl takes her deworming tablet.</span>
<span class="attribution"><a class="source" href="https://www.flickr.com/photos/savethechildrenusa/7051746493">Save the Children</a>, <a class="license" href="http://creativecommons.org/licenses/by-nc-nd/4.0/">CC BY-NC-ND</a></span>
</figcaption>
</figure>
<p>Here’s a fresh example of a study that turned out not to be reproducible, because the results couldn’t be replicated: as <a href="http://www.buzzfeed.com/bengoldacre/deworming-trials">Ben Goldacre relates in BuzzFeed</a>, two economists published a <a href="http://dx.doi.org/10.1111/j.1468-0262.2004.00481.x">massive study in 2004</a> claiming that a “deworm everyone” approach in Kenya “improved children’s health, school performance, and school attendance,” even among children several miles away who didn’t get deworming pills. <a href="http://www.who.int/elena/titles/deworming/en/">Endorsed by the World Health Organization</a>, it helped set policy that affects hundreds of millions of children annually in the developing world.</p>
<p>But now researchers have published <a href="http://dx.doi.org/10.1093/ije/dyv127">papers</a> describing two <a href="http://dx.doi.org/10.1093/ije/dyv128">failures</a> to replicate the original findings. Many of them just didn’t hold up, although some did.</p>
<p>That, as Goldacre explains, “is definitely problematic.” But the reanalyses were possible only because the original authors “had the decency, generosity, strength of character, and intellectual confidence to let someone else peer under the bonnet” – a <a href="http://dx.doi.org/10.1001/jama.2014.9646">rare situation indeed</a>.</p>
<h2>The fixes</h2>
<p>Researchers are aware of the reproducibility problem, and some are trying to fix it. In response to alarming findings about the reproducibility of <a href="http://dx.doi.org/10.1038/483531a">basic cancer research</a>, a program called the <a href="http://validation.scienceexchange.com/#/reproducibility-initiative">Reproducibility Initiative</a> has started providing “both a mechanism for scientists to independently replicate their findings and a reward for doing so.” It’s <a href="http://blog.scienceexchange.com/2012/08/the-reproducibility-initiative/">chosen 50 studies for independent validation</a> – or not, since there’s certainly a chance the initial results won’t be reproducible. Those working on the project will perform the same kind of analyses that researchers did in the worm study replications. A similar effort has been <a href="https://osf.io/ezcuj/wiki/home/">ongoing in psychology</a>, and other projects are under way in the <a href="http://www.bitss.org/">social sciences</a>.</p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/89836/original/image-20150727-7668-esb8u0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/89836/original/image-20150727-7668-esb8u0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/89836/original/image-20150727-7668-esb8u0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=335&fit=crop&dpr=1 600w, https://images.theconversation.com/files/89836/original/image-20150727-7668-esb8u0.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=335&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/89836/original/image-20150727-7668-esb8u0.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=335&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/89836/original/image-20150727-7668-esb8u0.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=421&fit=crop&dpr=1 754w, https://images.theconversation.com/files/89836/original/image-20150727-7668-esb8u0.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=421&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/89836/original/image-20150727-7668-esb8u0.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=421&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption">Research data need to be an open book.</span>
<span class="attribution"><a class="source" href="https://www.flickr.com/photos/brenda-starr/5813347420">Brenda Clarke</a>, <a class="license" href="http://creativecommons.org/licenses/by/4.0/">CC BY</a></span>
</figcaption>
</figure>
<p>All of these efforts will require scientists to share data, as the authors of the deworming study did. <a href="http://www.nhlbi.nih.gov/research/funding/human-subjects/data-sharing">Many funders</a> have required data sharing in human studies for some years now, and it’s <a href="http://www.icmje.org/recommendations/browse/publishing-and-editorial-issues/clinical-trial-registration.html">encouraged by many journal editors</a>. And while compliance isn’t yet universal, <a href="http://dx.doi.org/10.1056/NEJMsa1409364">it is growing</a>. Some basic science journals are <a href="http://www.nytimes.com/2015/06/26/science/journal-science-releases-guidelines-for-publishing-scientific-studies.html?_r=0">moving to make it a requirement</a>, too.</p>
<p>Perhaps more important, however, is that researchers – and the public that funds many of them – realize that science is a process, and that all knowledge is provisional. “It’s not just naive to expect that all research will be perfectly free from errors,” writes Goldacre, “it’s actively harmful.” <a href="http://www.vox.com/2015/3/23/8264355/research-study-hype">Journalists, take note</a>.</p>
<p>Translated into policy, that means valuing replication efforts, which right now are essentially unfunded and hardly ever published. If we want scientists to validate others’ work, we’ll need to create grants to do that. That means digging up additional funding, but replicating a study costs a tiny fraction of what the original work does. Funding new studies based on those that turn out to be irreproducible…well, now that’s expensive.</p>
<p class="fine-print"><em><span>Ivan Oransky, global editorial director of MedPage Today, is co-founder of Retraction Watch. Retraction Watch, through its parent organization, The Center For Scientific Integrity, is funded by a generous grant from the John D. and Catherine T. MacArthur Foundation.</span></em></p>It’s a problem when much of what winds up in scientific journals isn’t replicable, for various reasons. The research community is taking baby steps toward addressing the “reproducibility crisis.”Ivan Oransky, Distinguished Writer In Residence, Arthur Carter Journalism Institute, New York UniversityLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/368952015-02-19T19:35:53Z2015-02-19T19:35:53ZExplainer: how and why is research assessed?<figure><img src="https://images.theconversation.com/files/70794/original/image-20150202-13057-1ex8dnu.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">Citations, bibliometrics, "publish or perish": why must we constantly assess research?</span> <span class="attribution"><span class="source">Shutterstock</span></span></figcaption></figure><p>Governments and taxpayers deserve to know that their money is being spent on something <a href="http://www.tandfonline.com/doi/full/10.1080/19338244.2015.982002#abstract">worthwhile to society</a>. Individuals and groups who are making the greatest contribution to science and to the community <a href="http://link.springer.com/article/10.1007/BF02019306">deserve to be recognised</a>. For these reasons, all research has to be assessed.</p>
<p>Judging the importance of research is often done by looking at the number of <a href="http://onlinelibrary.wiley.com/doi/10.1002/hfm.20165/abstract">citations</a> a piece of research receives after it has been published.</p>
<p>Let’s say Researcher A figures out something important (such as how to cure a disease). He or she then publishes this information in a scientific journal, which Researcher B reads. Researcher B then does their own experiments and writes up the results in a scientific journal, which refers to the original work of Researcher A. Researcher B has now <a href="http://link.springer.com/article/10.1007%2Fs11192-012-0685-x">cited</a> Researcher A.</p>
<p>Thousands of experiments are conducted around the world each year, but not all of the results are useful. In fact, much of the scientific research that governments pay for is simply ignored after it’s published. For example, of the 38 million scientific articles published between 1900 and 2005, <a href="http://jama.jamanetwork.com/article.aspx?articleid=202114">half were not cited at all</a>.</p>
<p>To ensure the research they are paying for is of use, governments need a way to decide which researchers and topics they should continue to support. Any system should be fair and, ideally, all researchers should be scored using the same measure. </p>
<p>This is why the field of <a href="http://onlinelibrary.wiley.com/doi/10.15252/embr.201439608/full">bibliometrics</a> has become so important in recent years. Bibliometric analysis helps governments to score and rank researchers, making them easier to compare.</p>
<p>Let’s say the disease that Researcher A studies is pretty common, such as cancer, which means that many people are looking at ways to cure it. In the mix now there would be Researchers C, D and E, all publishing their own work on cancer. Governments take notice if, for example, ten people cite the work of Researcher A and only two cite the work of Researcher C.</p>
<p>If everyone in the world who works in the same field as Researcher A gets their research cited on average (say) twice each time they publish, then the international citation benchmark for that topic (in bibliometrics) would be two. The work of Researcher A, with his or her citation rate of ten (five times higher than the world average), is now going to get noticed.</p>
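<p>To make that arithmetic concrete, here is a minimal sketch in Python. It is illustrative only: the researchers, citation counts and benchmark value are all invented, and real bibliometric benchmarks are normalised by field and publication year in more sophisticated ways.</p>
<pre><code># Purely illustrative: comparing citation rates against a field benchmark.
# All names and numbers are invented for this example.

citations = {
    "Researcher A": [12, 8, 10, 10],  # citations received by each paper
    "Researcher C": [3, 1, 2],
}

world_benchmark = 2.0  # assumed field-wide average citations per paper

for name, counts in citations.items():
    rate = sum(counts) / len(counts)   # average citations per paper
    relative = rate / world_benchmark  # multiple of the world average
    print(f"{name}: {rate:.1f} citations per paper ({relative:.1f}x the benchmark)")

# Researcher A: 10.0 citations per paper (5.0x the benchmark)
# Researcher C: 2.0 citations per paper (1.0x the benchmark)
</code></pre>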
<h2>Excellence in Research for Australia</h2>
<p>Bibliometric analysis and citation benchmarks form a key part of how research is assessed in Australia. The Excellence in Research for Australia (<a href="http://www.arc.gov.au/era/">ERA</a>) process evaluates the quality of research being undertaken at Australian universities against national and international benchmarks. It is administered by the Australian Research Council (<a href="http://www.arc.gov.au/about_arc/default.htm">ARC</a>) and helps the government decide what research is important and what should continue to receive support.</p>
<p>Although these are not the only components assessed in the ERA process, bibliometric data and citation analysis <a href="http://www.arc.gov.au/pdf/ERA15/ERA%202015%20Submission%20Guidelines.pdf">are still a big part</a> of the performance scores that universities and institutions receive.</p>
<p>Many other countries apply formal research assessment systems to universities and have done so for many years. The United Kingdom, for example, operated a process known as the <a href="http://www.rareview.ac.uk/reports/roberts.asp">Research Assessment Exercise</a> between 1986 and 2008. This was superseded by the <a href="http://www.ref.ac.uk/">Research Excellence Framework</a> in 2014.</p>
<p>A bibliometrics-based performance model has also been <a href="http://www.palgrave-journals.com/eps/journal/v8/n3/abs/eps200919a.html">employed in Norway</a> since 2002. This model was first used to influence budget allocations in 2006, based on scientific publications from the previous year.</p>
<p>Although many articles don’t end up getting cited, this doesn’t always mean the research itself didn’t matter. Take, for example, the polio vaccine developed by Albert Sabin last century, <a href="https://www.jstage.jst.go.jp/article/kurumemedj/52/3/52_3_111/_article">which saves over 300,000 lives</a> around the world each year.</p>
<p>Sabin and others <a href="http://jama.jamanetwork.com/article.aspx?articleid=329147">published the main findings</a> in 1960 in what has now become one of the most important scientific articles of all time. By the late 1980s, however, Sabin’s article <a href="http://jama.jamanetwork.com/article.aspx?articleid=363835">had not even been cited 100 times</a>.</p>
<p>On the other hand, we have Oliver Lowry, who in 1951 published an <a href="http://www.jbc.org/content/193/1/265.citation">article describing</a> a new method for measuring the amount of protein in solutions. This has become the most <a href="http://www.jbc.org/content/280/28/e25.short">highly cited article of all time</a> (over 300,000 citations and counting). Even Lowry was surprised by its “success”, <a href="http://www.annualreviews.org/doi/abs/10.1146/annurev.bi.59.070190.000245">pointing out</a> that he wasn’t really a genius and that this study was by no means his best work.</p>
<h2>The history of research assessment</h2>
<p>While some may regard the assessment of research as a modern phenomenon inspired by a new generation of faceless bean-counters, the concept has been around for centuries.</p>
<p><a href="http://en.wikipedia.org/wiki/Francis_Galton">Sir Francis Galton</a>, a celebrated geneticist and statistician, was probably the first well-known person to examine the performance of individual scientists, publishing a landmark book, <a href="http://galton.org/books/men-science/">English Men of Science</a>, in the 1870s.</p>
<p>Galton’s work evidently inspired others, with an American book, <a href="http://books.google.com.au/books/about/American_Men_of_Science.html?id=IZ9LAAAAMAAJ&redir_esc=y">American Men of Science</a>, appearing in the early 1900s.</p>
<p>Productivity rates for scientists and academics (precursors to today’s performance benchmarks and KPIs) have also existed in one form or another for many years. One of the first performance “benchmarks” appeared in a 1940s book, <a href="http://books.google.com.au/books/about/The_Academic_Man.html?id=CA1CGvPJGtwC&redir_esc=y">The Academic Man</a>, which described the output of American academics.</p>
<p>This book is probably most famous for coining the phrase “publish or perish”: the belief that an academic’s career is doomed if they don’t get their research published. It’s a fate that bibliometric analysis and other citation benchmarks now reinforce.</p>
<p class="fine-print"><em><span>Derek R. Smith does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>Governments and taxpayers deserve to know that their money is being spent on something worthwhile to society. Individuals and groups who are making the greatest contribution to science and to the community…Derek R. Smith, Professor, University of NewcastleLicensed as Creative Commons – attribution, no derivatives.tag:theconversation.com,2011:article/327402014-10-12T19:09:04Z2014-10-12T19:09:04ZMeasure for measure: the creative arts and the ‘impact agenda’<figure><img src="https://images.theconversation.com/files/61356/original/cwhwdcnk-1412914375.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=496&fit=clip" /><figcaption><span class="caption">You need to know Shakespeare to judge it, not the other way round.</span> <span class="attribution"><span class="source">orangechallenger</span>, <a class="license" href="http://creativecommons.org/licenses/by-nc-sa/4.0/">CC BY-NC-SA</a></span></figcaption></figure><p>What use are Shakespeare’s plays? Back in the day, when my wife and I were dirt-poor arty types and lived in a hovel that declined the profligacy of doors, a two-volume hard-back edition of his collected works proved a handy anchor for the bedroom curtain. </p>
<p>Later, when I became a graduate student, I found the single edition paperback version a convenient support for my laptop. Also I once landed a job because my boss had an enthusiasm for Shakespeare’s history plays and when asked at my job interview what I was reading I could say “the history plays”. So: a doorstop, a computer rest, a facilitator of gainful employment.</p>
<p>But such applications of Shakespeare are probably not what most consider his obvious use. His dramas are among the most profound and astonishing artworks we possess, and it is these qualities that cause us to value him. If he supplies handy visuals for holiday postcards or quotes for crossword puzzles, all well and good. But the main use of Shakespeare’s plays inheres in our experience of them as plays. </p>
<p>How to measure the use of those plays? How to measure the use of any creative arts? This issue, incidental for most people most of the time, is a matter of intense interest in certain quarters some of the time. </p>
<h2>Measuring the creative arts</h2>
<p>The recent, inaugural conference of the <a href="http://ddca.edu.au/?page_id=4399">Australian Council of Deans and Directors of Creative Arts</a> (DDCA), “the new national organisation, representing learning, teaching and research in the creative arts in Australia, with a membership of more than 22 universities and other higher education institutions”, devoted a day to the issues facing creative arts research in Australia, and persistent themes emerged.</p>
<p>As every academic knows, the 2015 national research assessment exercise (ERA) is now underway. The introduction of new assessment categories in 2010 allowed “non-traditional outputs” to be put forward for the first time.</p>
<p>ERA has not yet replaced the Higher Education Research Data Collection (HERDC), an index which captures only some traditionally-published outputs. But ERA now clearly overshadows it, being more comprehensive and hopefully, supposedly, a better measure of the totality of the research undertaken by tertiary institutions all over the country.</p>
<p>There were some impressive speakers at the conference and the tone was optimistic without being blithe or over-emphatic. </p>
<p>Paul Gough, a new Pro Vice-Chancellor at RMIT University, and a veteran of research assessment in the UK, gave a thoughtful, astute keynote laying out the features of contemporary indices and their likely future development. Tim Cahill, Director of Research Evaluation at the Australian Research Council (ARC), gave a thoughtful, astute response to Professor Gough. </p>
<p>Later Professor Margaret Sheil, a chemist, made a celebrity-like appearance. Creative arts researchers have good reason to be grateful to the author of Reflections on the Development of Orthogonal Acceleration Time-of-Flight Mass Spectrometry, since it was Sheil, CEO of the Australian Research Council when ERA was established, who ushered in the new assessment categories. </p>
<p>Sheil is the artist’s ideal of a scientist: no-nonsense, impatient with “the reduction of everything down to short-term utility”, committed to intrinsic discovery, “quality projects done in a high quality way”. Hard not to cheer her on. I should have brought my football rattle. </p>
<p>There is no doubt the creative arts have done well out of ERA, that they now have a seat at the research table. What is more impressive is that a largely instrumental exercise – one about ranking and funding – has nevertheless promoted a greater degree of cross-disciplinary awareness. </p>
<figure class="align-center zoomable">
<a href="https://images.theconversation.com/files/61353/original/v565bx66-1412913835.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=1000&fit=clip"><img alt="" src="https://images.theconversation.com/files/61353/original/v565bx66-1412913835.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&fit=clip" srcset="https://images.theconversation.com/files/61353/original/v565bx66-1412913835.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=600&h=600&fit=crop&dpr=1 600w, https://images.theconversation.com/files/61353/original/v565bx66-1412913835.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=600&h=600&fit=crop&dpr=2 1200w, https://images.theconversation.com/files/61353/original/v565bx66-1412913835.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=600&h=600&fit=crop&dpr=3 1800w, https://images.theconversation.com/files/61353/original/v565bx66-1412913835.jpg?ixlib=rb-1.1.0&q=45&auto=format&w=754&h=754&fit=crop&dpr=1 754w, https://images.theconversation.com/files/61353/original/v565bx66-1412913835.jpg?ixlib=rb-1.1.0&q=30&auto=format&w=754&h=754&fit=crop&dpr=2 1508w, https://images.theconversation.com/files/61353/original/v565bx66-1412913835.jpg?ixlib=rb-1.1.0&q=15&auto=format&w=754&h=754&fit=crop&dpr=3 2262w" sizes="(min-width: 1466px) 754px, (max-width: 599px) 100vw, (min-width: 600px) 600px, 237px"></a>
<figcaption>
<span class="caption"></span>
<span class="attribution"><span class="source">Jef Safi</span></span>
</figcaption>
</figure>
<p>Not only do the creative arts now “count” – their contribution is better understood throughout universities as a result of being counted. At least potentially. It all depends on how an assessment exercise is undertaken, the context of its application, and the values it serves.</p>
<p>Here it is important to proceed with some caution. </p>
<p>Anyone who heard Cambridge University Professor Stefan Collini <a href="http://www.abc.net.au/radionational/programs/bigideas/what27s-happening-to-universities3f/5778848">on ABC Radio recently</a> should feel justifiable suspicion about the role and cost of research assessment indices, particularly those around the “impact” agenda. </p>
<p>Collini’s analysis of the UK “impact” experience is comprehensive, devastating and deep. The nub of his argument is that the consumer model of higher education has led to a switch from “specifying aims to measuring outcomes” and the recasting of research purposes into quantifiable measures of end-user satisfaction. </p>
<p>When applied to research, “impact” gathers “evidence of incidental byproducts or side effects”, data that is expensive and time-consuming to obtain, a poor proxy for quality and only tangentially related to the substance of a research project. Overall, impact is “a textbook example of the way a misconceived system of accountability can end up determining the character of the activity it is only designed to monitor and measure”. </p>
<h2>Impact for Arts Deans</h2>
<p>How then should the DDCA proceed? And what contribution can creative arts researchers make to the research assessment debate? </p>
<p>These questions are both charged and complicated. But two things can be said up front. </p>
<p>The first is that those engaged in leading research assessments should consider their purpose and not simply their parameters. If the results of such exercises are to be used in puerile ranking games or by governments blindly intent on making across-the-board budget cuts, measures of research output are in bad faith, voiding the democratic principles on which they are founded. </p>
<p>The political analysis must be as sophisticated as the methodological one. Just because something can be counted, doesn’t mean it should be counted. </p>
<p>Following Collini, it is important to ensure that a research index honours the values it claims as its base motivation. Individual researchers have a duty to enforce a degree of bottom-up scrutiny. So far as the impact agenda in Australia is concerned, the next step is clear: we need more information about the UK experience before bringing its measures here.</p>
<p>The second thing to be said relates specifically to the creative arts. For a declining number of “real” academics creative arts research remains an oxymoron. ERA’s great boon has been to reveal this for the prejudice it is. Creative arts activity can be “real” research just as some traditional publications are so poor as to be unworthy of the name. </p>
<p>Now – this is something I hear quite a bit – the creative arts no longer have to engage in “special pleading”. Now they are “just like everything else”.</p>
<p>But they aren’t. </p>
<p>Quite obviously the “uses” of the creative arts are particular, if “use” is the right word, which it probably isn’t. Collini talks about the fetishisation of quantification that feeds public debate in our contemporary, over-connected, hyper-competitive, market-mad society. What gets ditched in the scramble for attention and funds is anything long-form or unfashionable.</p>
<p>Ethical arguments, say, or reasons that can’t be put into a sound-bite or twitter-feed. </p>
<p>To assess a Shakespeare play you need not a snap measure of use-value, but knowledge of blank verse, characterisation and Elizabethan staging. You need to know it to judge it, not the other way round. </p>
<p>Which doesn’t mean you can’t measure some indicators of, say, a production of Hamlet (“change in social behaviour in relation to the reduction of regicide”?). It does mean that such measures need to be placed in a context that broadly makes sense.</p>
<p>Here the creative arts can do a favour to other disciplines as well as their own. </p>
<p>Because they so clearly need specific consideration (quite different from “special pleading”) they force governing authorities to come clean about their over-arching aims.</p>
<p>It is not only universities that hide behind methodology. Data is the new bunkum. Not only for the sake of qualitative issues, but for the sake of continuing faith in quantitative analysis itself, it is vital that the research assessment debate continue its dialogue with areas like the creative arts, which have their uses but will always elude precise numerical representation. </p>
<p>“Though this be madness, yet there is method in ‘t,” noted the wily Polonius. Sometimes, though, it can be the other way round.</p>
<p class="fine-print"><em><span>Julian Meyrick does not work for, consult, own shares in or receive funding from any company or organisation that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.</span></em></p>What use are Shakespeare’s plays? Back in the day, when my wife and I were dirt-poor arty types and lived in a hovel that declined the profligacy of doors, a two-volume hard-back edition of his collected…Julian Meyrick, Professor of Creative Arts, Flinders UniversityLicensed as Creative Commons – attribution, no derivatives.