Our obsession with metrics is corrupting science

Not everything that can be counted counts, and not everything that counts can be counted.

– William Bruce Cameron

Australian universities have been in the media in recent weeks for the dubious treatment of overseas students and the problem of plagiarism. But they are in serious trouble for another reason: their reliance on “bibliometrics” for major decision making.

Two international companies, Thomson Reuters and Elsevier, rate the apparent prestige of the journals in which academics’ publications appear, and the frequency with which other authors refer to them, i.e. their citations. Two of the key summary results are the Hirsch index (or h-index), which reflects citations, and the journal impact factor (JIF), which is claimed to reflect the importance of journals.
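For readers unfamiliar with how these two numbers are calculated, the short sketch below uses their standard textbook definitions. It is an illustration only, not the providers' actual pipelines: the function names and the citation counts are made up for the example.

```python
def h_index(citations):
    """Return the largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def impact_factor(citations_to_prev_two_years, citable_items_prev_two_years):
    """Two-year impact factor: citations received this year to items published in
    the previous two years, divided by the number of citable items published in
    those two years."""
    if citable_items_prev_two_years <= 0:
        raise ValueError("need at least one citable item")
    return citations_to_prev_two_years / citable_items_prev_two_years


# Hypothetical citation counts, invented for illustration.
researcher_a = [50, 40, 3, 2, 1]   # two heavily cited papers, 96 citations in total
researcher_b = [5, 5, 5, 5, 5]     # five modestly cited papers, 25 citations in total

print(h_index(researcher_a))       # 3
print(h_index(researcher_b))       # 5
print(impact_factor(300, 120))     # 2.5
```

In this made-up case the researcher with nearly four times as many citations ends up with the lower h-index, which shows how much a single summary number depends on the way it is constructed.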

Ratings such as these dominate decisions on academic promotions, tenure, grant funding and the status of departments and universities. They have been universally adopted by universities in Australia because of perceived benefits of speed, cost-effectiveness and alleged objectivity. They underpin the government’s Excellence in Research for Australia (ERA).

This is of immediate national interest because of the links between these metrics, academic rankings and government funding of science and the universities, and because of the potential harm to careers and to the very way research is carried out.

The ratings have many critics: in the book Whackademia by Richard Hil, Gareth Evans, Chancellor of ANU, was quoted as saying:

Trying to get everyone to produce research to some sort of “world standard” – whatever that means – is destined to be an absolutely ludicrous, lamentable failure.

Bahram Bekhradnia, President of the Higher Education Policy Institute, UK, went further in a recent interview in The Economist:

They’re positively dangerous. I’ve heard [university] presidents say this all over the world: I’ll do anything to increase my ranking, and nothing to harm it.

The situation is serious, so we explore some of the consequences.

Effects on careers

Evidence of the destructiveness to careers has surfaced repeatedly. One example is the case of Professor Stefan Grimm of Imperial College. Despite a strong publication record, Grimm was hounded for failing to meet funding targets.

Bibliometrics also fail to predict Nobel prizewinners.

In Australia, institutions demand publications in select journals to improve ERA success, with negative personal consequences. In the US, the Editor-in-Chief of the journal Science concluded bibliometrics were totally inadequate for assessing the potential of young scientists.

Predictably, gaming of the system for career advancement is rapidly increasing. One example is the sale of co-authorships on papers accepted into ranked journals for up to US$14,800.

Institutions game the system too, by avoiding high-risk or interdisciplinary research and by “churning” staff, re-classifying some as “non-researchers” to increase scores. Is the best gaming, not the best research, leading to the best scores?

Effects on research

Empirical research on JIFs reveals their flaws. There have also been local casualties, such as the Australian journal People and Place, which was discontinued in 2010 because of pressure to restructure the journal to improve its status in the now defunct ARC journal rankings.

Disciplines suffer too. To score well in rankings, popular fields or hot topics are selected to increase citations. Other important areas such as taxonomy are undervalued and fail to attract new talent. When entire countries or continents are poorly tracked by the system, research and researchers suffer.

The use of bibliometrics in the ERA provides ammunition to those who think arts, humanities and social sciences are less worthwhile.

The journal Nature trumpets its impact factor in its own advertising. Duncan Hull/Flickr, CC BY

Effects on institutions

Universities compete to buy, not grow, talent by headhunting researchers who score well. Wealthy universities grow in prestige, but national research productivity is little changed.

In assessing the improvements by Australian universities between ERA 2010 and ERA 2012, Frank Larkins, Deputy Vice Chancellor of the University of Melbourne, observed:

It is surprising that quality standards have improved so much in such a short time period as a result of the limited changes in the data assessed.

Universities quickly learned how to play the game and adjusted their submissions to improve their scores.

Where to now?

Comparatively little research has provided evidence that bibliometrics achieve anything of real significance, or examined whether there is a fatal circularity in the whole process. Does the h-index simply measure whatever it measures, without any demonstrable positive relationship to research of value to the discipline or society? Does it indicate anything other than a high citation rate, whatever that may mean?

Ironically, science and hypothesis testing are needed. For example: do researchers with high h-indices contribute anything of more significance than those with lower ones?

Internationally, opposition has taken the form of the San Francisco Declaration on Research Assessment (DORA). Institutions are urged to acknowledge that the scientific content of a paper is more important than publication metrics or the identity of the journal in which it was published.

Content rather than metrics is what ought to count.
