All academic metrics are flawed, but some are useful

A good scoring system can help the best rise to the top. Michael Coghlan/Flickr, CC BY-SA

Why are bean counters so fixated on counting? Why are universities overrun by metrics? Are we heading for a world where we know the cost of everything and the value of nothing?

Is an obsession with metrics really corrupting science?

To me metrics are like models. To paraphrase George Box: all metrics are flawed but some are useful. What they are useful for is providing opportunities, helping install meritocracies and breaking through entrenched social elites.

A number of opportunities

My views are in part informed by my own family history. My parents were both the first in their families to go to grammar school and then to university. They went via Cyril Burt’s much-maligned “11-plus” examination. The flaws of this simple “intelligence test” are manifold. There is even evidence that the research on which his approach was based was fraudulent.

Nevertheless this process was an attempt to provide opportunities to those who would benefit. Despite all the problems the test was well-intentioned and did help many people. It was one of the first steps in breaking down class structures and providing social mobility. The exams were not perfect, but they were better than the class structures that had remained in place for hundreds of years.

Ultimately, we need to use and understand metrics better. We need to discuss them and improve them. But most of all, we need to interpret them with sophistication and remember they are a good beginning but a poor end of the conversation.

Better than the alternative

Metrics don’t just affect students and student selection at schools and universities. Staff are also now subject to scrutiny.

There are many measures of productivity and quality in use. Student feedback is used to measure the quality of teaching and citations are counted to assess how well research papers are received.

Neither measure is perfect and both can certainly drive perverse outcomes if used poorly. But, again, both can be useful.

If students think a lecturer is good, it is worth knowing so that others can learn from that. If a lecturer is not appreciated, shouldn’t university management know? If papers are highly cited, is that not one indication that the research is having an impact in the world?

Metrics are a good starting point for subsequent discussions informed by expert opinion and experience on how we can improve what we do. Metrics also provide independent evidence to taxpayers and other supporters that their funding is making a difference.

Some people lament the fact that metrics may influence decisions about academic hiring and promotion, but surely this is better than the decisions being confined to darkened rooms where the status quo prevails.

Although all numbers have limitations, they are often superior to political intrigues and the “who you know” and “who you trust” methodology, which tends to take hold when data is absent.

The scoreboard

Some idealists seem to have a very benign view of society and believe that fairness and quality can prosper on their own without the help of systems and numbers. But others consider that work is required to drive reforms. Numbers can do the work in a non-confrontational and impartial, although imperfect, way.

One area where meritocracy already holds sway is on the sporting field. Here the scoreboards are paramount and there are some interesting effects. It is no coincidence that it is often in sport that our Indigenous Australians shine most brightly. In other fields they may face prejudice, but in sport their achievements and talents cannot be denied.

While racial slurs may still occur, no one can deny the match statistics and no coach is likely to drop a productive player from the team. In other walks of life it is much harder to agree on talent. So prejudgements, often based on the shared background and experiences of the selection panel, may come into play.

Ultimately, metrics make a good starting point for social harmony at universities or elsewhere, provided they are used by academic managers who have been through the system and understand the pitfalls. It is also important that managers actually communicate the fact that they don’t use metrics blindly, bluntly or as final arbiters of decision making.

We need to try harder to communicate the fact that metrics are not used in isolation and remember that excluding people via what appear to be dumb numbers can be extremely hurtful.

During my own highly privileged student days I played squash and rowed. In squash there were clear metrics. Any disagreement about selection could be settled quickly via a challenge.

In rowing it was the opposite. No one ever really knew who the best oarsman was. We talked at length behind the coach’s back, and those who didn’t make the team sometimes suspected the worst. The rowing machine – “The Erg” – that measured strength helped settle a lot of questions.

Rowing machine metrics were never the final word in selecting talent for inclusion in the small crew, but they were usually a helpful place to start.