The significance of digits: just how reliable are reported numbers?
Jonathan Borwein (Jon) and David H. Bailey
When numbers of any sort are presented in mathematics, science, business, government or finance, it’s fair to say a reader assumes that the data are reasonably reliable to their last digit.
But presenting data to more digits of accuracy than is appropriate from the context can be deeply misleading, conveying much more reliability than is really present in the data.
If a light bulb is listed as using 40 watts, then its actual usage is certainly not 20 or 60 watts, but presumably between 39 and 41 watts. Or if the average interest rate paid on a set of securities is listed as 2.718%, then a reasonable reader presumes that the actual figure is between 2.717 and 2.719%.
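The reading convention in these two examples (the true value lies within one unit of the last written digit) can be sketched in a few lines of Python. This is only an illustration: the helper name `implied_range` is hypothetical, and it assumes every digit as written is meant to be significant, which a bare "40" does not strictly guarantee.

```python
from decimal import Decimal
from typing import Tuple

def implied_range(reported: str) -> Tuple[Decimal, Decimal]:
    """Return (low, high): the reported value plus or minus one unit
    in its last written digit (the convention used in the text)."""
    value = Decimal(reported)
    # The exponent of the parsed Decimal marks the last written digit:
    # "40" -> exponent 0 -> unit 1; "2.718" -> exponent -3 -> unit 0.001.
    unit = Decimal(1).scaleb(value.as_tuple().exponent)
    return value - unit, value + unit

print(implied_range("40"))     # (Decimal('39'), Decimal('41'))
print(implied_range("2.718"))  # (Decimal('2.717'), Decimal('2.719'))
```

The number is passed as a string so that the digits the author actually wrote are preserved; a float would lose that information.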
The total number of significant digits can vary widely, depending on context. Some studies require enormous precision: we have published research requiring numbers to be computed to tens of thousands of digits. In other contexts, only one- or two-digit accuracy is appropriate.
Oil prices in 2040
By 2025, the nominal price will have hit [US]$123.90, rising steadily to [US]$177.40 by 2040. In real or inflation-adjusted terms the price will fall to [US]$95.40 by 2020 and hit [US]$101.60 by 2040, OPEC predicts.
Such impressive precision in prices projected to 2020, 2025, and 2040 is simply not defensible.
It’s hard enough to anticipate the price of oil even a few months ahead — few (if any) analysts foresaw the huge drop in oil prices in October 2014.
Any such predictions of future commodity prices (or stock prices, for that matter) depend on a large number of factors, from the costs of exploration, refining and shipping to hard-to-quantify effects such as natural disasters, international political events and economic reversals.
What’s more, the technology of energy generation is changing rapidly and could drastically affect future oil prices. Already, fracking technology has dramatically increased US oil and natural gas output. This has been a major factor in the recent paradoxical drop in oil prices, in spite of horrific political developments in the Middle East and Russian imperial behaviour in Ukraine.
In any event, we question the wisdom of repeating these figures - to the same precision - in press reports. Journalists really should have written something along the lines of:
By 2025, the nominal price may exceed [US]$120, rising steadily to over [US]$170 by 2040. In real or inflation-adjusted terms the price is projected to fall to about [US]$90 by 2020 and be over [US]$100 by 2040, according to OPEC’s current estimates.
Vagaries of data
Another reminder of the vagaries of data in the international arena was the November 7 monthly release of employment data by the US Bureau of Labor Statistics.
Lost in the good news of the addition of 214,000 non-farm jobs to the US economy in October, and the drop in the unemployment rate to just 5.8% (in stark contrast to much of Europe, by the way), was a fact buried at the bottom of the report: the non-farm employment figure for August had been revised from 180,000 to 203,000, and the figure for September from 248,000 to 256,000.
In other words, an additional 31,000 people had found work, a fact not mentioned in most press reports we read.
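The arithmetic behind the 31,000 figure can be checked with a short sketch. The numbers are the ones quoted above; the variable names are ours, and the comparison with the October headline figure is our own illustration rather than something taken from the report.

```python
# (initially reported, revised) non-farm job gains, as quoted in the text
revisions = {
    "August": (180_000, 203_000),
    "September": (248_000, 256_000),
}

# Upward revision for each month, then the combined revision
changes = {month: revised - initial
           for month, (initial, revised) in revisions.items()}
total_change = sum(changes.values())

print(changes)       # {'August': 23000, 'September': 8000}
print(total_change)  # 31000

# For scale: the combined revision relative to October's headline gain
october_headline = 214_000
print(f"{total_change / october_headline:.0%}")  # 14%
```

A revision of that size suggests the headline figure is better read as "a little over 200,000" than as a count accurate to the nearest thousand.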
While such adjustments are routine for US employment reports, they underscore the futility of reading too much precision into the monthly figures, which will invariably be further refined. So shouldn’t this fact be more clearly communicated by the press?
Other dubious digits in the news
On August 14, 2012, the US Census Bureau soberly reported that at 2:29pm EDT, the US population had reached 314,159,265 residents.
While we were pleased to see the number pi (= 3.1415926535 … ) once again in the news, it is completely unrealistic to think that the American population can truly be pinned down even to within one million persons, much less to a single soul.
Census figures are notoriously disputable, due to factors ranging from the influx or outflow of undocumented workers, to the reluctance of some ethnic or high-risk groups to respond to any census data collection.
But this is hardly a disease limited to North America.
The European Commission’s financial framework for 2014-2020 lists its total budget as €959,988 million, the sum of similarly precise figures for each of the seven years. These figures are compared with similar figures for the period 2007-2013, also given to five- and six-digit precision.
Is this sort of data ever reliable to this level of accuracy? And even if it is (which we doubt), what is the point of presenting such data to six-digit accuracy in a public overview statement?
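As a rough sketch of what a less precise presentation might look like, the following Python helper rounds a figure to a chosen number of significant digits. The function name `round_sig` and the choice of three digits are our assumptions for illustration, not a recommendation from the Commission or a rule for how many digits such a budget can support.

```python
import math

def round_sig(x: float, digits: int) -> float:
    """Round x to the given number of significant digits."""
    if x == 0:
        return 0.0
    # Position of the leading digit, e.g. 959988 -> 5, 0.0042 -> -3
    magnitude = math.floor(math.log10(abs(x)))
    return round(float(x), digits - 1 - magnitude)

budget_millions = 959_988  # EUR million, as quoted in the text
print(f"{round_sig(budget_millions, 3):,.0f}")  # 960,000
```

On that basis an overview statement could simply say "about €960 billion" and lose nothing a general reader could use.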
While we mathematicians are amused by examples such as those listed above, presenting data to appropriate levels of precision is serious business. This is particularly true in public press communications, business, finance and science policy where readers may not fully understand the context and might reasonably be misled.
In particular, falsely precise predictions and projections undermine the whole rationale of scientific estimation. It is essential that authors and producers of such data present them only to levels of accuracy that can be rigorously justified. Otherwise, one is engaging in pseudoscience at the very least.
This article originally appeared on Math Drudge.
Jonathan Borwein (Jon) receives funding from the ARC.
David H. Bailey does not receive any funding from any organization that would benefit from this article.