Modern biological research relies on big data analytics. With vast reservoirs of memory and powerful processors, machines can find patterns, run meta-analyses and even make predictions for scientists.
The first digits of numbers in real-world data sets aren’t distributed evenly. Now you know more than a lot of fraudsters do when they’re making up their phony numbers.
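This uneven distribution is known as Benford’s law: the leading digit d appears with probability log10(1 + 1/d), so 1 leads roughly 30% of the time while 9 leads under 5%. A minimal Python sketch, using powers of 2 (a classic sequence known to follow the law) as a stand-in for real-world data:

```python
import math
from collections import Counter

def benford_expected(d):
    # Benford's law: probability that the leading digit is d
    return math.log10(1 + 1 / d)

# Leading digits of 2^1 .. 2^1000, a sequence whose first digits
# are known to follow Benford's law.
leading = [int(str(2 ** n)[0]) for n in range(1, 1001)]
counts = Counter(leading)

for d in range(1, 10):
    observed = counts[d] / len(leading)
    print(f"digit {d}: expected {benford_expected(d):.3f}, observed {observed:.3f}")
```

Auditors compare a data set’s observed leading-digit frequencies against these expected values; fabricated numbers, which people tend to spread digits over uniformly, stand out.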
The end-of-year shopping whirlwind is underway. How does your credit card issuer watch out for fraudulent purchases on your account amid all those transactions?
Big data is often associated simply with processing enormous volumes of information. But our ability to generate data is outpacing our ability to store it.
Preventing crime before it happens, while saving resources, sounds like a great use of big data. But these calculated probabilities raise big questions about civil liberties.
The Investigatory Powers Bill would require ISPs to store 12 months of our web browsing history – a year-long snapshot of our thoughts, fears, interests and behaviour.
Sophisticated models and supercomputers allow researchers to create a high-fidelity map of the Earth’s trees – and show that we’re losing billions of trees a year.
Math isn’t prejudiced, goes the argument. But algorithms can learn bias from the data human beings feed into them, leading to unfair treatment and discrimination.
Analyzing big data sets holds the promise of big insights. But the axiom “garbage in, garbage out” is particularly apt, since conclusions can be only as good as the raw data itself.
Sometimes the best way to deal with mountains of data is to turn to the public for help. That’s what Snapshot Serengeti did to classify millions of photos from savanna camera traps in Tanzania.
Collect all the data you want, but if you can’t figure out what you’re looking at, it’s useless. Topologists study the shape of data – the spatial relationships within it – to figure out what it can tell us.