Random numbers – The Conversation

Momentum isn’t magic – vindicating the hot hand with the mathematics of streaks

2017-03-27T02:38:25Z

When a player's on fire, is it hot hands? Basketball image via www.shutterstock.com.

It’s NCAA basketball tournament season, known for its magical moments and the “March Madness” it can produce. Many fans remember Stephen Curry’s superhuman 2008 performance where he led underdog Davidson College to victory while nearly outscoring the entire determined Gonzaga team by himself in the second half. Was Curry’s magic merely a product of his skill, the match-ups and random luck, or was there something special within him that day?

Nearly every basketball player, coach or fan believes that some shooters have an uncanny tendency to experience the hot hand – also referred to as being “on fire,” “in the zone,” “in rhythm” or “unconscious.” The idea is that on occasion these players enter into a special state in which their ability to make shots is noticeably better than usual. When people see a streak, like Craig Hodges hitting 19 3-pointers in a row, or other exceptional performances, they typically attribute it to the hot hand.

The hot hand makes intuitive sense. For instance, you can probably recall a situation, in sports or otherwise, in which you felt like you had momentum on your side – your body was in sync, your mind was focused and you were in a confident mood. In these moments of flow, success feels inevitable and effortless.

However, if you go to the NCAA’s website, you’ll read that this intuition is incorrect – the hot hand does not exist. Belief in the hot hand is just a delusion that occurs because we as humans have a predisposition to see patterns in randomness; we see streakiness even though shooting data are essentially random. Indeed, this view has been held for the past 30 years among scientists who study judgment and decision-making. Even Nobel Prize winner Daniel Kahneman affirmed this consensus: “The hot hand is a massive and widespread cognitive illusion.”

Nevertheless, recent work has uncovered critical flaws in the research which underlies this consensus. In fact, these flaws are sufficient to not only invalidate the most compelling evidence against the hot hand, but even to vindicate the belief in streakiness.

Sometimes it feels like a player just can’t miss. Is there a truth to this feeling, or is it a delusion? AP Photo/Michael Conroy

Research made it the ‘hot hand fallacy’

In the landmark 1985 paper “The hot hand in basketball: On the misperception of random sequences,” psychologists Thomas Gilovich, Robert Vallone and Amos Tversky (GVT, for short) found that when studying basketball shooting data, the sequences of makes and misses are indistinguishable from the sequences of heads and tails one would expect to see from flipping a coin repeatedly.

Just as a gambler will get an occasional streak when flipping a coin, a basketball player will produce an occasional streak when shooting the ball. GVT concluded that the hot hand is a “cognitive illusion”; people’s tendency to detect patterns in randomness, to see perfectly typical streaks as atypical, led them to believe in an illusory hot hand.

GVT’s conclusion that the hot hand doesn’t exist was initially dismissed out of hand by practitioners; legendary Boston Celtics coach Red Auerbach famously said: “Who is this guy? So he makes a study. I couldn’t care less.” The academic response was no less critical, but Tversky and Gilovich successfully defended their work, while uncovering critical flaws in the studies that challenged it. While there remained some isolated skepticism, GVT’s result was accepted as the scientific consensus, and the “hot hand fallacy” was born.

Importantly, GVT found that professional practitioners (players and coaches) not only were victims of the fallacy, but that their belief in the hot hand was stubbornly fixed. The power of GVT’s result had a profound influence on how psychologists and economists think about decision-making in domains where information arrives over time. As GVT’s result was extrapolated into areas outside of basketball, the hot hand fallacy became a cultural meme. From financial investing to video gaming, the notion that momentum could exist in human performance came to be viewed as incorrect by default.

The pedantic “No, actually” commentators were given a license to throw cold water on the hot hand believers.

Taking another look at the probabilities

In what turns out to be an ironic twist, we’ve recently found this consensus view rests on a subtle – but crucial – misconception regarding the behavior of random sequences. In GVT’s critical test of hot hand shooting conducted on the Cornell University basketball team, they examined whether players shot better when on a streak of hits than when on a streak of misses. In this intuitive test, players’ field goal percentages were not markedly greater after streaks of makes than after streaks of misses.

GVT made the implicit assumption that the pattern they observed from the Cornell shooters is what you would expect to see if each player’s sequence of 100 shot outcomes were determined by coin flips. That is, the percentage of heads should be similar for the flips that follow streaks of heads and the flips that follow streaks of tails.

Our surprising finding is that this appealing intuition is incorrect. For example, imagine flipping a coin 100 times and then collecting all the flips in which the preceding three flips are heads. While one would intuitively expect that the percentage of heads on these flips would be 50 percent, instead, it’s less.

Here’s why.

Suppose a researcher looks at the data from a sequence of 100 coin flips, collects all the flips for which the previous three flips are heads and inspects one of these flips. To visualize this, imagine the researcher taking these collected flips, putting them in a bucket and choosing one at random. The chance the chosen flip is a heads – equal to the percentage of heads in the bucket – we claim is less than 50 percent.

The percentage of heads on the flips that follow a streak of three heads can be viewed as the chance of choosing heads from a bucket consisting of all the flips that follow a streak of three heads. Miller and Sanjurjo, CC BY-ND

To see this, let’s say the researcher happens to choose flip 42 from the bucket. Now it’s true that if the researcher were to inspect flip 42 before examining the sequence, then the chance of it being heads would be exactly 50/50, as we intuitively expect. But the researcher looked at the sequence first, and collected flip 42 because it was one of the flips for which the previous three flips were heads. Why does this make it more likely that flip 42 would be tails rather than heads?

Why tails is more likely when choosing a flip from the bucket. Miller and Sanjurjo, CC BY-ND

If flip 42 were heads, then flips 39, 40, 41 and 42 would be HHHH. This would mean that flip 43 would also follow three heads, and the researcher could have chosen flip 43 rather than flip 42 (but didn’t). If flip 42 were tails, then flips 39 through 42 would be HHHT, and the researcher would be restricted from choosing flip 43 (or 44, or 45). This implies that in the world in which flip 42 is tails (HHHT) flip 42 is more likely to be chosen as there are (on average) fewer eligible flips in the sequence from which to choose than in the world in which flip 42 is heads (HHHH).

This reasoning holds for any flip the researcher might choose from the bucket (unless it happens to be the final flip of the sequence). The world HHHT, in which the researcher has fewer eligible flips besides the chosen flip, restricts his choice more than world HHHH, and makes him more likely to choose the flip that he chose. This makes world HHHT more likely, and consequently makes tails more likely than heads on the chosen flip.

In other words, selecting which part of the data to analyze based on where the streaks are located within the data restricts your choice and changes the odds.
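The bucket argument can be checked by brute force on short sequences. The sketch below is an illustration, not the proof from the working paper: it lists every equally likely sequence, builds the bucket of flips that follow a streak of heads, and averages the share of heads in the bucket. The function name `expected_proportion` and its parameters are made up for this example, and sequences with an empty bucket are left out because they give the researcher nothing to inspect.

```python
from fractions import Fraction
from itertools import product


def expected_proportion(n_flips, streak):
    """Average share of heads among flips that follow `streak` heads in a row.

    The average runs over all equally likely sequences of length n_flips
    whose bucket is not empty.
    """
    total = Fraction(0)
    counted = 0
    for seq in product("HT", repeat=n_flips):
        # The bucket: every flip whose preceding `streak` flips are all heads.
        bucket = [
            seq[i]
            for i in range(streak, n_flips)
            if all(f == "H" for f in seq[i - streak:i])
        ]
        if bucket:
            total += Fraction(bucket.count("H"), len(bucket))
            counted += 1
    return total / counted


if __name__ == "__main__":
    for n_flips, streak in [(3, 1), (4, 1), (10, 3)]:
        value = expected_proportion(n_flips, streak)
        print(n_flips, streak, value, float(value))
```

For three flips and a streak of one head, the enumeration gives 5/12 rather than 1/2. Full enumeration is only practical for short sequences; 100 flips would need a formula or a simulation instead.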

The complete proof can be found in our working paper that’s available online. Our reasoning here applies what’s known as the principle of restricted choice, which comes up in the card game bridge, and is the intuition behind the formal mathematical procedure for updating beliefs based on new information, Bayesian inference. In another one of our working papers, which links our result to various probability puzzles and statistical biases, we found that the simplest version of our problem is nearly equivalent to the famous Monty Hall problem, which stumped the eminent mathematician Paul Erdős and many other smart people.

We observed a similar phenomenon; smart people were convinced that the bias we found couldn’t be true, which led to interesting email exchanges and spirited posts to internet forums (TwoPlusTwo, Reddit, StackExchange) and the comment sections of academic blogs (Gelman, Lipton&Regan, Kahan, Landsburg, Novella, Rey Biel), newspapers (Wall Street Journal, The New York Times) and online magazines (Slate and NYMag).

The hot hand rises again

With this counterintuitive new finding in mind, let’s now go back to the GVT data. GVT divided shots into those that followed streaks of three (or more) makes, and streaks of three (or more) misses, and compared field goal percentages across these categories. They found that the field goal percentage for shots following a streak of makes was only negligibly higher (three percentage points). Because of the surprising bias we discovered, that difference was, if you do the calculation, actually 11 percentage points higher than one would expect from a coin flip!
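The size of the correction can be approximated with a simulation. The figures above (three points observed, 11 points after correction) imply that a pure coin-flip shooter would be expected to show a difference of roughly minus eight points. The sketch below assumes 100 shots per shooter, a 50 percent hit rate and streaks of exactly three or more; the function names, the seed and the number of trials are choices made for this illustration and are not taken from the GVT study.

```python
import random


def streak_difference(shots, streak=3):
    """Hit rate after `streak` hits in a row minus hit rate after
    `streak` misses in a row. Returns None if either bucket is empty."""
    after_hits, after_misses = [], []
    for i in range(streak, len(shots)):
        window = shots[i - streak:i]
        if all(window):
            after_hits.append(shots[i])
        elif not any(window):
            after_misses.append(shots[i])
    if not after_hits or not after_misses:
        return None
    return sum(after_hits) / len(after_hits) - sum(after_misses) / len(after_misses)


def average_difference(trials, n_shots=100, streak=3, seed=0):
    """Average the per-shooter difference over many simulated coin-flip shooters."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(trials):
        shots = [rng.random() < 0.5 for _ in range(n_shots)]
        d = streak_difference(shots, streak)
        if d is not None:
            diffs.append(d)
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    print(average_difference(trials=5000))
```

A clearly negative average for shooters who have no hot hand at all is the bias described above: an observed difference near zero is already evidence of streakiness.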

Not just an illusion, those hands can be hot. Athlete image via www.shutterstock.com.

An 11 percentage point relative boost in shooting when on a hit-streak is not negligible. In fact, it is roughly equal to the difference in field goal percentage between the average and the very best 3-point shooter in the NBA. Thus, in contrast with what was originally found, GVT’s data reveal a substantial, and statistically significant, hot hand effect.

Importantly, this evidence in support of hot hand shooting is not unique. Indeed, in recent research we’ve found that this effect replicates in the NBA’s Three Point contest, as well as in other controlled studies. Evidence from other researchers using free throw and game data corroborates this. Further, there’s a good chance the hot hand is more substantial than we estimate due to another subtle statistical issue called “measurement error,” which we discuss in the appendix of our paper.

Thus, surprisingly, these recent discoveries show that the practitioners were actually right all along. It’s OK to believe in the hot hand. While perhaps you shouldn’t get too carried away, you can believe in the magic and mystery of momentum in basketball and life in general, while still maintaining your intellectual respectability.

Joshua Miller does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond the academic appointment above.

Adam Sanjurjo does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.

How random is your randomness, and why does it matter?

2016-09-19T01:23:03Z

What if the person flipping the coin cheats? Coin and hand via shutterstock.com

Randomness is powerful. Think about a presidential poll: A random sample of just 400 people in the United States can accurately estimate Clinton’s and Trump’s support to within 5 percentage points (with 95 percent certainty), despite the U.S. population exceeding 300 million. That’s just one of many uses.
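The poll figure follows from the standard margin-of-error formula for a simple random sample. The sketch below uses the usual normal approximation with z = 1.96 for 95 percent confidence and the worst case of an evenly split electorate; the function name and defaults are illustrative assumptions. Note that the population size does not appear in the formula at all.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Half-width of the approximate 95% confidence interval for a proportion.

    proportion=0.5 is the worst case (largest margin); z=1.96 is the
    normal quantile for 95% confidence.
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


if __name__ == "__main__":
    print(round(margin_of_error(400), 3))  # 0.049, i.e. just under 5 points
```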

Randomness is vital for computer security, making possible secure encryption that allows people to communicate secretly even if an adversary sees all coded messages. Surprisingly, it even allows security to be maintained if the adversary also knows the key used to encode the messages.

Often random numbers can be used to speed up algorithms. For example, the fastest way we know to test whether a particular number is prime involves choosing random numbers. That can be helpful in math, computer science and cryptography, among other disciplines.
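One widely used randomized primality test is the Miller-Rabin test. The article does not name a specific algorithm, so the sketch below is offered only as a familiar example of the idea: pick random bases, and declare the number composite as soon as one base exposes it. The function name, the number of rounds and the optional `rng` argument are choices made for this illustration.

```python
import random


def is_probable_prime(n, rounds=20, rng=None):
    """Miller-Rabin test: False means certainly composite,
    True means prime with very high probability."""
    rng = rng or random.Random()
    if n < 2:
        return False
    # Quick exit for small primes and their multiples.
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # this base proves n is composite
    return True


if __name__ == "__main__":
    print(is_probable_prime(2**61 - 1), is_probable_prime(294409))
```

Each round catches a composite number with probability at least 3/4, so 20 independent rounds leave only a negligible chance of a wrong "prime" answer.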

Random numbers are also crucial to simulating very complex systems. When dealing with the climate or the economy, for example, so many factors interact in so many ways that the equations involve millions of variables. Today’s computers are not powerful enough to handle all these unknowns. Modeling this complexity with random numbers simplifies the calculations, and still results in accurate simulations.

Typing: A source of low-quality randomness. ROLENSFX/YouTube, CC BY-SA

But it turns out some – even most – computer-generated “random” numbers aren’t actually random. They can follow subtle patterns that can be observed over long periods of time, or over many instances of generating random numbers. For example, a simple random number generator could be built by timing the intervals between a user’s keystrokes. But the results would not really be random, because there are correlations and patterns in these timings, especially when looking at a large number of them.

Using this sort of output – numbers that appear at first glance to be unrelated but which really follow a hidden pattern – can weaken polls’ accuracy and communication secrecy, and render those simulations useless. How can we obtain high-quality randomness, and what does this even mean?

Randomness quality

To be most effective, we want numbers that are very close to random. Suppose a pollster wants to pick a random congressional district. As there are 435 districts, each district should have one chance in 435 of being picked. No district should be significantly more or less likely to be chosen.

Low-quality randomness is an even bigger concern for computer security. Hackers often exploit situations where a supposedly random string isn’t all that random, like when an encryption key is generated with keystroke intervals.

Radioactive decay: Unpredictable, but not efficient for generating randomness. Inductiveload

It turns out to be very hard for computers to generate truly random numbers, because computers are just machines that follow fixed instructions. One approach has been to use a physical phenomenon a computer can monitor, such as radioactive decay of a material or atmospheric noise. These are intrinsically unpredictable and therefore hard for a potential attacker to guess. However, these methods are typically too slow to supply enough random numbers for all the needs computers and people have.

There are other, more easily accessible sources of near-randomness, such as those keystroke intervals or monitoring computer processors’ activity. However, these produce random numbers that do follow some patterns, and at best contain only some amount of uncertainty. These are low-quality random sources. They’re not very useful on their own.

What we need is called a randomness extractor: an algorithm that takes as input two (or more) independent, low-quality random sources and outputs a truly random string (or a string extremely close to random).

Constructing a randomness extractor

Mathematically, it is impossible to extract randomness from just one low-quality source. A clever (but by now standard) argument from probability shows that it’s possible to create a two-source extractor algorithm to generate a random number. But that proof doesn’t tell us how to make one, nor guarantee that an efficient algorithm exists.

Until our recent work, the only known efficient two-source extractors required that at least one of the random sources actually had moderately high quality. We recently developed an efficient two-source extractor algorithm that works even if both sources have very low quality.
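For a feel of what a two-source extractor looks like, here is a classical textbook construction, the inner product of two bit strings modulo 2, usually attributed to Chor and Goldreich. It is not the algorithm described in this article and it belongs to the older family that needs fairly high-quality inputs. The helper `output_bias` and the toy sources are hypothetical, added only to show how the quality of the output can be measured.

```python
from fractions import Fraction
from itertools import product


def inner_product_bit(x_bits, y_bits):
    """One output bit: the inner product of two equal-length bit strings, mod 2."""
    if len(x_bits) != len(y_bits):
        raise ValueError("sources must have the same length")
    return sum(a & b for a, b in zip(x_bits, y_bits)) % 2


def output_bias(xs, ys):
    """Distance of the output bit from a fair coin when x is uniform over xs
    and y is independently uniform over ys."""
    ones = sum(inner_product_bit(x, y) for x in xs for y in ys)
    return abs(Fraction(ones, len(xs) * len(ys)) - Fraction(1, 2))


if __name__ == "__main__":
    all_strings = list(product((0, 1), repeat=4))
    # A perfectly random first source paired with a fixed second string.
    print(output_bias(all_strings, [(1, 0, 1, 1)]))  # 0: a fair bit
    # A first source with no randomness at all.
    print(output_bias([(0, 0, 0, 0)], all_strings))  # 1/2: always the same bit
```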

Our algorithm for the two-source extractor has two parts. The first part uses a cryptographic method called a “nonmalleable extractor” to convert the two independent sources into one series of coin flips. This allows us to reduce the two-source extractor problem to solving a quite different problem.

Suppose a group of people want to collectively make an unbiased random choice, say among two possible choices. The catch is that some unknown subgroup of these people have their heart set on one result or the other, and want to influence the decision to go their way. How can we prevent this from happening, and ensure the ultimate result is as random as possible?

The simplest method is to just flip a coin, right? But then the person who does the flipping will just call out the result he wants. If we have everyone flip a coin, the dishonest players can cheat by waiting until the honest players announce their coin flips.

A middling solution is to let everyone flip a coin, and go with the outcome of a majority of coin flippers. This is effective if the number of cheaters is not too large; among the honest players, the number of heads is likely to differ from the number of tails by a significant amount. If the number of cheaters is smaller than that difference, then they won’t be able to affect the outcome.
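How much the cheaters can tilt a majority vote can be worked out exactly. The sketch below assumes the worst case described above: the cheaters wait for the honest flips and then all call "heads." The function names are made up for this example, and a tie is counted as "tails."

```python
from fractions import Fraction
from math import comb


def majority(flips):
    """Return 1 (heads) if strictly more than half of the flips are heads, else 0."""
    return int(2 * sum(flips) > len(flips))


def chance_cheaters_get_heads(n_honest, n_cheaters):
    """Probability of a heads majority when every cheater votes heads and
    each honest player flips a fair coin."""
    total = n_honest + n_cheaters
    favorable = sum(
        comb(n_honest, heads)
        for heads in range(n_honest + 1)
        if 2 * (heads + n_cheaters) > total
    )
    return Fraction(favorable, 2 ** n_honest)


if __name__ == "__main__":
    for cheaters in (0, 1, 5, 10, 20):
        print(cheaters, float(chance_cheaters_get_heads(101 - cheaters, cheaters)))
```

With no cheaters and an odd number of players the result is exactly one half; the probability climbs toward 1 as the cheaters' share of the group grows.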

Protecting against cheaters

We constructed an algorithm, called a “resilient function,” that tolerates a much larger number of cheaters. It depends on more than just the numbers of heads and tails. A building block of our function is called the “tribes function,” which we can explain as follows.

Suppose there are 44 people involved in collectively flipping a coin, some of whom may be cheaters. To make the collective coin flip close to fair, divide them into 11 subgroups of four people each. Each subgroup will call out “heads” if all of its members flip heads; otherwise it will say “tails.” The tribes function outputs “heads” if any subgroup says “heads”; otherwise it outputs “tails.”

The tribes function works well if there is just one cheater. This is because if some other member of the cheater’s subgroup flips tails, then the cheater’s coin flip doesn’t affect the outcome. However, it works poorly if there are four cheaters, and if those players all belong to the same subgroup. For then all of them could output “heads,” and force the tribes function to output “heads.”
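The tribes function for the 44-person example is short enough to write out. The sketch below encodes heads as 1 and tails as 0 and assumes players are grouped in the order they appear in the list; the helper `chance_of_heads` is a hypothetical addition that shows why 11 groups of four give a nearly fair result when everyone is honest.

```python
from fractions import Fraction


def tribes(flips, group_size=4):
    """Return 1 (heads) if any subgroup flipped all heads, else 0 (tails)."""
    groups = [flips[i:i + group_size] for i in range(0, len(flips), group_size)]
    return int(any(all(group) for group in groups))


def chance_of_heads(n_groups=11, group_size=4):
    """Probability that tribes outputs heads when all players flip fair coins."""
    p_group_all_heads = Fraction(1, 2 ** group_size)
    return 1 - (1 - p_group_all_heads) ** n_groups


if __name__ == "__main__":
    print(float(chance_of_heads()))  # a little above one half, about 0.51
    # Four cheaters in the same subgroup force heads no matter what others flip.
    print(tribes([1, 1, 1, 1] + [0] * 40))  # 1
```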

To handle many cheaters, we build upon work of Miklos Ajtai and Nati Linial and use many different divisions into subgroups. This gives many different tribes functions. We then output “heads” if all these tribes functions output “heads”; otherwise we output “tails.” Even a large number of cheaters is unlikely to be able to control the output, ensuring the result is, in fact, very random.

Our extractor outputs just one almost random bit – “heads” or “tails.” Shortly afterwards Xin Li showed how to use our algorithm to output many bits. While we gave an exponential improvement, other researchers have further improved our work, and we are now very close to optimal.

Our finding is truly just one piece of a many-faceted puzzle. It also advances an important field in the mathematical community, called Ramsey theory, which seeks to find structure even in random-looking objects.

David Zuckerman receives or received funding from the National Science Foundation, the Simons Foundation, the U.S.-Israel Binational Science Foundation, Microsoft Research, the Institute for Advanced Study, the John S. Guggenheim Memorial Foundation, the Radcliffe Institute for Advanced Study, the David and Lucile Packard Foundation, the Alfred P. Sloan Foundation, and the Texas Higher Education Coordinating Board.

Eshan Chattopadhyay receives or received funding from the National Science Foundation; University of Texas at Austin; Microsoft Research; the Institute for Advanced Study, Princeton; and the Simons Foundation.

Are Powerball drawings and ‘Quick Pick’ numbers really random?

2016-01-13T18:35:42Z

The math behind all the discussion of tonight’s Powerball drawing assumes true randomness – equal likelihood for each number to be chosen, both in the drawing itself and, crucially, in the process of assigning “Quick Picks” to ticket buyers who don’t wish to choose their own numbers.

Are those assumptions reasonable?

Imagine a bag filled with 10 red marbles and 20 blue marbles. Close your eyes, reach into the bag and pull out a marble. You might call your selection random, but more importantly, the choice of red or blue is not equally likely.

In the Powerball drawing, winning numbers are selected from two clear containers: one container has 69 white colored balls with each ball numbered in black ink with an integer from 1 to 69. The other container contains 26 red balls with each ball numbered in black ink with an integer from 1 to 26.

The balls are dropped into the respective containers and then mixed in the container by what appears to be air injected from the bottom of the container. The air is then turned off and a ball is raised from the bottom via a platform and then removed from the container. This procedure is repeated for the selection of each ball (five white and one red, the “Powerball”). Generally speaking, it seems reasonable that each ball is equally likely to be selected by this process.

It is possible – though it’s a stretch – that balls with printed numbers requiring more ink to delineate the number on the ball may weigh more due to the extra ink than balls requiring less ink. Coupled with gravity, this may be enough to keep those balls lower in the container and thus more likely to be picked by the platform. In short, the ball marked 68 may be more likely to be picked than the ball marked 1.

Luckily, this is a testable assumption. Studying the results from previous drawings would allow an assessment of whether each number is occurring with similar frequency. Even without collecting the data and doing the statistical calculation, the nature of this device for generating balls/numbers makes it safe to assume that the process generates each number with equal probability.
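A frequency check of this kind is usually done with a chi-square goodness-of-fit statistic. The sketch below shows the calculation on simulated drawings, not on real Powerball results, and the function names are invented for this example. With real data, the statistic for the 69 white balls would be compared with a chi-square distribution with 68 degrees of freedom; because five balls are drawn without replacement, that reference distribution is only approximate.

```python
import random
from collections import Counter


def chi_square_uniform(counts):
    """Chi-square statistic for the hypothesis that all categories are equally likely."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


def simulated_white_ball_counts(n_drawings, seed=0):
    """Tally how often each white ball (1..69) appears in simulated fair drawings."""
    rng = random.Random(seed)  # simulated data only, not real lottery results
    tally = Counter()
    for _ in range(n_drawings):
        tally.update(rng.sample(range(1, 70), 5))
    return [tally[number] for number in range(1, 70)]


if __name__ == "__main__":
    counts = simulated_white_ball_counts(1000)
    print(chi_square_uniform(counts))
```

A statistic far larger than the number of categories would suggest that some balls come up more often than chance allows.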

Evaluating the “Quick Pick” numbers is more challenging. Without a machine to generate numbers with plastic balls, lottery machines nationwide have been generating numbers for ticket buyers in ways that may not give each number exactly equal chances of being chosen.

The potential problems come from the fact that computers are devices programmed by humans and so, almost paradoxically, they must be given a systematic method to choose random numbers. In computer programming terminology, this is often called generating a “pseudo random” number.

In this process, the computer may use some information, such as the computer’s real-time clock with precision to a millisecond, at the time that a request for a lottery ticket was made, to trigger a process that draws five numbers and one Powerball number. This beginning number is often called the “seed.” Other seeds may be created from different phenomena that presumably occur without reason or predictability. From those seeds, additional calculations generate numbers at rates that approximate randomness.
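The role of the seed is easy to see in code. The sketch below is a hypothetical Quick Pick, not the method any lottery terminal actually uses: it feeds a seed to Python's built-in generator (a Mersenne Twister, which is not suitable for security purposes) and draws five white numbers and one red one. The function name `quick_pick` is made up for this example.

```python
import random
import time


def quick_pick(seed):
    """Draw five distinct white numbers (1..69) and one red number (1..26)
    from a pseudo-random generator started at `seed`."""
    rng = random.Random(seed)
    whites = sorted(rng.sample(range(1, 70), 5))
    red = rng.randint(1, 26)
    return whites, red


if __name__ == "__main__":
    # The same seed always reproduces the same ticket.
    print(quick_pick(12345) == quick_pick(12345))  # True
    # A clock-based seed, in milliseconds, as described in the text.
    print(quick_pick(int(time.time() * 1000)))
```

Because the output is fully determined by the seed, anyone who can guess the seed can reproduce the ticket, which is why the choice of seed matters.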

The randomness of these machines’ results can also be tested, but with more difficulty: it involves either buying large numbers of “Quick Pick” tickets or collecting ticket information from a large number of people. Analyzing the frequencies of the numbers that were generated would reveal the degree of randomness of the Quick Pick process.

Without these data, it can be illuminating to look at the number of Powerball tickets sold and the percentage of the 292,201,338 possible combinations that are covered by those tickets. These data strongly suggest that the Powerball computers are generating combinations with equal probability and thus at random.
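The count of possible combinations, and the coverage one would expect if tickets were spread uniformly at random, can both be computed directly. The sketch below assumes every ticket is an independent, equally likely combination; the function name and the ticket counts in the loop are hypothetical values for illustration, not sales figures.

```python
import math

WHITE_BALLS, WHITE_PICKS, RED_BALLS = 69, 5, 26

# Five white balls chosen from 69 (order ignored), times 26 red balls.
TOTAL_COMBINATIONS = math.comb(WHITE_BALLS, WHITE_PICKS) * RED_BALLS


def expected_coverage(tickets_sold):
    """Expected fraction of combinations held by at least one ticket,
    assuming tickets are independent and uniformly random."""
    # 1 - (1 - 1/N)**T, written with log1p/expm1 for numerical accuracy.
    return -math.expm1(tickets_sold * math.log1p(-1 / TOTAL_COMBINATIONS))


if __name__ == "__main__":
    print(TOTAL_COMBINATIONS)  # 292201338
    for tickets in (100_000_000, 300_000_000, 600_000_000):  # hypothetical counts
        print(tickets, round(expected_coverage(tickets), 3))
```

If the coverage reported for a drawing is close to this expected value, that is consistent with combinations being generated with equal probability; coverage well below it would point to some combinations being favored.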

In conclusion, it appears we have both mechanisms operating randomly and are free to compute the odds of winning, the probability that there’s at least one winner, and, most importantly, our expected profits.

Jeffrey Miecznikowski does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond their academic appointment.