Explainer: what is a null hypothesis?

When it comes to the crunch, the null hypothesis is the only one being tested. Pimthida

At the heart of the scientific method is the process of hypothesis testing. Given an observable phenomenon in the world, a scientist will construct a hypothesis which seeks to explain that phenomenon.

Hypothesis testing is used by pharmaceutical companies to ascertain whether a drug is effective against a certain disease, by neuroscientists to determine whether neuroplasticity-based therapy helps stroke patients, by advertising businesses to decide whether a new campaign is worthwhile, and so on.

The way hypothesis testing works is by setting up two opposing hypotheses. One, the “null hypothesis”, is the reference or baseline hypothesis.

If the null hypothesis is supported, nothing unusual is going on; the factor under investigation has no explanatory power; the drug being tested has no effect; the advertising campaign doesn’t work.

But don’t be misled – this hypothesis is crucial. In reality it is the only hypothesis actually being tested.

The other side of the coin is the alternative hypothesis: the interesting and challenging contender, the hypothesis that may lead to new discoveries, decisions and advances. The drug that’s been tested does work; the advertising campaign is a smash hit.

Sleep study

Consider the fictional characters Dr Nool and Dr Altman, two researchers collaborating in the study of sleep disorders. One night, Dr Altman falls asleep on his couch after eating an apple. The following morning he knocks on Dr Nool’s door and asks: “What if apples could ease insomnia?”

To turn this simple question into something worth investigating scientifically, the doctors come up with the following hypotheses:

  • Null Hypothesis: eating apples does not improve sleep quality
  • Alternative Hypothesis: eating apples does improve sleep quality

A simple test would consist of splitting sleep disorder patients randomly into two groups. Patients in Group A (for “apple”) could be asked to eat an apple after dinner for a given period of time, while patients in Group C (for “control”) would be asked to eat a piece of fruit other than an apple.

Note the importance of being specific with the hypotheses. The question was whether apples, not fruit in general, could be used to tackle sleep disorders. Consequently it is important to ensure the two groups differ only in the factor under study (the effects of eating an apple).
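
As a rough illustration of the random split described above, the following Python sketch shuffles a list of patients and deals them into Group A and Group C. The function name `assign_groups`, the numeric patient IDs and the fixed seed are assumptions made for this example; they are not part of the doctors' fictional protocol.

```python
import random


def assign_groups(patient_ids, seed=None):
    """Randomly split patients into Group A (apple) and Group C (control).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)      # a seed makes the split reproducible
    shuffled = list(patient_ids)   # copy so the caller's data is untouched
    rng.shuffle(shuffled)          # chance, not the researcher, decides who goes where
    half = len(shuffled) // 2
    return {"A": shuffled[:half], "C": shuffled[half:]}


# Eight made-up patient IDs, four per group
groups = assign_groups(range(1, 9), seed=42)
print(groups)
```

Letting a random number generator decide the split means that any other difference between patients (age, diet, severity of insomnia) is spread across both groups by chance rather than by the researchers' choices.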

Sample size

Hypothesis testing is essentially a statistical procedure that calculates probabilities. It is easy to see that a crucial aspect of any test is to use a large sample size.

Imagine that in our apple enquiry, there are only four patients in each group. Three patients in Group A (75%), compared to two in Group C (50%), report improvements to their insomnia after eating fruit.

Can we conclude that eating apples works to ease insomnia? Not really.

Chance alone can explain those results. But if the trial were run with 4,000 patients in each group and found the same proportions, we would be much more confident in the “apple effect”.
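
To put a number on “chance alone”, one conventional option is a one-sided Fisher exact test. This is an assumption made for illustration, since no specific test is named here. The sketch asks: if apples did nothing, and the patients who improved would have improved in either group, how often would a random split put at least this many of them in Group A? The function name `one_sided_fisher_p` is hypothetical.

```python
from math import comb


def one_sided_fisher_p(improved_a, n_a, improved_c, n_c):
    """Probability, under the null hypothesis, that Group A contains at
    least `improved_a` of the patients who improved.

    Hypothetical helper: under the null, the set of improvers is fixed and
    only the random split into groups varies (hypergeometric distribution).
    """
    total_improved = improved_a + improved_c
    total = n_a + n_c
    not_improved = total - total_improved
    max_in_a = min(n_a, total_improved)
    # Count the splits that are at least as favourable to apples as the one observed
    favourable = sum(
        comb(total_improved, k) * comb(not_improved, n_a - k)
        for k in range(improved_a, max_in_a + 1)
    )
    return favourable / comb(total, n_a)


# Four patients per group: 3 of 4 improve on apples, 2 of 4 on other fruit
p_small = one_sided_fisher_p(3, 4, 2, 4)

# Same proportions with 4,000 patients per group
p_large = one_sided_fisher_p(3000, 4000, 2000, 4000)

print(p_small, p_large)
```

With four patients per group the probability comes out at 0.5: a split at least this favourable to apples would turn up half the time even if apples did nothing. With 4,000 patients per group and the same proportions, the probability is vanishingly small.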

(As an aside, next time you see a statement such as “nine out of ten doctors recommend …” refer to the small print and ask yourself how these doctors were sampled.)

Objectivity and scientific bias

Those who develop and test hypotheses usually hope that the null hypothesis is rejected and their alternative hypothesis supported – that the drug they’re testing is effective; that the campaign they’re running is a success; that light is bent by gravity as predicted by Newtonian physics and Einstein’s theory of relativity …

But this desire should have no bearing on the test. The design of the experiment and the subsequent data analyses should be completely objective and unbiased.

The test must be as conservative as possible because the stakes are high: understanding nature, improving people’s health and safety, the future of a company or a new technology.

It might seem obvious, but until proven otherwise, the null hypothesis is presumed to be true. As one classic textbook on statistics nicely puts it:

As in a jury trial, the burden of proof rests with the alternative hypothesis; innocent until proven guilty … When you test a hypothesis, you must act as judge and jury, but you are not the prosecutor.

Hungry for more?

So here’s a more immediate example of an alternative hypothesis: “reading this article will get you closer to the ‘two-fruit-and-five-vegies’ rule today”. C’mon, admit it: after so many references to apples and Newton you must be nibbling on some fruit by now, or at least craving it.

The accompanying null hypothesis would be something along the lines of: “reading this article has no effect on your cravings for fruit”.

And so you, dear reader, get the final say. Should we reject the null hypothesis above? You can post a comment with your reply. Once I have a large enough sample size I will look at the data.

Until then, I’ll withhold judgement.

Join the conversation

34 Comments

  1. Geoffrey Edwards

    logged in via email @gmail.com

    No discernable fruit lust here.

    I am feeling sleepy, though...

  2. Sue Ieraci

    Public hospital clinician

    Good article. Let's have more about research methods.

  3. Tim van Gelder

    University of Melbourne

    This article fails to mention that NHST (null hypothesis significance testing) is highly problematic, and that it is giving way to the "new statistics", which emphasizes estimation, effect sizes, confidence intervals and meta-analysis. For more on this see Geoff Cumming, Understanding the New Statistics, Routledge 2012.

    1. Michael J. Lew

      Senior Lecturer, Pharmacology and Therapeutics at University of Melbourne

      In reply to Tim van Gelder

      Tim, the problems that you refer to have two main components. The first is confusion between, and hybridisation of, the Neyman-Pearsonian error-decision framework (what is properly described as hypothesis testing), and Fisher's significance testing (which has more in common with estimation than with N-P). That problem should be fixed by better instruction, but that might be difficult in the circumstance where conventional statistics textbooks encourage the use of incoherent hybrid thinking.

      The…

    2. Paco Garcia-Gonzalez

      Ramon & Cajal Researcher at Spanish Scientific Research Council CSIC

      In reply to Tim van Gelder

      Thanks Tim and Michael for your comments. I specifically avoided talking about p values, significance thresholds, and alternative statistic frameworks for the very same reasons you outline and for space limitations. The aim was to provide a simple overview of the concepts without entering too much into statistical inference. But I agree with you both. In fact, in my field, as in many others, we are witnessing a shift towards a greater focus on effect sizes and confidence intervals rather than on specific p values (as well as an increased use of Bayesian statistics rather than classical/frequentist approaches; see for instance the two references below).

      Nakagawa, S. & Cuthill, I. C. 2007. Effect size, confidence interval and statistical significance: a practical guide for biologists. Biological Reviews 82 (4):591-605.

      Garamszegi et al. 2009. Changing philosophies and tools for statistical inferences in behavioral ecology. Behavioral Ecology 20 (6):1363-1375.

  4. Dale Bloom

    Analyst

    Does eating an apple help someone to sleep?

    Apples are high in fibre, which helps fill the stomach, which helps someone to sleep if they have had little to eat.

    So eating an apple can indirectly but not directly help someone to sleep.

    Well done Dale.

    1. Paco Garcia-Gonzalez

      Ramon & Cajal Researcher at Spanish Scientific Research Council CSIC

      In reply to Dale Bloom

      Well, here you’ve got a few formal hypotheses in the making! The answers lie in hypothesis testing! That’s the way science works, building upon previous knowledge. In a way, today’s alternative hypotheses are tomorrow’s null hypotheses. In our specific example, imagine we find that eating an apple works wonders for insomniacs. This is a WHAT question. Next we could try to answer a WHY question. Why does eating an apple ease insomnia? Can it be because apples help fill the stomach? Let’s then devise an experiment to test this alternative hypothesis (the null hypothesis can be that the apple effect is not simply caused by the filling effect – a more challenging test). Then we could move to answer a HOW question, for instance, what are the physiological mechanisms involved in the filling effect? Note that you would need several tests to disentangle the filling and apple effects. It might well be that the two effects are independent, or that the filling only occurs when eating apples.

  5. Chloe Adams

    writer

    It's not as simple as listing what the null is and what it isn't. Research is not that simple and null hypothesis testing is riddled with flaws. There are a minority of psychologists that have been requesting changes for decades, but this form of testing still continues and frankly, it put me off continuing with psychology (at honours level).
    Also, it's not just about sample size, rather the treatment of the hypothesis as an object rather than knowledge is problematic. Science is supposed to be about…

    1. Michael J. Lew

      Senior Lecturer, Pharmacology and Therapeutics at University of Melbourne

      In reply to Chloe Adams

      Chloe, you are correct that NHST is often a hybrid and that theoreticians were at odds (at war might be closer), but while Pearson seems to have moved from his earlier joint position with Neyman, the main conflict was between Neyman and Fisher.

      The core of the conflict, in my opinion, rests on the distinction between an automatic decision system championed by Neyman and Pearson and the evidence assessment tool of Fisher. Fisher repeatedly derided the hypothesis testing framework as being appropriate…

  6. Peter Sommerville

    Scientist & Technologist

    Paco,

    As a fellow statistician I really enjoyed this article. A simple but very simple explanation of the fundamentals of experimental design and data analysis. It will probably not appeal to many, but an understanding of the principles you have so succinctly described would certainly enhance many of the exchanges that occur in TC. Thanks.

    1. Paco Garcia-Gonzalez

      Ramon & Cajal Researcher at Spanish Scientific Research Council CSIC

      In reply to Peter Sommerville

      Thanks Peter. That was the point, to expose the basics of hypothesis testing. I am glad that all the points above are being discussed here because this way the readers can know more about important aspects surrounding hypothesis testing, which could not be covered in a short note.

  7. account deleted

    logged in via email @gmail.com

    I don't like apples...

    I did like the article though.

  8. Gil Hardwick

    anthropologist, historian, novelist, editor and publisher at eBooks West

    My view is that students are far better introduced to logical truth tables, to constructing sentences and knowing what a question is asking, to independently observing natural phenomena and establishing the validity and reliability of their own observations, which can be done in Year 5, at age 8, than by hypothesising on the basis of some otherwise unfounded challenge, too often these days political.

    The current 'scientific' regime disempowers the natural observer, the logical and reasonable thinker, the intelligent rejoinder, the simple wit.

    Sad days.

    1. Stephen Prowse

      Research Advisor

      In reply to Gil Hardwick

      This seems to smack of "truthiness". If we do not base decisions on data and the analysis of data we fall back on the intelligent rejoinder, ideology and "truthiness". Yes, the wrong question might be asked, poor data collected and the wrong analysis be used but the current scientific regime will reveal that in time.

      Hence the current regime (if you can get into it) empowers the observer and the logical thinker.

      Sad as it is, the realities of politics overlay all that we do but must not stop research rigour.

  9. Gavin Moodie

    Adjunct professor at RMIT University

    I agree with Tim van Gelder: testing null hypotheses tells one whether there is an effect, but not whether it is material or important. In this example, apples could reduce insomnia but could do so by only 0.01%. Rejecting the null hypothesis says nothing about whether it is worth eating an apple to cure insomnia.

    For brief and accessible critiques of null hypotheses and suggestions of alternatives see:

    Cumming, Geoff (2011) Significant does not equal important: why we need the new statistics…

    1. Paco Garcia-Gonzalez

      Ramon & Cajal Researcher at Spanish Scientific Research Council CSIC

      In reply to Gavin Moodie

      Many thanks for pointing this out and for the reference; the title of the paper says it all. Just a note. In the apple-sleep survey one could ask the patients to report a YES if their sleep hours are doubled and a NO otherwise. In this case any significant p value would speak of a huge effect. This rather silly example illustrates that testing the null hypothesis and looking just at the p value can still inform on the existence of the effect and on its magnitude; it all depends on the particulars of the design and data collection. In any case I totally agree that evaluating effect size is a must. It is also worth mentioning at this point the distinction between the size or magnitude of the effect and its importance. For instance, a 0.01% change would be hardly relevant in the apple-sleep study, but it would be quite important in a test about harmful effects of radiation.

  10. Phillip Ebrall

    Professor of Chiropractic at Central Queensland University

    Paco - thanks for this 'simple on the surface' piece which actually says a hell of a lot to early career researchers. As a long-term Chair of Human Ethics Research Committees, and a grant reviewer etc etc, nothing bugs me more than seeing a proposal come forward where the novice says they will 'prove' something. All they can show is that the status quo either does not change, or does change. If the null hypothesis is rejected, then the real fun actually begins, as other astute readers have commented, to sort out why.

  11. Jacob Pogson

    logged in via email @sydney.edu.au

    Did Paco mention the 'two-fruit-and-five-vegies' rule because we know it to be a good suggestion, with lots of data to suggest it is healthy?

    Since the article has been discussing the null hypothesis and the alternative, and we know that rule is a pretty good rule, if we do not follow it then we are in fact subtly advocating that the 'two-fruit-and-five-vegies' rule is not a good rule.

  12. Lyn Gain

    Publisher at Valentine Press

    Well done Paco - I can use this with my students. It is interesting that all except one of the other commenters chose to ignore your request for a little empirical research, albeit with an unscientific sample. Here's my data: No, the article hasn't affected either my fruit or vegie craving - in fact the tomato sandwich I ate just before reading it is sitting rather heavily on my tummy. Sorry!

  13. Derek Bolton

    Retired s/w engineer

    Very disappointed that there's not a mention of the fundamental flaw in NHST, that it tells you only the probability of the outcome given the (null) hypothesis, not what you really want to know: the probability of the hypothesis given the outcome. And I've not found a single reference to Bayesian analysis amongst the comments so far.
    To illustrate the flaw, suppose you were to test 1000 silly hypothesis pairs: Venus at perihelion helps you sleep, red pyjamas help you sleep, a coin under the mattress…

    1. Paco Garcia-Gonzalez

      Ramon & Cajal Researcher at Spanish Scientific Research Council CSIC

      In reply to Derek Bolton

      Excellent points Derek. I commented on Bayesian inference in a comment below (Reply to Tim’s and Michael’s comments) and on the key issue of significance thresholds and assessment of importance of effect sizes in another one (Reply to Gavin’s post)

  14. Maxwell Walsh

    Consultant at Education

    I prefer my apple to be peeled, and cut up into bite-sized pieces. I also prefer the "Golden Delicious" variety.

    Does this affect the experimental activity, or is it just "apple-ness" being tested?

    BTW Paco - I felt absolutely no hunger pangs at all.

  15. Joe Gartner

    Eating Cake

    Excuse my ignorance of scientific method, but would not the control group C have been better off having no fruit rather than a piece of non-apple?
    Would not the possibility of sleep hindering or promoting qualities in the control fruit affect the study?

    1. Paco Garcia-Gonzalez

      Ramon & Cajal Researcher at Spanish Scientific Research Council CSIC

      In reply to Joe Gartner

      Hi Joe. Your first comment hits on the need to be transparent about the hypotheses that are tested and on the importance of experimental design and setting of appropriate controls. For instance, a statement such as “80% of dentists think that using toothpaste X is good for your gums” is not very informative, among other reasons because brushing your teeth with any toothpaste Y, Z, etc. can be beneficial too. The set of null/alternative hypotheses defines the control and treatment groups and vice-versa. In the article the test deals with eating apple versus eating any other kind of fruit. That is, the test tries to isolate the “apple effect” rather than examining a general “fruit effect”. But other tests are possible. It depends on the questions, assumptions, expectations…
      Your second point is very good too. Nothing is easy and one has to think carefully about causation and all the factors that can influence the results.

  16. Danilla Grando

    Lecturer Clinical Microbiology

    Sorry, at this time of the year I can only think of chocolate!

    Thank you for your article, with the AQF in our minds we are currently trying to find ways that will help our students with learning objectives such as the scientific method. Your article will be a great discussion piece for our students.

  17. John Kelmar

    Small Business Consultant

    The trouble with the null hypothesis is that one is not testing what they wish to examine. A far better research technique is the phenomenological approach, which examines what is actually happening and then develops a theory to match the results. This is a more open-ended approach and not for the faint-hearted as one is not sure what the results may be, and one could find that they may end up in an area of which they have little knowledge - heaven forbid.
    Unfortunately the University system has grown from research using the null hypothesis, and any person who dares to be different is rejected from the University community, rather than being embraced for finding a new approach.

    1. Sue Ieraci

      Public hospital clinician

      In reply to John Kelmar

      John Kelmar - there are many different research methodologies that are suited to answering different types of research questions. There are also different ways of expressing the null hypothesis - as illustrated in the last paragraphs of the article.

      In comparing two strategies, for example, the research question, expressed in the null hypothesis would be: that there is no difference between the two approaches. The study design then sets out to look for the differences.

      There is no intrinsic "trouble" with the null hypothesis - it just has to be applied appropriately. Your "phenomenological approach" is not "better" - it is just a research method. The null hypothesis is not a research method - it is a way of phrasing the research question.

  18. Dania Ng

    Retired factory worker

    A great article, thanks Paco, I wish TC would be more about this kind of stuff. A clear explanation, which nicely demystifies the concept. I am not saying that the critical points raised in other comments here are irrelevant, but you're right in leaving them out since you have clearly focused on explaining something specific within complex research methods - one needs to begin with the concepts, and then build on them. Elementary stuff when attempting to get the formative stuff right; e.g., see Bloom's taxonomy of learning domains. I am going to refer to your article as a recommended reading for my students ... err, my fellow workers (at the factory :)

  19. Samantha Connelly

    logged in via LinkedIn

    Reading this article did make me think of eating fruit but it is the time of year in Australia when I often think about eating fruit. All of the lovely varieties of stone fruits and berries are in season and I have been thinking about the wonderful Tasmanian cherries that I ate yesterday since I finished eating them. I find I generally think about how I will get my ‘two-fruit-and-five-vegies’ and this article made me reflect on that fact. I do not think this article made me think of it more often than usual but made me wonder if you were trying to put ideas in my head so I would agree with your hypothesis ;)
