David Balding

Professor of Statistical Genetics, The University of Melbourne

Since finishing my PhD I have worked to develop and apply mathematical, statistical and computational methods and ideas in genetics. I have contributed to population, evolutionary, medical and forensic genetics.

In forensic genetics, my principal contribution has been to develop methods that allow for coancestry effects in the interpretation of DNA profiles. Match probability formulae incorporating coancestry coefficients are often called the "Balding-Nichols formulae" following our 1994 paper. More recently I have developed methods for the interpretation of low-template DNA profile evidence, initially in collaboration with John Buckleton of ESR New Zealand (2009 paper).
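
The coancestry-adjusted match probabilities can be sketched in a few lines of Python. The function below is illustrative only: its name and arguments are hypothetical, and it uses the commonly quoted single-locus form of the formulae, with theta the coancestry coefficient and p_a, p_b the population proportions of the alleles in the profile.

```python
def match_probability(p_a, p_b=None, theta=0.0):
    """Single-locus match probability with a coancestry correction.

    p_a, p_b : population allele proportions (p_b=None means a homozygote).
    theta    : coancestry coefficient (Fst); theta=0 recovers the
               Hardy-Weinberg values p_a**2 and 2*p_a*p_b.
    Hypothetical helper for illustration, not code from the 1994 paper.
    """
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    denom = (1.0 + theta) * (1.0 + 2.0 * theta)
    if p_b is None:
        # Homozygous profile: both alleles are of the same type.
        num = (2.0 * theta + (1.0 - theta) * p_a) * (3.0 * theta + (1.0 - theta) * p_a)
    else:
        # Heterozygous profile: two distinct allele types.
        num = 2.0 * (theta + (1.0 - theta) * p_a) * (theta + (1.0 - theta) * p_b)
    return num / denom
```

For rare alleles, a positive theta raises the match probability above the Hardy-Weinberg value, which is the direction that favours the defendant.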

The forensic match formulae are based on the multinomial-Dirichlet distribution, which I developed and applied to subpopulation allele counts. This led to the first satisfactory definition of the coancestry coefficient (Fst, or theta), and a likelihood-based approach to its estimation. Embedding this distribution in a hierarchical model allowing for subpopulation and locus effects, Mark Beaumont and I developed, in our 2004 paper, a widely used approach for detecting loci subject to selection (evidenced by unusually high or low variation across subpopulations). The beta-binomial (or, more generally, multinomial-Dirichlet) distribution, when used to simulate subpopulation allele frequencies, is sometimes called the "Balding-Nichols model" following our 1995 paper.
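
As a rough illustration of the simulation model for a biallelic locus, the sketch below draws subpopulation allele frequencies from a beta distribution with mean p (the ancestral frequency) and variance Fst * p * (1 - p). The function name, arguments and seeding are assumptions made for this example, not part of any published software.

```python
import random


def simulate_subpop_freqs(ancestral_freqs, fst, n_subpops, seed=None):
    """Draw subpopulation allele frequencies for biallelic loci.

    Each frequency is Beta(p * (1 - fst) / fst, (1 - p) * (1 - fst) / fst),
    where p is the ancestral frequency at that locus (0 < p < 1).
    Returns a list of n_subpops rows, one frequency per locus.
    Hypothetical helper for illustration only.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("fst must lie strictly between 0 and 1")
    if any(not 0.0 < p < 1.0 for p in ancestral_freqs):
        raise ValueError("ancestral frequencies must lie strictly between 0 and 1")
    rng = random.Random(seed)
    scale = (1.0 - fst) / fst
    return [
        [rng.betavariate(p * scale, (1.0 - p) * scale) for p in ancestral_freqs]
        for _ in range(n_subpops)
    ]
```

Larger values of fst spread the subpopulation frequencies further from the ancestral value; as fst approaches zero they concentrate around it.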

In 2002 Mark and I, with Wenyang Zhang, wrote a foundational paper in the field of Approximate Bayesian Computation (ABC). We introduced a local linear regression adjustment, which has proved very useful. Perhaps just as importantly, we provided the first useful review of the method, which had been developed in stages by earlier authors, and promoted it as a powerful and flexible statistical technique.
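
The idea of the regression adjustment can be sketched for a single parameter and a single summary statistic. The code below is a simplified illustration, not the implementation from the 2002 paper: it keeps the simulations closest to the observed summary, weights them with an Epanechnikov kernel, fits a weighted linear regression of the parameter on the summary, and shifts each accepted parameter value to where it would lie at the observed summary. All names, the toy normal-mean model and the tuning values are assumptions chosen for the example.

```python
import random


def abc_regression(s_obs, prior_draw, simulate, n_sims=20000, accept_frac=0.05, seed=0):
    """Rejection ABC with a local linear regression adjustment (1-D sketch).

    Returns (adjusted, weights): adjusted parameter values and kernel weights.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        theta = prior_draw(rng)
        s = simulate(theta, rng)
        draws.append((abs(s - s_obs), theta, s))
    # Keep the fraction of simulations closest to the observed summary.
    draws.sort(key=lambda t: t[0])
    kept = draws[: max(2, int(accept_frac * n_sims))]
    delta = kept[-1][0]  # tolerance: largest accepted distance
    # Epanechnikov kernel weights, decreasing to zero at distance delta.
    w = [1.0 - (d / delta) ** 2 for d, _, _ in kept]
    x = [s - s_obs for _, _, s in kept]
    y = [theta for _, theta, _ in kept]
    # Weighted least squares fit of y = alpha + beta * x.
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    beta = sxy / sxx
    # Shift each accepted value to the observed summary (x = 0).
    adjusted = [yi - beta * xi for xi, yi in zip(x, y)]
    return adjusted, w


# Toy model: unknown mean of a normal with unit variance, 10 observations,
# summarised by the sample mean; uniform prior on (-5, 5).
def toy_prior(rng):
    return rng.uniform(-5.0, 5.0)


def toy_simulate(theta, rng):
    return sum(rng.gauss(theta, 1.0) for _ in range(10)) / 10.0


adjusted, weights = abc_regression(1.0, toy_prior, toy_simulate)
post_mean = sum(wi * ai for wi, ai in zip(weights, adjusted)) / sum(weights)
```

In this toy setting the weighted mean of the adjusted values approximates the posterior mean, and the adjustment lets a fairly generous tolerance be used without badly inflating the posterior spread.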

With Ian Wilson, I developed one of the first successful software packages (Batwing) for inferring the demographic history of populations by explicitly modelling the genetic ancestries of individuals sampled from those populations (1998 and 2003 papers). It has been widely used, particularly to model paternal lineages from Y-chromosome data.

I've written or co-authored a number of review papers that have proved popular: on statistical methods for genetic association studies (2006), on Bayesian methods, and on population structure and cryptic relatedness in genetic association (both 2009), and on genome-wide epigenetic studies (2011). Will Astle, working with me, developed a fast algorithm for mixed model analysis of genetic association studies, described in our 2009 review and available within the MixABEL section of the GenABEL R software.

I continue to apply statistics across a wide range of problems in genetics. I am involved in projects on statistical methods for pharmacogenetics, including genetic covariates in pharmacokinetic models; sequencing for rare variants in inherited cardiac conditions; genomic selection in crops; and breed identification in mixed-breed dogs. I also remain active in statistical methods for forensic DNA profiles.

Experience

–present

Chair in statistical genetics, UCL