The UK is a world leader in sequencing SARS-CoV-2, the virus that causes COVID-19. Of all the coronavirus genomes that have been sequenced in the world, nearly half have been sequenced by the COVID-19 Genomics UK Consortium (Cog-UK). The consortium began life on March 4, 2020, when Sharon Peacock, a professor of public health and microbiology at the University of Cambridge, emailed a handful of scientists and asked for their help. The Conversation spoke to Professor Peacock about that day and what happened after.
Q: When did you first get the idea to set up Cog-UK? And how was it formed?
In late February 2020, it dawned on me that we were going to need genome sequencing capabilities across the UK for the novel coronavirus. It was predictable that the virus was going to develop mutations that could become problematic.
On March 4, I emailed five colleagues, asking if they’d be interested in helping me set up a UK sequencing consortium. A week later we met at the Wellcome building on Euston Road in London with the aim of thrashing out a plan. We looked to draw in people who might be able to help us put together a blueprint and a network for sequencing in the UK.
There were about 20 people in the meeting. They were clinical virologists, experts in human genomes and pathogen genomes, epidemiologists and immunologists. During that day, we worked through what we thought an end-to-end sequencing pipeline would be, and we debated whether the sequencing would be centralised or distributed or both, and who would do what. By the end of the day, we had the blueprint.
The notes from the meeting were written up into a formal proposal for Sir Patrick Vallance, the UK government’s chief scientific adviser.
That’s unusual. With four public health agencies, lots of researchers from different institutions and the NHS involved, it would normally take a year or more to do something like that. But we just sat down and did it, and that’s how Cog-UK was born.
Q: How did you get funding?
The application was on Sir Patrick Vallance’s desk by March 15. He and Professor Chris Whitty, the chief medical officer for England, had what they called a “COVID-19 fighting fund”. They reviewed our proposal and strongly supported it.
I also contacted Sir Mike Stratton, director of the Wellcome Sanger Institute in Cambridge. I asked Mike if they could support us as they have the technology to do large-scale sequencing. He said yes, and since then, Sanger has contributed a great deal.
So at the outset, we got about £14.5 million from the government, plus in-kind funding from Sanger, which together came to a total of around £20 million.
We started on April 1, but we’d already done quite a lot of sequencing by then. About 260 coronavirus sequences were already in the bag.
Q: So the sequencing began even before Cog-UK was launched?
Because a lot of people had sequencing instruments and expertise, they had already started work. There are sequencing instruments in labs across the country. We hadn’t catalogued where or what at that stage. And, in fact, we weren’t particularly prescriptive about what types of sequencing instruments we asked the labs to use. People used what they thought worked well for them.
Q: How did other scientists react?
They were hugely supportive. Some people were worried that the virus would not accumulate enough mutations to make it worth our while. It would mean that we would end up sequencing the same virus over and over again because it only mutates once or twice a month. It could all have been a waste of time.
What we hadn’t bargained for was 100 million cases, and perhaps as many as a billion if you include undiagnosed cases. And each time the virus infects a person it has an opportunity to make a mistake in its genome.
We considered the risk of lack of genetic variation, but went ahead. What we did was rather bold at the time.
Q: How does it work in practice, from the time someone is swabbed to the time the sequence is uploaded onto the shared Gisaid database that holds all of the world’s sequences of SARS-CoV-2?
Laboratory testing for COVID-19 using the so-called PCR test in the UK is roughly divided into two testing pathways. If you are hospitalised with COVID-19, your sample will get tested in a local laboratory. We call that pillar one.
Cog-UK collects samples from about 90 different laboratories at the moment, which is quite a logistical challenge. These are sent to regional sequencing hubs that focus mostly on sequencing from their region. These are really important samples because they are from the sickest people with COVID-19.
Pillar two testing is done in the Lighthouse labs, which were set up to analyse community testing samples. These are sequenced at the Wellcome Sanger Institute.
We also provide sequencing to major government projects, like the Office for National Statistics study. And we support vaccine trials and the React study [a major programme of home testing for COVID-19 to track the progress of the infection across England].
We can’t sequence all of the positive samples at the moment. When we first started, we were aiming for a minimum of 10%. At the moment it’s under 10%, but we hope to get to around 20%, and we’ll build from there.
Q: And as a total of the viruses sequenced in the world, what proportion is Cog-UK sequencing, and how does it compare with other countries?
We have sequenced about 45% to 48% of all SARS-CoV-2 genomes in the Gisaid database.
Q: Given the importance of tracking mutations, are other countries starting to increase their sequencing efforts?
Yes. The country where I think we will see a big shift is the US because of all the changes they’re making in their response to the pandemic. I would anticipate quite a few other countries beginning to come up, too. I know that Germany is looking to increase its sequencing capacity. But there are some really big gaps in the map.
Q: Worrying coronavirus variants have been widely reported on in the last few months. The so-called “UK variant”, B117, was raised as a concern in November, but the sample was from September. Is that right?
Yes, September 20. There were very few cases of B117 initially, and it’s one of hundreds of different variants. So there was no reason to be concerned about it initially. We are learning all the time about which mutations might be important, particularly when they crop up all around the world. So the first time the UK variant was in the database, you probably wouldn’t give it a second thought. It’s only once you start to learn about what the mutations really mean, or when an event occurs, that you start to zoom in on specific variants. And with B117, Public Health England noticed that there was a surge in cases in Kent, which was odd because there was a lockdown and there weren’t any surges elsewhere. That was a striking observation.
That could have been due to human behaviour, such as a super-spreader event. It was at that point, towards the beginning of December, that it became clear that there was not only a surge in cases, but that those cases were caused by B117. It had a really striking genome in that it had 23 mutations, which was far more than we were used to seeing. That’s when researchers began to find evidence that it was more transmissible. And it took a bit longer to do the essential science so that we could be certain that this variant was indeed associated with increased transmission.
Q: Why are we suddenly seeing all of these mutations that give the coronavirus an advantage now?
It’s not the first time that we have observed mutations that have given the virus an advantage. At the end of March 2020, we noticed something for the first time in the UK: a mutation in the spike protein called D614G. This wasn’t in the original virus that was first detected in China. But the virus with this mutation rapidly expanded and replaced the other viral lineages circulating at the time.
We talked about this at Sage, the government’s scientific advisory group for emergencies, quite early on. And we calculated that it caused an increase in the R0, which represents the average number of people infected by one infectious individual. So we knew then that this type of event could happen – it was a practice run for more serious variants to come.
The D614G mutation gave the virus a modest increase in transmissibility. But it swept across the world. It’s now present in almost all SARS-CoV-2 viruses.
The next variant to worry people emerged in Denmark and was related to SARS-CoV-2 being transmitted between mink and people – referred to as the “cluster 5 variant”. People were concerned that evolution had been accelerated by passage through mink and had been transmitted back to humans. But only 12 people in Denmark were ever found to have that variant. So that fizzled.
A third worrying variant emerged in Spain in the summer. It seemed to be spreading very quickly around Europe. One possible reason for this was a particular mutation in the spike protein. But over time it became clear that it was being transmitted because people were moving around on their summer holidays. There was no evidence that it was more transmissible.
We also reported to Sage another mutation in the spike protein last October, called N439K. And that change in the spike protein appears to affect the body’s immune response, at least based on laboratory experiments.
So the idea that variants have only just arisen is not the case. We’ve been talking about variants since the early days of the pandemic, which might surprise some people.
Q: Is the original virus from Wuhan still around?
Lineages can expand and then go extinct, so we don’t expect the same lineage to necessarily be around forever. This was shown by work in Wales and in Scotland, where they looked at the lineages in the first wave and then in the second wave.
In the first wave, the lineages were largely imported from Europe. In the summer, as cases fell, most of those original lineages disappeared. Then numerous new lineages were introduced from overseas, which kicked off the second wave. So it’s quite a dynamic process. If particular lineages have a fitness advantage, those are probably the ones circulating at any particular time.
Q: Is there a base type that you compare changes against? And is it the original virus or the current dominant variant?
We compare changes against the original virus sequenced in Wuhan in January 2020 – it’s the reference genome. But it’s quite confusing because different groups use different names and different naming conventions. I hope that the World Health Organization will help us to reach a common international nomenclature.
It worries me that people name variants after where they were first identified. Evolution is not a function of geography, it’s a function of nature. I very much hope that we move away from calling coronaviruses the UK variant, the South African variant or the Brazilian variant. I tend to try and say the variant first detected in South Africa, or whatever, because place names could be quite stigmatising in the longer term.
Q: Is a certain amount of evolutionary selective pressure created when we start to vaccinate lots of people? Or is the greater number of people in which the virus has the opportunity to mutate the greater problem of the two?
At the moment, I think it’s the number of cases that is important because the variant detected in the UK emerged when vaccines weren’t yet being rolled out, but when cases were high. And the same is true in South Africa and Brazil.
Some people have contacted me to say: “Do you think it was the vaccine trials that led variants to emerge?” But compare the relatively small number of people who’ve been in vaccine trials with the very large number of people who have been infected: 100 million. I think the biggest driver of mutations emerging is the number of opportunities the virus has had to mutate.
And people say, “Well, isn’t the vaccine going to drive the emergence of new variants?” It may be one of the pressures, but if you’re in a population where, say, 50% have been infected and have so-called “natural immunity”, then it doesn’t matter how you get the immunity to the virus, the virus will try to find a chink in that armour. In that instance, the pressure comes from naturally acquired infection rather than immunisation. So this problem has been around since long before vaccination.
Q: And of the variants of concern that we know of, which one is the most worrying?
Right now, it’s the variant first detected in South Africa. It has already been reported in 31 countries and identified in 750 sequences so far, although this is probably a gross underestimate because quite a few countries that surround South Africa do not have sequencing capacity at the moment. This variant appears to be more transmissible in South Africa and reduces the effectiveness of our immune response, be that from natural infection or vaccination.
P1 is also on the watch list. This variant, first identified in Brazil, has mutations associated with being more transmissible and with a reduced immune response. If you look at the global spread of P1 though, unlike some of the other variants, I don’t really see it taking hold at the moment. It’s been linked to just nine countries so far.
I’m also looking at what else might be emerging in the coming weeks and months. My particular concern is what new mutations will arise in B117, now that it causes almost all COVID-19 cases in the UK. This new variant is likely to start to develop constellations of different mutations in its descendants. And what I’m watching for is something like E484K, the “escape mutation”, being increasingly found in B117. So far, this has arisen independently several times, including in a cluster of cases in Bristol and south-west England, but the number of cases is low.
Q: How important is what Cog-UK does to the vaccine effort?
Sequencing is absolutely integral to vaccine development. We are going to need to have sequence data.
We’re going to need to keep sequencing for the foreseeable future so that we can adapt our vaccines to keep them effective. It’s going to be a long-term job to run the two in parallel. Vaccine manufacturers are already working to tweak their vaccines for the South Africa variant, for example, to make sure they are going to be effective against it.
There are going to be new variants arising in the future and we’re going to have to adapt our response to these as we go along. Sequencing and vaccine development are key partners. I suspect that this is going to be ongoing throughout my life and beyond. Of concern is that we don’t have global sequencing coverage, so we cannot see new variants wherever in the world they arise.
Q: So it’s going to be like the flu vaccine every year, depending on how long immunity lasts?
Yes, quite similar, but it might be a bit less predictable than a single vaccine booster each year. SARS-CoV-2 could ratchet up its characteristics over time, and the diversity of mutation combinations in different variants could change over time. So it could be more complex than flu.
We also know that immunity wanes over time. So we’re going to have to be thinking about long-term strategies with this virus.