Our Science

Sequencing all life, explained in numbers

Darwin Tree of Life is one of the most ambitious projects in biology right now – blazing a trail in a global quest to sequence the genomes of all eukaryotic species on Earth. Watch the video below, produced by the Sanger Institute, to see what it means to the scientists who work on the project.

Or if numbers are more your thing, we’ve got a helpful breakdown below.

70,000 species

The number of eukaryotic species (organisms with a nucleus in their cells) that we are aiming to sequence across Britain and Ireland. That’s all the known eukaryotic species living wild on these islands, including: 41,600 animal species, 8,000 plants, 18,500 fungi, and 4,300 single-celled protists.
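For readers who like to check the arithmetic, here is a minimal Python sketch that adds up the group figures quoted above. The dictionary and variable names are illustrative only and are not part of any project tooling. The rounded group counts sum to 72,400, so the 70,000 headline is best read as an approximate figure.

```python
# Rounded group counts as quoted in the paragraph above.
species_by_group = {
    "animals": 41_600,
    "plants": 8_000,
    "fungi": 18_500,
    "protists": 4_300,
}

# Sum of the four quoted groups.
total = sum(species_by_group.values())

# Print each group with its share of the summed total.
for group, count in species_by_group.items():
    share = 100 * count / total
    print(f"{group:<9}{count:>7,}  {share:4.1f}%")

print(f"{'total':<9}{total:>7,}")
```

Animals make up more than half of the listed species, with fungi the next largest group.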

10 partners

The number of institutions working on the Darwin Tree of Life project. This includes organisations interested in biodiversity: the Natural History Museum in London, not one but two Royal Botanic Gardens at Edinburgh and Kew, the Marine Biological Association based in Plymouth, and the University of Oxford’s Wytham Woods genomic observatory and Protist Group. It also includes institutes involved in genomic sequencing: the Earlham Institute, the Universities of Edinburgh and Cambridge, and the Sanger Institute – where the bulk of full-genome sequencing will occur. And finally, several of the partners are involved in analysis, with special mention for EMBL-EBI.

Canada geese, Branta canadensis, on the Wellcome Genome Campus (Image: Luke Lythgoe, Wellcome Sanger Institute)

1.2 million species

The number of named species on Earth – the real total might be up to 10 million! The Darwin Tree of Life project is part of a wider global project to sequence the genomes of all life, known as the Earth BioGenome Project (EBP). Around 50 different projects are affiliated with EBP, each with a different focus for the species they are sequencing. Some, like ours, are biogeographical, focusing on a particular region. Some are taxonomic, for example focusing on bats or fish. Others look at specific forms of evolution, for example symbiosis in marine organisms.

40%

The proportion of taxonomic families worldwide that are also found in Britain and Ireland. Within the “tree of life”, families are groups of relatively closely related organisms – the rank just above the genus and species levels. That means our project on this small archipelago can offer a genomic snapshot of a pretty wide range of life on Earth.

False puffball, Reticularia lycoperdon, on Bodmin Moor, Cornwall (Image: Luke Lythgoe, Wellcome Sanger Institute)

10 years

The number of years the project has set itself to reach its 70,000-species goal, kicking off in 2019 and winding down around the end of the decade.

3 years

The number of years for phase one of the project (2019-2022). In this phase we aim to collect specimens for 8,000 species, including – wherever possible – one for each of the approximately 4,200 taxonomic families. We intend to sequence and release at least 2,000 high-quality genomes by the end of 2022.

Ornate-tailed digger wasp, Cerceris rybyensis (Image: Liam Crowley, University of Oxford)

240 genomes sequenced

The number of genomes sequenced, released and freely accessible to the global scientific community at time of writing (February 2022). Of these, 70 have been published as short scientific papers that we call “Genome Notes”. The first phase of this project has been particularly exciting, focusing on the dizzying logistics of this epic undertaking. Much has now been tested, implemented and shared with global partners – meaning 2022 will see a significant ramping up of the number of genomes we can sequence.
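To put those figures in context, the short Python sketch below compares the February 2022 snapshot with the targets stated elsewhere in this article (at least 2,000 genomes by the end of 2022, and 70,000 species overall). The variable and function names are illustrative, and the percentages are simple ratios of the quoted numbers rather than official project metrics.

```python
# Figures quoted in this article (snapshot from February 2022).
genomes_released = 240
genome_notes_published = 70
phase_one_genome_target = 2_000   # "at least 2,000 high-quality genomes"
overall_species_goal = 70_000     # headline goal for Britain and Ireland


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Share of the phase one genome target already released.
phase_one_progress = percent(genomes_released, phase_one_genome_target)

# Share of released genomes that also have a published Genome Note.
notes_share = percent(genome_notes_published, genomes_released)

# Share of the overall species goal covered so far.
overall_progress = percent(genomes_released, overall_species_goal)

print(f"Phase one target reached: {phase_one_progress:.1f}%")
print(f"Released genomes with a Genome Note: {notes_share:.1f}%")
print(f"Overall goal reached: {overall_progress:.2f}%")
```

On these numbers, the released genomes amount to about an eighth of the phase one target, which is why a significant ramp-up is needed.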

Unknown

The number of applications these genomes will have for researchers across countless scientific fields. We are creating a comprehensive and open genomic library of life, and expect this to transform the way we do biology. That may be for ecologists and conservationists, looking to stem the tide of the sixth global mass extinction. They can look to genomics when asking why some species succumb to climate change or pollution whereas others adapt and thrive. It may be for pharmacologists seeking new drugs in the venoms or mucus of under-studied marine organisms. Industry could find new chemical compounds, helping us create new sustainable fuels or plastic substitutes. Agriculture may benefit from more climate-resilient crops.

Common poppy, Papaver rhoeas, in Cambridgeshire (Image: Luke Lythgoe, Wellcome Sanger Institute)

We won’t know all the genomic treasures of the natural world until we have sequenced everything – and that is what the Darwin Tree of Life team have set out to do!