The human genome, fully sequenced at last!
If you follow genetics, you are probably thinking this is old news, since the first draft was announced back in 2000. But that draft, and every version since, still had missing parts. Now, thanks to new sequencing technologies, the human genome is completely sequenced.
In the journal Science [1], six papers published back-to-back explain how this was achieved. The authors assembled a complete sequence of about 3.055 billion base pairs, of which nearly 200 million were sequenced for the first time. Within that new sequence they identified nearly 2,000 candidate genes, 99 of which are predicted to code for proteins.
Previous versions of the human genome lacked information on around 8% of it. The missing portions correspond to highly complex genomic regions, mostly made up of highly repetitive sequences and/or regions with high rates of duplication.
Until now, it was technically impossible to sequence such regions accurately. But new technologies, namely Oxford Nanopore ultra-long-read sequencing and PacBio HiFi sequencing, make it possible to “read” long stretches of DNA with high accuracy.
Much like the initial version of the genome, the new reference genome, T2T-CHM13, was produced by a consortium: the Telomere-to-Telomere (T2T) Consortium, dedicated to finally mapping each chromosome from one telomere to the other. The assembly is freely available at the UCSC Genome Browser. The consortium is now working on sequencing a second full human genome to allow a better understanding of human genetic diversity.
The human genome is not a blueprint valid for each and every one of us, but completing it is another step on the road to understanding the complexities of our genome, its function, and disease.
References
- Nurk, S. et al. (2022) The complete sequence of a human genome. Science. doi: 10.1126/science.abj6987