Human Genome Fully Mapped: Two Decades in the Making
UC Santa Cruz
In 2003, the Human Genome Project (“HGP”) announced that it had sequenced 92 per cent of a human genome, revolutionizing our understanding of genetics and human health. Now, two decades later, the HGP has announced that it has filled in the gaps, sequencing an entire human genome.
A genome is a complete set of an organism’s DNA. Genome sequencing is the process of determining the complete sequence of nucleotides in a genome. It’s akin to chopping up the world’s largest book into individual letters, and then rearranging them into the correct order.
The Human Genome Project is a consortium of international scientists from 20 different institutions, with a shared goal of mapping and better understanding human genes. When it made its first announcement in 2003, the HGP had left about 200 million of the roughly three billion DNA letters unsequenced. That last eight per cent, which took scientists 20 years to complete, consisted of highly repetitive sections, often called “junk DNA.” Parts of a genome with a high degree of variation are simpler to sequence because it is easier to tell where each piece fits. In a highly repetitive section, which might seem easier to sequence, every piece looks almost identical, so determining where each piece belongs is a difficult task. These sections are also some of the most important parts of the genome, containing most of the variation between individual humans. These variations could provide clues as to how human ancestors underwent evolutionary changes.
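The placement problem with repetitive DNA can be illustrated with a toy example. The sketch below is not the consortium's method; the sequence, the reads, and the `placements` helper are all made up for illustration. It simply counts how many positions a short read could occupy in a varied stretch versus a repetitive one.

```python
def placements(genome: str, read: str) -> list[int]:
    """Return every start index where `read` matches `genome` exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        # Step forward by one letter so overlapping matches are also found.
        start = genome.find(read, start + 1)
    return positions


# Hypothetical toy sequence: a varied stretch followed by a simple repeat.
toy_genome = "ACGTTGCA" + "GATTACA" + "AT" * 10

varied_read = "GATTACA"  # taken from the varied stretch
repeat_read = "ATAT"     # taken from the repetitive stretch

varied_hits = placements(toy_genome, varied_read)
repeat_hits = placements(toy_genome, repeat_read)

print(f"Varied read fits at {len(varied_hits)} position(s): {varied_hits}")
print(f"Repeat read fits at {len(repeat_hits)} position(s): {repeat_hits}")
```

The varied read has a single possible home, while the repeat read fits in several places equally well. A longer read that spans the whole repeat and reaches the unique letters on either side removes that ambiguity.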
To get around this obstacle, scientists employed two technologies that did not exist during their initial sequencing endeavours. The first is the Oxford Nanopore DNA sequencing method, which allowed for “ultralong” reads of nucleotides (as many as one million DNA letters per read). However, these very long reads also came with an increased error rate. To patch any mistakes, the scientists used the PacBio HiFi DNA sequencing method, which at its maximum output could read 20,000 DNA letters at a time with an error rate of only 0.1 per cent.
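To put the quoted PacBio HiFi figures in perspective, the short calculation below works out how many mistaken letters a maximum-length read would be expected to contain. It uses only the two numbers given above; the variable names are illustrative, and real error rates vary from read to read.

```python
from fractions import Fraction

# Figures quoted in the article for PacBio HiFi reads.
read_length = 20_000              # DNA letters in one maximum-length read
error_rate = Fraction(1, 1000)    # 0.1 per cent, kept exact as a fraction

# Expected number of wrong letters in a single read.
expected_errors = read_length * error_rate

print(f"Expected errors per {read_length:,}-letter read: {expected_errors}")
```

In other words, an error rate of 0.1 per cent still means roughly 20 wrong letters in a full-length read.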
With a completed human genome at their disposal, scientists will be able to expand their understanding of human evolution and pave the way for future medical breakthroughs. They will also be able to analyze genetic variation among individuals compared to the base genome. These comparisons could be used to investigate links between those variations and diseases. “These parts of the human genome that we haven’t been able to study for 20-plus years are important to our understanding of how the genome works, genetic diseases, and human diversity and evolution,” said Dr. Karen Miga, a scientist at the University of California (“UC”) and leader of the consortium heading the project, in a UC press release.
The total cost of the HGP is $450 million, with the most recent research endeavour costing a couple of million dollars. Adam Phillippy, head of gene informatics at the National Human Genome Research Institute, hopes that in a decade’s time, sequencing an individual’s genome can become everyday practice, costing less than $1,000.
The researchers’ next step will be to create a “human pan-genome reference,” a compilation of genomic information from around the world. “Because we spent all of the hard work at the outset, getting this one complete and correct, we can now start to layer on these additional genomes on top of it, and do a so-called pan-genome representation that will have this as a basis but then have all of the variation kind of branching off of it,” said Phillippy in an interview with the BBC.
The goal of the Human Pangenome team is to sequence the genomes of 350 people from unique and diverse ancestral backgrounds.