Quirks & Quarks·Q&A

Scientists sequence complete, gap-free human genome for the first time

First full sequence of the roughly 3 billion chemical units, called bases, that make up our DNA.

A digital representation of the human genome. Each colour represents one of the four chemical components of DNA. (Mario Tama/Getty Images)

When the human genome was first sequenced over 20 years ago, it was a huge scientific feat, often compared to the significance of the discovery of penicillin, or putting a man on the moon.

But that genome unveiled back then was only about 92% complete. There were significant gaps where researchers hadn't been able to reconstruct some of the more complex parts of our DNA.

This week, after years of tireless work, an international team of over 100 researchers has announced in the journal Science that they've produced a complete, gap-free sequence of the roughly 3 billion chemical units, called bases, that make up our DNA.

Karen Miga is an assistant professor at the University of California in Santa Cruz, and co-lead of the Telomere-to-Telomere consortium behind the work. Here is part of her conversation with Quirks & Quarks host Bob McDonald.

Congratulations! This must be an exciting time for you to produce a new edition of the genome. But I'm curious: what was missing from that first draft of the human genome?

Well, you're absolutely right. The original Human Genome Project mapped only about 92% of the human genome sequence. The remaining sequences were complex in nature. What we're talking about are really large, persistent gaps in places that are fundamentally important for cellular processes, or how our cells work.

When we talk about that 8% of the genome, what does it mean in terms of what you were missing?

Right. So that's all of these highly repetitive, complex structures that were organized in areas of our genome that are important for life. For example, centromeres, which were millions of bases that had been missing from our map, are critical for the genome. They're responsible for faithful chromosome transmission, every single time our cells divide. So if you have a rearrangement or something wrong in your genome in these regions, it could lead to errors that could contribute to our understanding of cancers, birth defects and other human health outcomes. So these are important regions of our genome.

We just didn't include them previously because they were hard. They're repeats [of genetic sequences]. "Repeats" doesn't necessarily mean they're nonfunctional. They're functional repeats. In this case, it just means that there's more than one copy in the genome, and that made it quite difficult for researchers in the past to assign each copy, with confidence, to its precise spot and order.

Oh, I see. So something repeats, but you don't know how many times it repeats. 

Exactly. People often compare this challenge to putting a puzzle together. Everyone's kind of familiar with these puzzles where it's easy in one area where you have a lot of information. But when you get to the blue sky, or to the region of the puzzle where there's not a lot of context to put the pieces together, it becomes much harder. These are those blue sky pieces that are hard.

Well, if only 8% of the genome was missing, why did it take 20 years to fill that gap? 

Well, I want to stress 8% is still about 200 million bases. That's almost the size of our biggest chromosome. So it's a tremendous step. The reason these were so challenging is because of their complex nature. It really was the technology of today that gave us this opportunity. We really saw a boom in genomic technologies that were cheaper and more economical. It's really the technology of these sequencing companies that stepped up and gave the researchers this type of advantage.

What was it like for you? What did it feel like to get to 100%? 

Oh, it's been a tremendous joy. This has been something I've dreamed about since I started my career. Actually, I started as a graduate student thinking, why don't we know this? How are people okay with this? There's all these bases that are missing. And these are really important parts of our genome, important parts to help us understand, like you said, our map, who we are, our inheritance.

And I felt like it would be such an honour to be part of that legacy. And I stand with that statement. I think it's standing on the shoulders of giants once again, that there's really been a legacy here at the Human Genome Project, and I'm really pleased to be part of it at this moment. 

Karen Miga, assistant professor of biomolecular engineering at UC Santa Cruz, co-founded the Telomere-to-Telomere (T2T) consortium to pursue a complete, gapless assembly of a human genome sequence. (Carolyn Lagattuta/UCSC)

Are there plans to continue the work on the human genome? 

I think that what we're celebrating now is a technological accomplishment, and it should be stressed that this is a complete human genome, but it will soon be one of many. So I think that there's two dimensions when thinking about human reference genomes. One is that we aim for this new level of completeness. We can and we should. But the second level is that one genome or one haplotype that's from European ancestry can't fully represent the genomic diversity that we know we have in our species and globally.

And so it's really important that we broaden our reference genome to represent this type of rich genomic diversity that exists around the world. And that's happening. There is a call to action to improve the reference genome for humans to try to represent genomic diversity across at least 350 individuals. And in this case, it's a tremendous effort that our team's leaning into because we're trying to ensure that these genomes are all complete and present a complete picture of what human genomes look like.

Q&A edited for length and clarity. Produced by Amanda Buckiewicz and Jim Lebans.

