Researchers at the University of Toronto have developed a new way to read the human genome, which could answer stubborn questions about how flaws in DNA lead to disease.
The team, led by Prof. Brendan Frey, has used "deep learning" computer technology to read the three billion characters that represent the genome, which was first sequenced in 2003. The computers then determine which proteins, the building blocks of cells, will be produced by that DNA.
It is thought to be the first application of deep learning to genetics. The study, by the almost entirely Canadian team of researchers, appeared Thursday in the journal Science.
"What I wanted to do was connect the dots and figure out how the genome relates to biochemistry and what's going on in the cell," Frey told CBC News.
Researchers had once hoped that sequencing the genome would answer many of the same questions. But making sense of the string of three billion characters, and how their arrangements were reflected in living cells, was harder than expected.
Frey said advances in computing and other scientific developments are helping to find answers.
Deep learning essentially "teaches" the computer system to read DNA, much as children learn to read by matching words and pictures.
"The system finds little words … that instruct the cell's machinery to generate certain molecules."
Old approach 'futile'
The new method reverses previous efforts, in which researchers hunted for common genetic traits among patients with particular diseases.
Frey called the old, correlative approach "futile."
If a child does not like a particular book, he said, you don't line up all the books and try to figure out which letters are prevalent.
"You don't say 'Position 103, letter G seems to be common in books he doesn't like,'" Frey said. "That's missing the point. It's the meaning of the book he doesn't like."
The study focused on introns, sections of the genome that contain instructions for cutting and pasting other sections, called exons, which in turn describe how to make proteins and cells.
The new technique will be applied immediately to help patients, Frey said, and has already shed new light on genetic determinants of autism, some cancers and spinal muscular atrophy, a leading genetic cause of infant mortality.
Frey said "striking patterns" emerged when the team looked into autism, revealing 39 genes that have a "potential role" in the disease.