Many human diseases can be traced back to errors in the genome: genetic mutations. Identifying such mutations used to be akin to searching for a tiny needle in a giant haystack, and sequencing large portions of the human genome was once a laborious process that took years. Rapid advances in sequencing technologies have revolutionized genetics, and sequencing a human genome is now much faster. Computational tools allow scientists to compare genetic data from patients to reference sequences, making it far easier to find small errors in the genome's sequence.
These newer sequencing tools rely on a workflow in which the genome is chopped into small pieces, which are then sequenced. Computational tools then put these pieces back together in order, revealing lengthy sequences. But there are drawbacks to this technique. It can miss large deletions that take out big pieces of a genomic sequence, and it's not good at sequencing highly repetitive regions. This may help explain why it's been difficult to pinpoint genetic causes for diseases that appear to have a hereditary influence, like some cardiac diseases and schizophrenia.
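To see why repetitive regions cause trouble for this chop-and-reassemble workflow, consider a minimal Python sketch. Everything in it is a toy assumption made for illustration: the 31-letter `TOY_REFERENCE` sequence, the read lengths, and the helper names are hypothetical, and real aligners are far more sophisticated than an exact string search. The sketch chops a sequence into short "reads" and reports the ones that match the reference in more than one place, which is exactly where a computer cannot tell where a piece belongs.

```python
def chop_into_reads(genome: str, read_len: int) -> list[str]:
    """Return every overlapping substring of length read_len (a toy stand-in for short reads)."""
    return [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]


def find_all(reference: str, read: str) -> list[int]:
    """Return every start position at which the read matches the reference exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)  # step by 1 so overlapping matches are found
    return positions


def ambiguous_reads(reference: str, read_len: int) -> dict[str, list[int]]:
    """Map each read that fits in more than one place to all of its matching positions."""
    placements = {read: find_all(reference, read)
                  for read in set(chop_into_reads(reference, read_len))}
    return {read: hits for read, hits in placements.items() if len(hits) > 1}


# Hypothetical toy sequence: unique flanks around six copies of the repeat unit "CAG".
TOY_REFERENCE = "GATTACA" + "CAG" * 6 + "TTGACC"

if __name__ == "__main__":
    for read_len in (5, 22):
        ambiguous = ambiguous_reads(TOY_REFERENCE, read_len)
        print(f"read length {read_len}: {len(ambiguous)} ambiguous read(s)")
```

In this toy case, reads shorter than the repeat can land in several places, while reads long enough to span the whole repeat plus some of its unique flanking sequence can only fit in one.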
A new tool created by scientists at the National Institute of Standards and Technology (NIST) addresses this problem: the method enables researchers to determine how well they are finding large genomic insertions or deletions. The work has been reported in Nature Biotechnology.
The Genome in a Bottle Consortium (GIAB), led by NIST, created benchmarks for detecting deletions and insertions using data from seven individuals. GIAB aims to create reference standards, data, and methods to help move genomic sequencing from the bench to the bedside.
"Just like a company making rulers could compare their ruler to a standard measuring stick to make sure it is measuring the correct distance, clinical laboratories doing DNA sequencing can measure NIST reference material DNA and compare their answer to this new benchmark to help make sure they measure large insertions and deletions well," said NIST biomedical engineer Justin Zook.
Small deletions and insertions have been relatively easy to find for many years. Bigger deletions are harder to detect because "the most widely used sequencing technologies output relatively short strings of genetic code, making it hard to reconstruct what's happening," explained Zook.
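The idea of checking a laboratory's results against a benchmark can be sketched in a few lines of Python. This is an illustration only, not the consortium's actual method: the `Variant` record, the matching tolerances (`max_dist`, `min_size_ratio`), and all of the example variants are hypothetical values invented for the sketch. It counts how many benchmark insertions and deletions a lab found (recall) and how many of the lab's calls the benchmark supports (precision).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    chrom: str  # chromosome name
    pos: int    # approximate start position
    kind: str   # "INS" for insertion, "DEL" for deletion
    size: int   # length in base pairs, assumed to be at least 1


def matches(call: Variant, truth: Variant,
            max_dist: int = 500, min_size_ratio: float = 0.7) -> bool:
    """Treat two variants as the same event if type, location, and size roughly agree.

    The tolerances are arbitrary assumptions chosen for this sketch.
    """
    if call.chrom != truth.chrom or call.kind != truth.kind:
        return False
    if abs(call.pos - truth.pos) > max_dist:
        return False
    smaller, larger = sorted((call.size, truth.size))
    return smaller / larger >= min_size_ratio


def score(calls: list[Variant], benchmark: list[Variant]) -> tuple[float, float]:
    """Return (recall, precision) of a call set measured against a benchmark set."""
    found = [t for t in benchmark if any(matches(c, t) for c in calls)]
    supported = [c for c in calls if any(matches(c, t) for t in benchmark)]
    recall = len(found) / len(benchmark) if benchmark else 0.0
    precision = len(supported) / len(calls) if calls else 0.0
    return recall, precision


# Made-up example data: four benchmark variants and three calls from a lab.
BENCHMARK = [
    Variant("chr1", 10_000, "DEL", 300),
    Variant("chr1", 50_000, "INS", 1_200),
    Variant("chr2", 7_500, "DEL", 5_000),
    Variant("chr2", 90_000, "INS", 450),
]
LAB_CALLS = [
    Variant("chr1", 10_040, "DEL", 310),
    Variant("chr2", 7_450, "DEL", 4_800),
    Variant("chr3", 1_000, "DEL", 800),
]

if __name__ == "__main__":
    recall, precision = score(LAB_CALLS, BENCHMARK)
    print(f"recall={recall:.2f} precision={precision:.2f}")
```

Low recall would suggest that a lab's pipeline is missing real insertions and deletions, while low precision would suggest that it is reporting events that are not actually there.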
The genome can be thought of as a book. "DNA sequencing is like shredding the book into smaller pieces and then trying to find any differences between the book that was shredded and a similar book, perhaps the same book before it went through editorial revisions," said Zook.
New tools have made it possible to identify bigger insertions and deletions even with small bits of DNA.
This standard will help reduce errors in a critical setting, the clinic, and may make it easier for clinicians to look to the genome to find the right treatment.
Sources: AAAS/EurekAlert! via National Institute of Standards and Technology, Nature Biotechnology