A new tool can detect genetic alterations that have proven difficult to identify, according to research published in Nature Methods.
The tool is an algorithm called CONSERTING (Copy Number Segmentation by Regression Tree in Next-Generation Sequencing).
Researchers created CONSERTING to improve their ability to detect copy number alterations (CNAs) in whole-genome sequencing data.
The group showed that CONSERTING could detect CNAs with better accuracy and sensitivity than other techniques, including 4 published algorithms used to recognize CNAs in whole-genome sequencing data.
The data the team analyzed encompassed the normal and tumor genomes from 43 children and adults with leukemia, brain tumors, melanoma, and retinoblastoma.
“CONSERTING helped us identify alterations that other algorithms missed, including previously undetected chromosomal rearrangements and copy number alterations present in a small percentage of tumor cells,” said study author Xiang Chen, PhD, of St. Jude Children’s Research Hospital in Memphis, Tennessee.
“[C]ONSERTING identified copy number alterations in children with 100 times greater precision and 10 times greater precision in adults,” added Jinghui Zhang, PhD, also of St. Jude.
Using CONSERTING, the researchers discovered genetic alterations driving pediatric leukemia, low-grade glioma, glioblastoma, and retinoblastoma.
The algorithm also helped the team identify genetic changes that are present in only a small percentage of a tumor’s cells. These alterations may be the key to understanding why tumors sometimes return after treatment, the researchers said.
Dr Zhang said CONSERTING should make it easier to track the evolution of tumors with complex genetic rearrangements.
St. Jude has made CONSERTING available to researchers free of charge. The software, user manual, and related data can be downloaded from http://www.stjuderesearch.org/site/lab/zhang.
St. Jude researchers have also developed a cloud version of CONSERTING and related tools that can be accessed through Amazon Web Services. Instead of downloading CONSERTING, scientists can upload data for analysis.
Creating CONSERTING
Work on CONSERTING began in 2010, shortly after the St. Jude Children’s Research Hospital-Washington University Pediatric Cancer Genome Project was launched. The Pediatric Cancer Genome Project used next-generation, whole-genome sequencing to study some of the most aggressive and least understood childhood cancers.
Early on in the project, researchers realized that existing analytic methods often missed duplications or deletions of DNA segments, particularly small changes that involve a handful of genes and provide insight into the origins of a patient’s cancer.
CONSERTING has now been used to analyze data for the Pediatric Cancer Genome Project. The project includes the normal and cancer genomes of 700 pediatric cancer patients with 21 different cancer subtypes.
CONSERTING applies a data analysis method called a regression tree, which is a machine-learning algorithm, to data from next-generation, whole-genome sequencing. Machine learning capitalizes on advances in computing to design algorithms that repeatedly and rapidly analyze large, complex sets of data and unearth unexpected insights.
“This combination has provided us with a powerful tool for recognizing copy number alterations, even those present in relatively few cells or in tumor samples that include normal cells along with tumor cells,” Dr Zhang said.
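To give a rough sense of how a regression tree can segment copy number data, here is a minimal Python sketch. It is not CONSERTING's implementation, which the article does not describe in detail. It assumes the input is a list of log2 tumor-to-normal read-depth ratios, one per genomic window, and it recursively picks the split that most reduces squared error. The function names, the `min_size` and `min_gain` thresholds, and the toy data are all hypothetical.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean of `values`."""
    if not values:
        return 0.0
    mean = fmean(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(values, min_size):
    """Return (index, gain) for the split that most reduces squared error.

    Returns (None, 0.0) when no split leaves min_size windows on each side.
    """
    total = sse(values)
    best_idx, best_gain = None, 0.0
    for i in range(min_size, len(values) - min_size + 1):
        gain = total - sse(values[:i]) - sse(values[i:])
        if gain > best_gain:
            best_idx, best_gain = i, gain
    return best_idx, best_gain


def segment(values, start=0, min_size=3, min_gain=0.5):
    """Recursively split windows into segments of roughly constant depth.

    Returns a list of (start, end, mean) tuples; `end` is exclusive.
    The thresholds are illustrative assumptions, not tuned values.
    """
    idx, gain = best_split(values, min_size)
    if idx is None or gain < min_gain:
        return [(start, start + len(values), fmean(values))]
    left = segment(values[:idx], start, min_size, min_gain)
    right = segment(values[idx:], start + idx, min_size, min_gain)
    return left + right


# Toy data (made up): 10 copy-neutral windows, a 6-window loss, 8 neutral windows.
depth = (
    [0.0, 0.1, -0.1, 0.05, -0.05, 0.0, 0.1, -0.1, 0.0, 0.05]
    + [-1.0, -0.9, -1.1, -1.0, -0.95, -1.05]
    + [0.0, 0.1, -0.1, 0.0, 0.05, -0.05, 0.1, 0.0]
)

segments = segment(depth)
for seg_start, seg_end, seg_mean in segments:
    print(f"windows {seg_start}-{seg_end - 1}: mean log2 ratio {seg_mean:.2f}")
```

On this toy input, the sketch places breakpoints at windows 10 and 16, isolating the simulated loss. A real analysis would also need to handle sequencing noise, coverage gaps, and tumor purity, which this example ignores.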
Next-generation, whole-genome sequencing involves breaking the human genome into about 1 billion pieces that are copied and reassembled using the normal genome as a template.
The CONSERTING software compensates for gaps and variations in sequencing data. It integrates the sequencing data with information about chromosomal rearrangements to find CNAs and identify their origins in the genome.