
Algorithm can enhance clustering, aid trial design


 

Study author Chenyue Wendy Hu. Photo courtesy of Jeff Fitlow and Rice University

A newly developed algorithm for “big data” could have a significant impact on clinical trials, according to researchers.

The algorithm, called progeny clustering, was the only method to successfully reveal “clinically meaningful” groupings of proteomic data from patients with acute myeloid leukemia.

The algorithm is also being used in a hospital study to identify the optimal treatment for children with leukemia.

Details on progeny clustering have been published in Scientific Reports.

The authors noted that clustering is important for its ability to reveal information in complex sets of data like medical records.

“Doctors who design clinical trials need to know how to group patients so they receive the most appropriate treatment,” said author Amina Qutub, PhD, of Rice University in Houston, Texas. “First, they need to estimate the optimal number of clusters in their data.”

The more accurate the clusters, the more personalized the treatment can be, Dr Qutub said. She added that separating patients by a single data point would be easy, but separating them by, for example, the types of proteins in their bloodstreams is more difficult.

“That’s the kind of data that’s become prevalent everywhere in biology, and it’s good to have,” Dr Qutub said. “We want to know hundreds of features about a single person. The problem is identifying how to use all that data.”

Progeny clustering provides a way to ensure the number of clusters is as accurate as possible, Dr Qutub said. The algorithm extracts characteristics about patients from a data set, mixing and matching them randomly to create artificial populations—the “progeny” of the parent data. The characteristics appear in roughly the same ratios in the progeny as they do among the parents.

These characteristics, called dimensions, can be anything: as simple as hair color or place of birth, or as detailed as blood cell count or the proteins expressed by tumor cells. For even a small population, each individual may have hundreds or thousands of dimensions.

By creating progeny with the same dimensions as the parent data, the algorithm increases the size of the data set. With this additional data, distinct patterns become more apparent, allowing the algorithm to optimize the number of clusters that warrant attention from doctors and researchers.
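The mix-and-match step described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' published implementation: the function name `make_progeny`, its parameters, and the toy data are assumptions made here, and the sketch covers only the resampling step, not the estimation of the number of clusters.

```python
import numpy as np

def make_progeny(parents, n_progeny, seed=0):
    """Create artificial samples by drawing each dimension independently
    from the values observed among the parents (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    parents = np.asarray(parents)
    n_parents, n_dims = parents.shape
    # For every progeny row and every dimension, choose a random parent
    # to copy that single value from.
    donor_rows = rng.integers(0, n_parents, size=(n_progeny, n_dims))
    # Entry [i, j] of the result is parents[donor_rows[i, j], j].
    return parents[donor_rows, np.arange(n_dims)]

# Toy parent data (assumed): 3 patients, 2 dimensions each.
parents = np.array([[0.0, 10.0],
                    [1.0, 20.0],
                    [0.0, 30.0]])
progeny = make_progeny(parents, n_progeny=1000)
print(progeny.shape)
print((progeny[:, 0] == 0.0).mean())  # close to the parents' share of 2/3
```

Because each value is copied from a randomly chosen parent, every characteristic shows up among the progeny in roughly the same proportion as among the parents, while the combinations of characteristics are new.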

Dr Qutub said this technique is just as reliable as state-of-the-art clustering evaluation algorithms, but at a fraction of the computational cost. In lab tests, progeny clustering compared favorably to other popular methods.

And it was the only method to provide clinically meaningful groupings in an acute myeloid leukemia reverse-phase protein array data set.

Progeny clustering also allows researchers to determine the ideal number of clusters in small populations, Dr Qutub noted.

The algorithm was used to design an ongoing trial involving leukemia patients at Texas Children’s Hospital.

“Progeny clustering allowed them to design a robust clinical trial, even though that trial did not involve a large number of children,” Dr Qutub said. “It meant they didn’t have to wait to enroll more.”

Dr Qutub added that the algorithm could apply to any data set.

“We could just as easily use it for a population of voters to see who should get campaign materials from a candidate,” she said. “Progeny clustering has a lot of possible applications.”

Dr Qutub and her colleagues plan to make the algorithm available for free on her lab’s website.
