“New antibiotics discovered using AI!”
That’s how headlines read in December 2023, when MIT researchers announced a new class of antibiotics that could wipe out the drug-resistant superbug methicillin-resistant Staphylococcus aureus (MRSA) in mice.
Powered by deep learning, the study was a significant breakthrough. Few new antibiotics have come out since the 1960s, and this one in particular could be crucial in fighting tough-to-treat MRSA, which kills more than 10,000 people annually in the United States.
But as remarkable as the antibiotic discovery was, it may not be the most impactful part of this study.
“Of course, we view the antibiotic-discovery angle to be very important,” said Felix Wong, PhD, a colead author of the study and postdoctoral fellow at the Broad Institute of MIT and Harvard, Cambridge, Massachusetts. “But I think equally important, or maybe even more important, is really our method of opening up the black box.”
In complex machine learning models, the "black box" (the hidden reasoning between a model's input and its output) is generally considered impenetrable, and that poses a challenge in drug discovery.
“A major bottleneck in AI-ML-driven drug discovery is that nobody knows what the heck is going on,” said Dr. Wong. Models have such powerful architectures that their decision-making is mysterious.
Researchers input data, such as patient features, and the model says what drugs might be effective. But researchers have no idea how the model arrived at its predictions — until now.
What the Researchers Did
Dr. Wong and his colleagues first mined 39,000 compounds for antibiotic activity against MRSA. They fed information about the compounds’ chemical structures and antibiotic activity into their machine learning model. With this, they “trained” the model to predict whether a compound is antibacterial.
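To make the training step concrete, here is a minimal sketch in Python. It is not the study's deep learning model: it swaps in a plain logistic regression, and the fragment names, labels, and hyperparameters are all invented for illustration. The only point is the shape of the task, in which structural features and a measured activity label go in and a predicted probability of activity comes out.

```python
import math

# Hypothetical toy data: each compound is a set of made-up fragment labels,
# paired with 1 (inhibits bacterial growth) or 0 (does not).
# These are illustrative assumptions, not real screening results.
TRAINING_SET = [
    ({"ring_a", "chloro", "amine"}, 1),
    ({"ring_a", "chloro"}, 1),
    ({"ring_a", "amine"}, 0),
    ({"ring_b", "amine"}, 0),
    ({"ring_b", "chloro", "hydroxyl"}, 1),
    ({"ring_b", "hydroxyl"}, 0),
]


def predict(weights, bias, fragments):
    """Return the predicted probability that a compound is antibacterial."""
    z = bias + sum(weights.get(f, 0.0) for f in fragments)
    return 1.0 / (1.0 + math.exp(-z))


def train(data, epochs=2000, lr=0.1):
    """Fit a logistic regression with simple per-example gradient updates."""
    weights = {}
    bias = 0.0
    for _ in range(epochs):
        for fragments, label in data:
            error = label - predict(weights, bias, fragments)
            bias += lr * error
            for f in fragments:
                weights[f] = weights.get(f, 0.0) + lr * error
    return weights, bias


weights, bias = train(TRAINING_SET)
print(round(predict(weights, bias, {"ring_a", "chloro"}), 2))
print(round(predict(weights, bias, {"ring_b", "amine"}), 2))
```

In this toy data the "chloro" fragment is what separates active from inactive compounds, so the trained model scores compounds containing it higher. A real model learns from tens of thousands of compounds and far richer structural representations.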
Next, they used additional deep learning models to narrow the field, ruling out compounds predicted to be toxic to humans. Then, deploying these models together, they screened 12 million commercially available compounds. Five classes emerged as likely MRSA fighters. Further testing of 280 compounds from the five classes produced the final results: two compounds from the same class, both of which reduced MRSA infection in mouse models.
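The screening funnel described above can be sketched as a pair of filters applied to every compound in a library. This is a simplified illustration under stated assumptions: the compound records, the cutoff values, and the two stand-in "models" (which just read precomputed scores) are hypothetical, and the article does not report the thresholds the researchers actually used.

```python
from collections import defaultdict

# Hypothetical library: names, classes, and scores are invented for illustration.
LIBRARY = [
    {"name": "cmpd_1", "cls": "class_a", "activity": 0.91, "toxicity": 0.05},
    {"name": "cmpd_2", "cls": "class_a", "activity": 0.88, "toxicity": 0.60},
    {"name": "cmpd_3", "cls": "class_b", "activity": 0.12, "toxicity": 0.02},
    {"name": "cmpd_4", "cls": "class_b", "activity": 0.75, "toxicity": 0.10},
]


def activity_model(compound):
    """Stand-in for a trained antibacterial-activity predictor."""
    return compound["activity"]


def toxicity_model(compound):
    """Stand-in for a trained human-cell toxicity predictor."""
    return compound["toxicity"]


def screen(library, min_activity=0.5, max_toxicity=0.2):
    """Keep compounds predicted active and non-toxic, grouped by class."""
    hits = defaultdict(list)
    for compound in library:
        if activity_model(compound) < min_activity:
            continue  # predicted inactive against the bacterium
        if toxicity_model(compound) > max_toxicity:
            continue  # predicted toxic to human cells
        hits[compound["cls"]].append(compound["name"])
    return dict(hits)


hits = screen(LIBRARY)
print(hits)  # {'class_a': ['cmpd_1'], 'class_b': ['cmpd_4']}
```

Grouping the survivors by structural class mirrors the way the screen surfaced classes of candidates, which were then tested in the lab.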
How did the computer flag these compounds? The researchers sought to answer that question by figuring out which chemical structures the model had been looking for.
A chemical structure can be “pruned”: scientists can remove certain atoms and bonds to reveal an underlying substructure. The MIT researchers used Monte Carlo tree search, a commonly used algorithm in machine learning, to select which atoms and bonds to edit out. Then they fed the pruned substructures into their model to find out which substructure was likely responsible for the antibacterial activity.
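The pruning idea can be illustrated with a small Python sketch. Note the substitution: instead of Monte Carlo tree search, this example uses a much simpler greedy deletion, removing one atom at a time as long as the remaining fragment stays connected and the model's score stays high. The molecule, the motif, the scoring function, and the threshold are all hypothetical.

```python
# Toy molecule as a graph: atom index -> element, plus bonds between indices.
ATOMS = {0: "C", 1: "C", 2: "N", 3: "C", 4: "Cl", 5: "O"}
BONDS = {(0, 1), (1, 2), (2, 3), (3, 4), (3, 5)}

# Hypothetical assumption: the model responds to the N-C-Cl motif (atoms 2, 3, 4).
MOTIF = {2, 3, 4}


def score(atom_subset):
    """Stand-in for the trained model's predicted antibacterial activity."""
    return 0.9 if MOTIF <= atom_subset else 0.1


def is_connected(atom_subset, bonds):
    """Check that the kept atoms still form a single connected fragment."""
    start = next(iter(atom_subset))
    seen, stack = {start}, [start]
    while stack:
        current = stack.pop()
        for a, b in bonds:
            if current not in (a, b):
                continue
            neighbor = b if a == current else a
            if neighbor in atom_subset and neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen == atom_subset


def prune(atoms, bonds, threshold=0.5):
    """Greedily delete atoms while the fragment stays connected and active."""
    kept = set(atoms)
    changed = True
    while changed:
        changed = False
        for atom in sorted(kept):
            trial = kept - {atom}
            if trial and is_connected(trial, bonds) and score(trial) >= threshold:
                kept = trial
                changed = True
                break
    return kept


rationale = prune(ATOMS, BONDS)
print(sorted(rationale))  # [2, 3, 4]
```

Whatever survives the pruning is the smallest fragment found that the model still rates as active. A tree search explores many deletion orders rather than committing to the first one that works, which matters when the scoring model is a real neural network rather than a toy rule.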
“The main idea is we can pinpoint which substructure of a chemical structure is causative instead of just correlated with high antibiotic activity,” Dr. Wong said.
This could fuel new “design-driven” or generative AI approaches where these substructures become “starting points to design entirely unseen, unprecedented antibiotics,” Dr. Wong said. “That’s one of the key efforts that we’ve been working on since the publication of this paper.”
More broadly, their method could lead to discoveries in drug classes beyond antibiotics, such as antivirals and anticancer drugs, according to Dr. Wong.
“This is the first major study that I’ve seen seeking to incorporate explainability into deep learning models in the context of antibiotics,” said César de la Fuente, PhD, an assistant professor at the University of Pennsylvania, Philadelphia, Pennsylvania, whose lab has been engaged in AI for antibiotic discovery for the past 5 years.
“It’s kind of like going into the black box with a magnifying lens and figuring out what is actually happening in there,” Dr. de la Fuente said. “And that will open up possibilities for leveraging those different steps to make better drugs.”