MIT Researchers Leverage AI to Uncover Promising Class of Antibiotic Candidates
Using a type of artificial intelligence known as deep learning, MIT researchers have discovered a class of compounds that can kill a drug-resistant bacterium that causes more than 10,000 deaths in the United States every year.
In a study appearing today in Nature, the researchers showed that these compounds could kill methicillin-resistant Staphylococcus aureus (MRSA) grown in a lab dish and in two mouse models of MRSA infection. The compounds also show very low toxicity against human cells, making them particularly good drug candidates.
A key innovation of the new study is that the researchers were also able to figure out what kinds of information the deep-learning model was using to make its antibiotic potency predictions. This knowledge could help researchers to design additional drugs that might work even better than the ones identified by the model.
“The insight here was that we could see what was being learned by the models to make their predictions that certain molecules would make for good antibiotics. Our work provides a framework that is time-efficient, resource-efficient, and mechanistically insightful, from a chemical-structure standpoint, in ways that we haven’t had to date,” says James Collins, the Termeer Professor of Medical Engineering and Science in MIT’s Institute for Medical Engineering and Science (IMES) and Department of Biological Engineering.
Felix Wong, a postdoc at IMES and the Broad Institute of MIT and Harvard, and Erica Zheng, a former Harvard Medical School graduate student who was advised by Collins, are the lead authors of the study, which is part of the Antibiotics-AI Project at MIT. The mission of this project, led by Collins, is to discover new classes of antibiotics against seven types of deadly bacteria, over seven years.
Explainable predictions
MRSA, which infects more than 80,000 people in the United States every year, often causes skin infections or pneumonia. Severe cases can lead to sepsis, a potentially fatal bloodstream infection.
Over the past several years, Collins and his colleagues in MIT’s Abdul Latif Jameel Clinic for Machine Learning in Health (Jameel Clinic) have begun using deep learning to try to find new antibiotics. Their work has yielded potential drugs against Acinetobacter baumannii, a bacterium that is often found in hospitals, and many other drug-resistant bacteria.
These compounds were identified using deep learning models that learn which chemical structures are associated with antimicrobial activity. The models then sift through millions of other compounds, predicting which ones may have strong antimicrobial activity.
These types of searches have proven fruitful, but one limitation of this approach is that the models are “black boxes,” meaning that there is no way of knowing which features the models based their predictions on. If scientists knew how the models were making their predictions, it could be easier for them to identify or design additional antibiotics.
“What we set out to do in this study was to open the black box,” Wong says. “These models consist of very large numbers of calculations that mimic neural connections, and no one really knows what’s going on underneath the hood.”
First, the researchers trained a deep learning model using substantially expanded datasets. They generated this training data by testing about 39,000 compounds for antibiotic activity against MRSA, and then fed this data, plus information on the chemical structures of the compounds, into the model.
“You can represent basically any molecule as a chemical structure, and also you tell the model if that chemical structure is antibacterial or not,” Wong says. “The model is trained on many examples like this. If you then give it any new molecule, a new arrangement of atoms and bonds, it can tell you a probability that that compound is predicted to be antibacterial.”
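The training setup Wong describes, labeled structures in and a probability out, can be illustrated with a deliberately small sketch. It is not the study's model: the article does not give the architecture, so this swaps in plain logistic regression, and the three binary "substructure present" features and all labels are made-up toy data.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(features, labels, epochs=2000, lr=0.5):
    """Fit logistic regression with per-example gradient steps."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the score
            bias -= lr * err
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights, bias


def predict_proba(model, x):
    """Return the predicted probability that a molecule is antibacterial."""
    weights, bias = model
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical toy data: each row flags three invented substructures (1 = present).
TRAIN_X = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
]
# Hypothetical labels: 1 = antibacterial in the assay, 0 = inactive.
TRAIN_Y = [1, 1, 0, 0, 1, 0]

model = train(TRAIN_X, TRAIN_Y)

# A new arrangement of features the model has not seen during training.
new_molecule = [1, 1, 1]
print(f"predicted probability of activity: {predict_proba(model, new_molecule):.2f}")
```

In practice the input would be a learned or computed representation of atoms and bonds rather than three hand-picked flags, but the contract is the same: structure in, probability out.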
To figure out how the model was making its predictions, the researchers adapted an algorithm known as Monte Carlo tree search, which has been used to help make other deep learning models, such as AlphaGo, more explainable. This search algorithm allows the model to generate not only an estimate of each molecule’s antimicrobial activity, but also a prediction for which substructures of the molecule likely account for that activity.
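The idea of attributing a prediction to a substructure can be shown with a much simpler stand-in than the algorithm the researchers adapted. The sketch below uses greedy pruning, not Monte Carlo tree search: it repeatedly drops the piece of a molecule whose removal hurts the predicted score least, and stops when any further removal would push the score below a threshold. The scoring function, substructure names, and threshold are all hypothetical.

```python
def toy_score(substructures):
    """Hypothetical stand-in for a trained model's predicted activity."""
    score = 0.05
    if "ring_A" in substructures:
        score += 0.6
    if "chloro" in substructures:
        score += 0.25
    if "methyl" in substructures:
        score += 0.02
    return score


def greedy_rationale(substructures, score_fn, threshold):
    """Shrink a molecule to a small set of pieces that still scores >= threshold.

    Returns None if the full molecule does not reach the threshold.
    Greedy pruning follows a single path and may miss a smaller rationale.
    """
    kept = set(substructures)
    if score_fn(kept) < threshold:
        return None
    while True:
        best = None  # (piece to drop, score after dropping it)
        for piece in sorted(kept):
            trial_score = score_fn(kept - {piece})
            if trial_score >= threshold and (best is None or trial_score > best[1]):
                best = (piece, trial_score)
        if best is None:
            return kept  # every remaining piece is needed to stay above threshold
        kept.remove(best[0])


molecule = {"ring_A", "chloro", "methyl", "ester"}
rationale = greedy_rationale(molecule, toy_score, threshold=0.8)
print(sorted(rationale))
```

Here the search keeps only the two pieces that carry most of the toy score. A tree search differs mainly in that it explores many alternative pruning sequences instead of committing to one.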
Potent activity
To further narrow down the pool of candidate drugs, the researchers trained three additional deep learning models to predict whether the compounds were toxic to three different types of human cells. By combining this information with the predictions of antimicrobial activity, the researchers discovered compounds that could kill microbes while having minimal adverse effects on the human body.
Using this collection of models, the researchers screened about 12 million compounds, all of which are commercially available. From this collection, the models identified compounds from five different classes, based on chemical substructures within the molecules, that were predicted to be active against MRSA.
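Combining the activity and toxicity predictions amounts to a filter over the screened library. The following is a minimal sketch under stated assumptions: the cutoffs (`min_active`, `max_toxic`), the compound IDs, and every probability are invented for illustration, and the article does not say how the study actually combined or thresholded its models.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    compound_id: str
    p_active: float                      # predicted probability of anti-MRSA activity
    p_toxic: Tuple[float, float, float]  # predicted toxicity for three human cell types


def select_candidates(
    preds: List[Prediction],
    min_active: float = 0.4,  # hypothetical cutoff
    max_toxic: float = 0.2,   # hypothetical cutoff
) -> List[Prediction]:
    """Keep compounds predicted active and non-toxic to all three cell types."""
    hits = [
        p for p in preds
        if p.p_active >= min_active and max(p.p_toxic) <= max_toxic
    ]
    # Rank the survivors by predicted activity, strongest first.
    return sorted(hits, key=lambda p: p.p_active, reverse=True)


# Hypothetical screening output for four compounds.
library = [
    Prediction("cmpd-001", 0.91, (0.05, 0.10, 0.08)),
    Prediction("cmpd-002", 0.85, (0.60, 0.10, 0.05)),  # toxic to one cell type
    Prediction("cmpd-003", 0.12, (0.01, 0.02, 0.01)),  # predicted inactive
    Prediction("cmpd-004", 0.55, (0.15, 0.18, 0.19)),
]

for hit in select_candidates(library):
    print(hit.compound_id, hit.p_active)
```

Taking the maximum over the three toxicity predictions means a compound is rejected if any one cell type is predicted to be harmed, which is the conservative reading of "minimal adverse effects."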
The researchers purchased about 280 compounds and tested them against MRSA grown in a lab dish, allowing them to identify two, from the same class, that appeared to be very promising antibiotic candidates. In tests in two mouse models, one of MRSA skin infection and one of MRSA systemic infection, each of those compounds reduced the MRSA population by a factor of 10.
Experiments revealed that the compounds appear to kill bacteria by disrupting their ability to maintain an electrochemical gradient across their cell membranes. This gradient is needed for many critical cell functions, including the ability to produce ATP (molecules that cells use to store energy). An antibiotic candidate that Collins’ lab discovered in 2020, halicin, appears to work by a similar mechanism but is specific to Gram-negative bacteria (bacteria with thin cell walls). MRSA is a Gram-positive bacterium, with thicker cell walls.
“We have pretty strong evidence that this new structural class is active against Gram-positive pathogens by selectively dissipating the proton motive force in bacteria,” Wong says. “The molecules are attacking bacterial cell membranes selectively, in a way that does not incur substantial damage in human cell membranes. Our substantially augmented deep learning approach allowed us to predict this new structural class of antibiotics and enabled the finding that it is not toxic against human cells.”
The researchers have shared their findings with Phare Bio, a nonprofit started by Collins and others as part of the Antibiotics-AI Project. The nonprofit now plans to do more detailed analysis of the chemical properties and potential clinical use of these compounds. Meanwhile, Collins’ lab is working on designing additional drug candidates based on the findings of the new study, as well as using the models to seek compounds that can kill other types of bacteria.
“We are already leveraging similar approaches based on chemical substructures to design compounds de novo, and of course, we can readily adopt this approach out of the box to discover new classes of antibiotics against different pathogens,” Wong says.
In addition to MIT, Harvard, and the Broad Institute, the paper’s contributing institutions are Integrated Biosciences, Inc., the Wyss Institute for Biologically Inspired Engineering, and the Leibniz Institute of Polymer Research in Dresden, Germany. The research was funded by the James S. McDonnell Foundation, the U.S. National Institute of Allergy and Infectious Diseases, the Swiss National Science Foundation, the Banting Fellowships Program, the Volkswagen Foundation, the Defense Threat Reduction Agency, the U.S. National Institutes of Health, and the Broad Institute. The Antibiotics-AI Project is funded by the Audacious Project, Flu Lab, the Sea Grape Foundation, the Wyss Foundation, and an anonymous donor.