Traditional drug discovery techniques are all about brute force—and a little bit of luck. Basically, large-scale, high-throughput screening is used to cover a search space. The process is a little like conducting antisubmarine warfare without the benefit of sonar. Unsurprisingly, very few of the depth charges (drug candidates) hit their targets and achieve the desired results (successful clinical trials). The seas are simply too vast.
Nothing quite like sonar is available in drug discovery, but something just as helpful is being developed: artificial intelligence (AI). It is already being used to explore the murky stores of data that have been accumulating over the years. Much of this data concerns small-molecule drugs, so such drugs have been the focus of AI-driven discovery. Now, several small-molecule drugs that were created through AI technology are entering clinical trials. (Companies that have helped advance AI-designed drugs to clinical trials include Exscientia, the developer of the Centaur Chemist platform, and Recursion Pharmaceuticals, the developer of the Recursion Operating System.)
Besides enhancing the search for small-molecule drugs, AI-based systems are beginning to assist with the discovery and design of biologics. Whatever kinds of drugs and targets are of interest, AI offers two key advantages: the ability to point in new directions that medicinal chemists might have missed, and the ability to rule out areas where they might have wasted precious time.
Giving DNA-encoded libraries some AI glamour
“What [AI] really is, I think, from an application point of view, is an enabler,” says Noor Shaker, PhD, senior vice president and general manager at X-Chem, a firm that uses DNA-encoded libraries (DELs) to screen small molecules. Shaker used to be the CEO of Glamorous AI, a company that developed an AI-powered, software-as-a-service (SaaS) drug discovery engine.
X-Chem acquired Glamorous AI last October. “Glamorous AI brings cutting-edge solutions to the entire small-molecule drug discovery process,” said Matt Clark, PhD, the CEO of X-Chem. “By combining the data-generating power of our leading DEL platform with Glamorous AI’s capabilities, we will accelerate our partners’ drug discovery programs and get medicines to patients faster.”
Shaker suggests that AI-driven drug discovery is maturing: “We’ve seen AI performing successfully on the easy-to-medium-sized tasks. I think people want to see it applied to more challenging tasks where humans have really struggled. For instance, in the design of novel chemistry, AI has been really helpful in going beyond what we know about chemistry, designing novel chemistry, and just putting new ideas … in front of chemists and drug hunters.”
Shaker says that examples of progress in AI-driven drug discovery include the design of kinase inhibitors by an AI system called Generative Tensorial Reinforcement Learning. The system, which predicted a molecule for a well-known fibrosis target in just 21 days, was developed by a team of researchers led by Insilico Medicine. The system’s code is now publicly available. It is part of a broader trend—the use of Generative Adversarial Networks in drug discovery applications.
Recently, Insilico announced the start of a Phase I clinical trial evaluating ISM001-055, an antifibrotic small-molecule inhibitor generated by the company’s AI-powered drug discovery platform for the treatment of idiopathic pulmonary fibrosis. Insilico indicated that the total time from target discovery program initiation to the start of Phase I took under 30 months.
Second-guessing first principles
In drug discovery, computational drug design is a valuable methodology, but it has its limitations. “To be sure, not all drug discovery is computational,” remarks Andreas Windemuth, PhD, chief innovation officer at Cyclica, a company that refers to itself as a “neo-biotech” that leverages AI and computational biophysics to “reshape” the drug discovery process. “Computational drug design is quite old,” he continues.
According to Windemuth, computational drug discovery has been preoccupied with docking, that is, with seeing where a protein and a small molecule might fit together. “That’s been the core of computational drug discovery so far, without AI, and it’s not very accurate,” he elaborates. “That’s first principles, and molecules do not always behave according to first principles. Well … they do. But there are many things that are missing from the equation.”
“That changed when deep learning came up,” Windemuth declares. Deep learning is at the heart of Cyclica’s MatchMaker technology, which was the subject of a recent blog post on the company’s website.
“With MatchMaker at hand, we were able to replace our reliance on conventional molecular docking in our flagship proteome screening platform Ligand Express,” the blog post detailed. “MatchMaker also plays a critical role in our newly launched Ligand Design technology for multi-objective drug design. Taken together, Ligand Design and Ligand Express, our first-generation off-target profiling platform, offer a unique end-to-end AI-augmented drug discovery platform to design advanced lead-like molecules while minimizing off-target effects.”
Turning specific details into generalizable rules
Molly Gibson, PhD, is the co-founder of Generate Biomedicines, a biotech company that uses a machine learning platform called Generative Biology to expedite the discovery of protein-based drugs. The platform, which leverages statistics to uncover patterns linking amino acid sequence, structure, and function, is designed to expand the available search space for novel biomedicines.
Gibson notes that conventional discovery methods have relied on trial and error to identify proteins that exist in nature. “These methods can only scratch the surface of what’s possible,” she says. “Think about a 100-amino-acid protein, which is not even a very big protein.” The combinatoric possibilities of such a protein are so vast, she continues, that few of them could have been realized in all of human evolution. Comparing the number of actualized (expressed) protein sequences to the number of potential protein sequences would be like comparing a drop of water to all of Earth’s oceans. “If you think about those numbers,” she argues, “it becomes almost impossible to believe that any of our protein therapeutics are actually optimized for their function today.”
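The scale Gibson describes can be checked with simple arithmetic. The sketch below is illustrative only and is not taken from Generate Biomedicines; it assumes that each position in the chain can hold any of the 20 standard amino acids, chosen independently, and the function names are hypothetical.

```python
# Assumption (not from the article): every position can hold any of the
# 20 standard amino acids, independently of its neighbors.
NUM_AMINO_ACIDS = 20


def sequence_space_size(length: int, alphabet: int = NUM_AMINO_ACIDS) -> int:
    """Count the distinct sequences of a given length (alphabet ** length)."""
    return alphabet ** length


def order_of_magnitude(n: int) -> int:
    """Return floor(log10(n)) for a positive integer by counting its digits."""
    return len(str(n)) - 1


size = sequence_space_size(100)
print(f"20^100 is roughly 10^{order_of_magnitude(size)}")  # prints 10^130
```

Under that assumption, a 100-residue protein has about 1.27 × 10^130 possible sequences, which is why exhaustive screening of the space is not an option.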
Rather than try to search the essentially infinite space of protein primary sequences, Generate Biomedicines and other companies that leverage AI are trying to build a generalizable model that can address one of the biggest problems in biology: knowing how a protein’s sequence will determine its function. Gibson likens the problem to teaching a computer to draw a human face. She says one can have the computer learn what a human face looks like by “looking at hundreds of millions of human faces,” or one can “tell the computer to draw a nose here and eyes here.”
Exploring RNA space
With the smashing success of the mRNA-based COVID-19 vaccines, RNA therapies have suddenly been thrust into the spotlight. And with that comes a huge influx of RNA biology data. Like data pertaining to proteins, this data can be used to sharpen AI-driven drug development.
“You can measure ribosome occupancy, you can measure microRNA binding, and you can measure RBP [RNA-binding protein] binding—all these things are amenable to data generation at large scale,” says Amit Deshwar, PhD, vice president and head of predictive systems, Deep Genomics.
The RNA therapeutics space may be too vast to explore without an AI assist. “On the target identification side, it’s extremely challenging to do it without machine learning methods,” Deshwar continues. “There are too many variants to do a very classical association study when you look at whole genome sequencing data.”
For Deshwar and the RNA drug discovery field as a whole, using AI to discover RNA therapeutics is similar to using AI to discover protein therapeutics. You pick a target, and then you study it. When working to discover RNA therapeutics, however, you have to step back in time. According to Deshwar, you cannot begin by “identifying [the] protein you want to affect in order to prevent a disease.” Instead, your analysis extends back to an earlier stage in the process that culminates in protein expression.
Because Deep Genomics targets genetic mutations in rare diseases, it has to step back even further in time. The company notes that for a single disease, there may be thousands of disease-causing mutations to look at, and hundreds of potential fixes. “On top of that,” the company adds, “there may be hundreds of thousands to millions of potential drugs to search through, but only a few that work.”
“Out of all the possible genetic changes, only a few of them are meaningful in affecting your disease risk,” Deshwar emphasizes. “Our machine learning helps us narrow down all the possible associations, which are the ones that are actually preventing you from getting disease.”
Ultimately, says Deshwar, machine learning approaches remain similar across therapeutic modalities. Whether you’re synthesizing small molecules or RNA oligonucleotides, you’re using AI technology to recognize patterns in large datasets.
Retaining the human element
“One of the biggest challenges with thinking about AI in drug development is right now it’s being used a lot to augment our existing processes,” says Generate Biomedicines’ Gibson. “AI and machine learning will be most impactful when we actually design systems from the ground up … so that the processes we use are most optimized for machine learning.”
To suggest how AI could expand drug discovery possibilities rather than just reinforce old practices, Gibson offers the following scenario: “Instead of screening molecules and hoping that I get one at the end, I am learning principles to engineer a molecule from the ground up. And every time I generate a piece of data, that piece of data is not just used for that program to optimize that molecule, it’s also used to make every single next drug better.”
As AI-driven drug discovery advances along with laboratory automation technology, wet lab biologists and medicinal chemists may wonder if they’ll be displaced. They needn’t worry, suggests Neil Thompson, PhD, the chief scientific officer at Healx. “Change is generally incremental,” he points out. “People and things develop slowly. It is really the interplay between what the machine does well and what the human [does well] that is key here.” He adds that very few processes remove human interpretation completely.
What is changing, according to Thompson, is the mindset of the drug discoverer. At present, researchers commonly rely on phenotypic screening to move a drug forward, so long as the drug works. Often, the researcher won’t know the mechanism of action. With AI-driven drug discovery, researchers may first identify a target, and then design a drug accordingly. “But you’re making a big assumption on whether that target is involved in your disease,” Thompson cautions. “Often that validation is incomplete. And for that sort of reason, still, a lot of drugs fail in the clinic.”
Perhaps, as Thompson suggests, established drug discovery practices will persist alongside AI-driven drug discovery. “Not all drug discovery is computational,” Windemuth adds. “There’s also just trial and error.”