The world’s aging population, the rise of chronic and infectious diseases, and the emergence of new pathogens have made the need for new treatments more urgent than ever. However, discovering a new drug and bringing it to market is a long, difficult, and expensive journey marked by many failures and few successes.
Artificial intelligence has long been considered the answer to overcoming some of these obstacles because of its ability to analyze large amounts of data, uncover patterns and relationships, and predict outcomes.
But despite its enormous potential, AI has yet to deliver on its promise of revolutionizing drug discovery.
Now a multi-institutional team led by Harvard Medical School biomedical scientist Marinka Zitnik has launched a platform that aims to improve AI-driven drug discovery by building realistic datasets and more reliable algorithms.
Therapeutics Data Commons, described in a recent commentary in Nature Chemical Biology, is an open-access platform that acts as a bridge between computer scientists and machine learning researchers on the one hand and biomedical researchers, chemists, medical researchers, and drug designers on the other, communities that have traditionally worked in isolation from each other.
The platform supports data set curation, algorithm design, and performance evaluation for multiple treatment modalities, including small-molecule drugs, antibodies, and cell and gene therapies, at all stages of drug development, from chemical discovery to drug performance in clinical trials.
Zitnik, assistant professor of biomedical informatics at the Blavatnik Institute at HMS, developed the concept of the platform and is now leading the project in collaboration with researchers at MIT, Stanford University, Carnegie Mellon University, Georgia Tech, University of Illinois Urbana-Champaign, and Cornell University.
Zitnik recently discussed Therapeutics Data Commons with Harvard Medicine News.
HMNews: What are the key challenges in drug discovery and how can AI help solve these?
Zitnik: Developing a drug from scratch that is safe and effective is an incredible challenge. On average, it takes anywhere between 11 and 16 years and between $1 billion and $2 billion to do so. Why is that?
It is very difficult to determine early on whether a promising chemical can produce results in human patients that match the results it shows in the laboratory. The number of possible small-molecule compounds is estimated at 10 to the 60th power, but only a tiny fraction of this astronomical chemical space has been searched for molecules with therapeutic properties. Despite that, the impact of existing treatments on disease has been impressive. We believe that novel algorithms coupled with automation and new data sets can discover many more molecules that can be translated into improvements in human health.
AI algorithms can help us determine which of these molecules might be safe and effective therapies. Making that determination is a major problem in drug discovery and development. Our vision is that machine learning models can help sort through and synthesize large amounts of biochemical data that we can directly link to molecular and genetic information, and ultimately to individual patient outcomes.
HMNews: How close is AI to making this promise a reality?
Zitnik: We’re not there yet. There are several challenges, but I would say the biggest is understanding how our current algorithms work and how their performance translates to real-world problems.
When we evaluate new AI models computationally, we test them on benchmark data sets. Increasingly, we are seeing in the literature that those models are achieving near-perfect accuracy. If so, why aren’t we seeing widespread adoption of machine learning in drug discovery?
This is because there is a large gap between performing well on a benchmark data set and being ready to transition to real-world use in a biomedical or clinical setting. The data on which these models are trained and tested do not reflect the type of challenges these models are exposed to when used in real practice, so closing this gap is very important.
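One common way to make a benchmark behave more like real use is to change how the data are split. The following is a minimal Python sketch, not taken from the platform itself: it contrasts a random split with a split that holds out whole chemical families, so that test compounds come from families the model has never seen. The compound IDs, the family labels, and the function names are all hypothetical.

```python
import random
from collections import defaultdict

# Hypothetical toy records: (compound_id, chemical_family). In practice the
# family might be derived from something like a shared molecular scaffold.
COMPOUNDS = [
    ("c1", "family_A"), ("c2", "family_A"), ("c3", "family_A"),
    ("c4", "family_B"), ("c5", "family_B"),
    ("c6", "family_C"), ("c7", "family_C"), ("c8", "family_C"),
    ("c9", "family_D"), ("c10", "family_D"),
]


def random_split(records, test_fraction=0.3, seed=0):
    """Shuffle individual compounds; a family can land on both sides."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def family_split(records, test_fraction=0.3, seed=0):
    """Hold out whole families so test compounds come from unseen families."""
    by_family = defaultdict(list)
    for record in records:
        by_family[record[1]].append(record)
    families = sorted(by_family)
    random.Random(seed).shuffle(families)
    train, test = [], []
    target = len(records) * test_fraction
    for family in families:
        # Add whole families to the test set until it reaches the target size.
        bucket = test if len(test) < target else train
        bucket.extend(by_family[family])
    return train, test


def shared_families(train, test):
    """Families that appear in both the training and the test set."""
    return {fam for _, fam in train} & {fam for _, fam in test}


if __name__ == "__main__":
    for name, splitter in [("random", random_split), ("family", family_split)]:
        train, test = splitter(COMPOUNDS)
        print(name, "split shares families:", sorted(shared_families(train, test)))
```

With the family-based split, the set of shared families is always empty, so a model cannot score well simply by recognizing close relatives of its training compounds.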
HMNews: Where does the Therapeutics Data Commons platform come into this?
Zitnik: The goal of the Therapeutics Data Commons is to directly address such challenges. It serves as a meeting point between the machine learning community on one end and the biomedical community on the other. It can help the machine learning community with algorithmic innovation and make these models more translatable to real-world situations.
HMNews: Can you explain how it actually works?
Zitnik: First, remember that the drug discovery process runs the gamut from initial drug design based on data from chemistry and chemical biology, through preclinical research based on data from animal studies, all the way to clinical trials in human patients. The machine learning models we train and test as part of the platform use different types of data to support the optimization process across these different stages.
For example, machine learning models that support small-molecule drug design often rely on large data sets of molecular graphs, that is, the structures of chemical compounds and their molecular properties. These models find patterns in the known chemical space that relate parts of the chemical structure to the chemical properties necessary for the drug to be safe and effective.
Once an AI model is trained to identify these patterns in a known set of chemicals, it can be deployed to look for similar patterns in large data sets of untested chemicals and predict how these chemicals might behave.
To design models that can help with drug discovery in the near term, we train them with data from animal studies. These models are trained to look for patterns that relate biological data to potential clinical outcomes in humans.
And we can ask whether a model can look for molecular signatures of chemical compounds that correlate with patient information to identify which subset of patients is likely to respond to a chemical compound.
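The idea of learning from known chemicals and then predicting the behavior of untested ones can be illustrated with a deliberately simple Python sketch. It is not the platform's method: real models typically learn from full molecular graphs, whereas this example reduces each compound to a hypothetical set of fragment names and predicts a label by majority vote among the most similar known compounds, using Tanimoto similarity. The data, the fragment names, and the choice of k are assumptions made purely for illustration.

```python
# Hypothetical training set: each known compound is reduced to a set of
# structural fragments plus a measured label (1 = has the desired property).
KNOWN = [
    ({"ring", "amine", "hydroxyl"}, 1),
    ({"ring", "amine"}, 1),
    ({"chain", "halogen"}, 0),
    ({"chain", "halogen", "nitro"}, 0),
]


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two fragment sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict(fragments, known=KNOWN, k=3):
    """Majority label among the k known compounds most similar to the query."""
    ranked = sorted(
        known,
        key=lambda item: tanimoto(fragments, item[0]),
        reverse=True,
    )
    top = ranked[:k]
    votes = sum(label for _, label in top)
    return 1 if votes * 2 > len(top) else 0


if __name__ == "__main__":
    # An untested compound that shares fragments with the positive examples.
    print(predict({"ring", "amine", "carboxyl"}))
    # An untested compound that resembles the negative examples.
    print(predict({"chain", "nitro"}))
```

The sketch only shows the pattern of "train on known chemistry, score unknown chemistry"; how well such a prediction holds up depends on how different the untested compounds are from the known ones.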
HMNews: Who are the contributors and end users of this platform?
Zitnik: We have a team of students, scientists, and expert volunteers from partner universities and industry, including small startups in the Boston area and large pharmaceutical companies in the United States and Europe. Computer scientists and medical researchers contribute their expertise in the form of sophisticated machine learning models and data sets that have been processed, curated, and put into a form that can be released and used by others.
Therefore, the platform contains both analysis-ready data sets and machine learning algorithms, as well as robust metrics that tell us how well a machine learning model performs on a given data set.
Our end users are researchers from all over the world. We organize webinars to introduce new features, get feedback, and answer questions. We offer tutorials. This ongoing training and feedback is really important.
We have 4,000 to 5,000 active users every month, most of them from the US, Europe, and Asia. In total, we have seen over 65,000 downloads of our machine learning package/dataset. We’ve seen over 160,000 downloads of harmonized, standardized datasets. The numbers are growing, and we hope they will continue to grow.
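To make the pairing of data sets, models, and metrics described above more concrete, here is a small Python sketch of what "a metric that tells us how well a model performs on a given data set" can look like. The function names and the toy numbers are hypothetical and are not the platform's API; the two metrics shown, mean absolute error for numeric predictions and ROC AUC for ranking positives above negatives, are standard choices.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def evaluate(model, test_set, metric):
    """Apply a model to every input in a held-out set and score the predictions."""
    inputs = [x for x, _ in test_set]
    targets = [y for _, y in test_set]
    predictions = [model(x) for x in inputs]
    return metric(targets, predictions)


if __name__ == "__main__":
    # Hypothetical held-out data: (input, measured value) pairs.
    held_out = [(1.0, 2.0), (2.0, 5.0)]
    # A stand-in "model" that simply doubles its input.
    print(evaluate(lambda x: 2 * x, held_out, mean_absolute_error))
```

Keeping the metric separate from the model, as in `evaluate`, is what lets different models be compared on the same held-out data under the same scoring rule.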
HMNews: What are the long-term goals of the Therapeutics Data Commons?
Zitnik: Our goal is to support AI drug discovery in two areas. First, in the design and testing of machine learning methods in all stages of drug discovery and development, from chemical compound identification and drug development to clinical trials.
Second, to support the design and validation of machine learning algorithms across multiple therapeutic modalities, especially new ones, including biologic products, vaccines, antibodies, mRNA therapies, protein therapies, and gene therapies.
There’s a huge opportunity for machine learning to contribute to those new therapies, and we haven’t seen the use of AI in those areas to the extent that we’ve seen in small molecule research, where most of the focus is today. This gap is mainly due to the lack of standardized AI-ready data for those novel therapies, which we hope to address with the Therapeutics Data Commons.
HMNews: What sparked your interest in this project?
Zitnik: I’ve always been interested in understanding and simulating interactions across complex systems, which are systems with many components that interact in an interdependent manner. As it turns out, many problems in medical science involve, by definition, very complex systems.
We have a protein target, which is a complex three-dimensional structure; we have a small molecule, which is a complex graph of atoms and the bonds between those atoms; and then we have the patient, whose description and state of health are given in a multidimensional representation. This is a classic complex-systems problem, and I really love looking at and finding ways to balance and “smooth” those complex interactions.
Medical science is full of those kinds of problems that are ripe to benefit from machine learning. That’s what we’re chasing, that’s what we’re after.
Kexin Huang et al, Artificial intelligence foundation for therapeutic science, Nature Chemical Biology (2022). DOI: 10.1038/s41589-022-01131-2
Provided by Harvard Medical School
Citation: Can AI revolutionize the way we discover new drugs? (2022, November 16) retrieved November 16, 2022 from https://phys.org/news/2022-11-ai-drugs.html