- Aug 4, 2021
MIT researchers have developed an AI model that speeds up the design of new molecules for medical research while producing precise results.
The research is part of the Machine Learning for Pharmaceutical Discovery and Synthesis Consortium, a collaboration between MIT researchers and eight pharmaceutical companies. The newly developed model can select lead molecules and modify their structure to achieve higher potency, while keeping the result chemically valid.
The consortium was announced in May and has identified one key challenge in drug discovery: lead optimization.
The motivation behind this was to replace the inefficient human modification process of designing molecules with automated iteration and assure the validity of the molecules we generate. – Wengong Jin, lead author of the model’s paper
Co-author of the paper Regina Barzilay talked about what’s coming next: “The next step is to take this technology from academia to use on real pharmaceutical design cases, and demonstrate that it can assist human chemists in doing their work, which can be challenging.”
The model was trained on 250,000 molecular graphs derived from the ZINC database, a publicly available collection of 3D molecular structures. It was then tested on three tasks: generating valid molecules, finding the best lead molecules, and designing novel molecules with enhanced potency.
In the first task, 100% of the molecules the model generated from a sample distribution were chemically valid. In the second, the lead molecule it found had a potency 30% higher than that achieved by traditional systems. Finally, when modifying 800 molecules for higher potency, the model created new molecules that stayed close to the lead molecular structure and averaged over 80% improvement in potency.
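To make figures such as "100% chemically valid" and "80% average improvement" concrete, here is a minimal Python sketch of how they could be computed. It is not the researchers' method: the valence table, the function names, and the toy molecules are all assumptions made for illustration, and real validity checks in cheminformatics toolkits cover far more than this simple valence rule.

```python
# Hypothetical illustration; none of these names come from the paper.
# Maximum total bond order per atom, for a few elements at their common valences.
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "F": 1}


def is_valid(atoms, bonds):
    """Return True if no atom exceeds its maximum valence.

    atoms: list of element symbols, e.g. ["C", "O"]
    bonds: list of (i, j, order) tuples indexing into atoms
    """
    used = [0] * len(atoms)
    for i, j, order in bonds:
        used[i] += order
        used[j] += order
    return all(used[k] <= MAX_VALENCE[atoms[k]] for k in range(len(atoms)))


def validity_rate(molecules):
    """Fraction of (atoms, bonds) pairs that pass the valence check."""
    return sum(is_valid(a, b) for a, b in molecules) / len(molecules)


def mean_improvement(before, after):
    """Average relative improvement of a property score, as a fraction."""
    gains = [(new - old) / old for old, new in zip(before, after)]
    return sum(gains) / len(gains)


# C=O skeleton (hydrogens left implicit): both atoms are within their valence.
ok = (["C", "O"], [(0, 1, 2)])
# Oxygen with three single bonds exceeds its valence of 2.
bad = (["O", "C", "C", "C"], [(0, 1, 1), (0, 2, 1), (0, 3, 1)])

print(validity_rate([ok, bad]))                    # one of the two toy graphs is valid
print(mean_improvement([1.0, 2.0], [2.0, 3.0]))    # mean of +100% and +50%
```

A validity rate of 1.0 under a check like this would correspond to the "100% chemically valid" result, and a mean improvement above 0.8 to the "over 80%" figure, under the assumption that potency is scored as a single positive number per molecule.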
Next, the researchers will test the model on properties beyond solubility that are more therapeutically relevant. As they point out, this requires more data, so they plan to develop a new model that works with a limited amount of training data.