When chemists design new chemical reactions, one useful piece of information is the reaction’s transition state: the point of no return from which a reaction must proceed.
This information allows chemists to try to set up the right conditions for the desired reaction to occur. However, current methods for predicting the transition state and the path a chemical reaction will take are complicated and require substantial computational power.
Researchers at MIT have now developed a machine-learning model that can make these predictions in less than a second, with high accuracy. Their model could make it easier to design chemical reactions that yield a variety of useful compounds, such as medicines or fuels.
“Our goal is ultimately to engineer processes that convert abundant natural resources into essential molecules, such as materials and therapeutic agents. Computational chemistry plays a crucial role in determining how to create more sustainable methods to get from reactants to products,” says Heather Kulik, the Lammot du Pont Professor of Chemical Engineering, a professor of chemistry, and the senior author of the new study.
Former MIT graduate student Chenru Duan PhD ’22, currently at Deep Principle; former Georgia Tech graduate student Guan-Horng Liu, now at Meta; and Cornell University graduate student Yuanqi Du are the primary authors of the article, which is published today in Nature Machine Intelligence.
Enhanced estimates
For any chemical reaction to occur, it must pass through a transition state, which it reaches when it hits the energy threshold needed for the reaction to proceed. These transition states are so fleeting that they are nearly impossible to observe experimentally.
As an alternative, scientists can calculate the geometries of transition states using methods based on quantum chemistry. However, that process requires a great deal of computing power and can take hours or days to calculate a single transition state.
“Ideally, we aspire to utilize computational chemistry to design more sustainable processes, yet this computational effort is itself a significant use of energy and resources in locating these transition states,” Kulik remarks.
In 2023, Kulik, Duan, and colleagues reported a machine-learning approach they had developed to predict the transition states of reactions. This approach is faster than quantum chemistry techniques, but it is still slower than would be ideal because the model has to generate around 40 structures and then run those predictions through a “confidence model” to determine which states are most likely to occur.
One reason that model needs to be run so many times is that it uses randomly generated guesses as the starting point for the transition state structure, then performs multiple calculations until it reaches its final, best prediction. These randomly generated starting points may be very far from the actual transition state, which is why so many steps are needed.
The researchers’ new model, React-OT, described in the Nature Machine Intelligence paper, uses a different strategy. They trained the model to start from an estimate of the transition state generated by linear interpolation, a technique that estimates each atom’s position by moving it halfway between its position in the reactants and its position in the products, in three-dimensional space.
“A linear estimate serves as an effective starting point for gauging where that transition state might end up,” Kulik explains. “The model begins from a significantly better initial guess than the completely random guess used in the previous work.”
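The linear-interpolation starting point described above can be illustrated with a minimal Python sketch. This is not the React-OT code: the function name `midpoint_guess` and the toy coordinates are hypothetical, and the sketch assumes the reactant and product geometries list the same atoms in the same order and are already expressed in a common coordinate frame.

```python
from typing import List, Tuple

# One (x, y, z) position per atom.
Coords = List[Tuple[float, float, float]]


def midpoint_guess(reactant: Coords, product: Coords) -> Coords:
    """Return the atom-wise midpoint between two geometries.

    Hypothetical helper for illustration only. Assumes both geometries
    list the same atoms in the same order and share a coordinate frame.
    """
    if len(reactant) != len(product):
        raise ValueError("reactant and product must contain the same atoms")
    return [
        tuple((r + p) / 2.0 for r, p in zip(r_atom, p_atom))
        for r_atom, p_atom in zip(reactant, product)
    ]


# Made-up coordinates for a three-atom toy system (not data from the study).
# Only the middle atom moves between reactant and product.
reactant = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
product = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]

guess = midpoint_guess(reactant, product)
print(guess)
```

In this toy case the middle atom lands halfway between its two end positions while the stationary atoms stay put. A guess of this kind is only a starting point; according to the article, the trained model then refines it over a small number of steps.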
Because of this strategy, the model needs fewer steps and less time to generate a prediction. In the new study, the researchers showed that their model could make predictions in roughly five steps, taking about 0.4 seconds. These predictions do not need to be run through a confidence model, and they are about 25 percent more accurate than those generated by the previous model.
“This makes React-OT a practical model that we can directly integrate into the existing computational workflow for high-throughput screening to generate optimal transition state structures,” Duan says.
“A broad spectrum of chemistry”
To develop React-OT, the researchers trained it on the same dataset they used for their earlier model. This dataset contains structures of reactants, products, and transition states, calculated using quantum chemistry methods, for 9,000 different chemical reactions, mostly involving small organic or inorganic molecules.
Once trained, the model performed well on other reactions from this dataset that had been held out of the training data. It also made accurate predictions for reactions with larger reactants, which often have side chains that are not directly involved in the reaction.
“This is significant because numerous polymerization reactions involve a large macromolecule, while the reaction takes place in just one segment. Possessing a model that generalizes across different system sizes indicates it can address a wide range of chemistry,” Kulik states.
The researchers are now working on training the model to predict transition states for reactions between molecules that include additional elements, such as sulfur, phosphorus, chlorine, silicon, and lithium.
“Rapidly predicting transition state structures is essential to all chemical comprehension,” says Markus Reiher, a professor of theoretical chemistry at ETH Zurich, who was not involved in the study. “The novel methodology introduced in the paper could significantly accelerate our search and optimization efforts, leading us to our final results more swiftly. Consequently, less energy will be consumed in these high-performance computing initiatives. Any advancement that hastens this optimization is advantageous for all types of computational chemical research.”
The MIT team hopes that other researchers will use the approach to design their own reactions, and has created an app for that purpose.
The research received funding from the U.S. Army Research Office, the U.S. Department of Defense Basic Research Office, the U.S. Air Force Office of Scientific Research, the National Science Foundation, and the U.S. Office of Naval Research.