Speaker
Description
The combination of modern machine learning (ML) approaches with high-quality data from quantum mechanical (QM) calculations can yield models with unrivaled accuracy/cost ratios. However, such methods are ultimately limited by the computational effort required to produce the reference data. In particular, reference calculations for periodic systems with many atoms can become prohibitively expensive for higher levels of theory.
This trade-off is critical in the context of organic crystal structure prediction (CSP). Here, the challenge lies in distinguishing a set of possible crystal configurations according to their stabilities, which generally means that a large number of accurate electronic structure calculations need to be performed. In this context, efficient ML models would be highly desirable, provided that the costs of generating the training data can be kept low.
To this end, we have developed a data-efficient framework for generating such models. This enables the fast screening of a wide range of crystal candidates for a given system, while accurately describing the subtle interplay between intermolecular interactions such as H-bonding and many-body dispersion effects, which determine the stability of molecular crystals [1]. We achieve this by enhancing a physics-based description of long-range interactions at the density functional tight binding (DFTB) level (for which an efficient implementation is available) with a short-range ML model trained on high-quality first-principles reference data, in a so-called Δ-ML approach. A further reduction in training costs was obtained by treating inter- and intramolecular interactions with separate corrections.
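The Δ-ML idea described above can be illustrated with a minimal sketch. It is not the implementation from [1]: the one-dimensional descriptor, the toy `baseline_energy` and `reference_energy` functions (stand-ins for DFTB and the first-principles reference), and the choice of kernel ridge regression are all assumptions made for illustration. The sketch only shows the general pattern: fit a model to the difference between reference and baseline energies, then add the learned correction to the cheap baseline.

```python
import numpy as np


def gaussian_kernel(a, b, length_scale):
    """Gaussian kernel between two sets of 1D descriptors."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def fit_delta_model(x_train, e_baseline, e_reference, length_scale=0.5, reg=1e-6):
    """Fit kernel ridge regression to the residual e_reference - e_baseline."""
    delta = e_reference - e_baseline
    k = gaussian_kernel(x_train, x_train, length_scale)
    weights = np.linalg.solve(k + reg * np.eye(len(x_train)), delta)

    def predict(x_new, e_baseline_new):
        # Corrected energy = cheap baseline + learned correction.
        return e_baseline_new + gaussian_kernel(x_new, x_train, length_scale) @ weights

    return predict


# Hypothetical toy functions (assumptions): a cheap baseline and a costly reference.
def baseline_energy(x):
    return x ** 2


def reference_energy(x):
    return x ** 2 + 0.1 * np.sin(3.0 * x)


# Train on a small number of "reference calculations".
x_train = np.linspace(0.0, 3.0, 12)
predict = fit_delta_model(x_train, baseline_energy(x_train), reference_energy(x_train))

# Compare baseline and corrected errors on held-out points.
x_test = np.linspace(0.1, 2.9, 25)
e_corrected = predict(x_test, baseline_energy(x_test))
baseline_mae = np.mean(np.abs(baseline_energy(x_test) - reference_energy(x_test)))
corrected_mae = np.mean(np.abs(e_corrected - reference_energy(x_test)))
print(f"baseline MAE: {baseline_mae:.4f}, corrected MAE: {corrected_mae:.4f}")
```

Because the model only has to learn the smooth difference between the two levels of theory rather than the full energy, comparatively few reference points are needed. In the same spirit, separate inter- and intramolecular corrections would amount to fitting two such residual models and adding both to the baseline.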
The resulting workflow is broadly applicable to different molecular materials and does not require a single periodic calculation at the reference level of theory. Notably, this even allows the use of highly accurate wavefunction methods in CSP. This work was recently extended to co-crystals, which consist of several distinct building blocks. Here, the Δ-ML models proved even more beneficial, given the large search space spanned by these systems [2]. Overall, this methodology is thus poised to significantly improve the efficiency and accuracy of CSP when combined with state-of-the-art search algorithms.
[1] S. Wengert, G. Csányi, K. Reuter, and J.T. Margraf, Chem. Sci. 12, 4536 (2021).
[2] S. Wengert, G. Csányi, K. Reuter, and J.T. Margraf, J. Chem. Theory Comput. 18, 4586 (2022).
Abstract Number (department-wise) | TH 02 |
---|---|
Department | TH (Reuter) |