25–29 Apr 2022
At FHI (Dahlem) and IRIS (Adlershof)
Europe/Berlin timezone

Size-Extensive Machine Learning with Global Representations

Not scheduled
2h

Board: 06

Speaker

Hyunwook Jung

Description

Machine learning (ML) models are increasingly used in combination with electronic structure calculations to predict molecular properties at a much lower computational cost in high-throughput settings. Such ML models require representations that encode the molecular structure, which are generally designed to respect the symmetries and invariances of the target property. However, size-extensivity is usually not guaranteed for so-called global representations. In this contribution, we show how extensivity can be built into global ML models using, e.g., the Many-Body Tensor Representation. Properties of extensive and non-extensive models for the atomization energy are systematically explored by training on small molecules and testing on small, medium and large molecules [1]. Our results show that the non-extensive model is only useful within the size range of its training set, whereas the extensive models provide reasonable predictions across large size differences. Remaining sources of error for the extensive models are discussed.

[1] ChemSystemsChem 2 (4), e1900052 (2020)
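
As a rough illustration of what size-extensivity means for a global representation, the sketch below compares a summed and a normalized descriptor under a linear model without intercept. It is not the Many-Body Tensor Representation or the models from this contribution: the element-count descriptor, the weights, and the function names are all hypothetical and chosen only to keep the example small.

```python
import numpy as np

ELEMENTS = ["H", "C", "O"]  # hypothetical toy element set


def global_representation(symbols, normalize):
    """Toy global descriptor: element counts, optionally divided by atom count."""
    counts = np.array([symbols.count(e) for e in ELEMENTS], dtype=float)
    return counts / len(symbols) if normalize else counts


def predict(weights, symbols, normalize):
    """Linear model on the global descriptor, deliberately without intercept."""
    return float(weights @ global_representation(symbols, normalize))


# Made-up per-element contributions, for illustration only.
weights = np.array([-0.5, -1.0, -0.8])

methane = ["C", "H", "H", "H", "H"]
two_methanes = methane * 2  # stands in for two non-interacting copies

e1_ext = predict(weights, methane, normalize=False)
e2_ext = predict(weights, two_methanes, normalize=False)
e1_norm = predict(weights, methane, normalize=True)
e2_norm = predict(weights, two_methanes, normalize=True)

print("summed descriptor:    ", e1_ext, e2_ext)
print("normalized descriptor:", e1_norm, e2_norm)
```

With the summed descriptor, the prediction for the doubled system is twice that of the single molecule, which is the behavior expected of an extensive property such as the atomization energy. With the normalized descriptor, both systems map to the same vector, so the model cannot distinguish them by size.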
