H2CO Dataset

The deposited data sets were used to compare three state-of-the-art machine learning (ML) approaches to obtain representations of potential energy surfaces (PESs). The comparison is meant to be representative as it examines a purely kernel-based approach (reproducing kernel Hilbert space plus forces...

Bibliographic Details
Main Authors: Käser, Silvan; Koner, Debasish; Christensen, Anders S.; von Lilienfeld, O. Anatole; Meuwly, Markus
Format: Other/Unknown Material
Language: English
Published: Zenodo 2020
Subjects: Machine Learning; Formaldehyde; Neural Network; Quantum Chemistry; Potential Energy Surface
Online Access: https://doi.org/10.5281/zenodo.3923823
author Käser, Silvan
Koner, Debasish
Christensen, Anders S.
von Lilienfeld, O. Anatole
Meuwly, Markus
author_sort Käser, Silvan
collection Zenodo
description The deposited data sets were used to compare three state-of-the-art machine learning (ML) approaches to obtain representations of potential energy surfaces (PESs). The comparison is meant to be representative: it examines a purely kernel-based approach (reproducing kernel Hilbert space plus forces, RKHS+F) [1], a purely neural-network-based approach (PhysNet) [2], and kernel ridge regression with the FCHL representation [3].
Formaldehyde, H2CO, is used as a benchmark system. H2CO is a small molecule for which PESs can be calculated at different levels of theory, which makes it suitable for an in-depth theoretical study. In addition, very high-level calculations have already been presented (see e.g. Ref. [4]) and experimental reference data are available for comparison [5].
ML models are trained with each method on reference data calculated at three levels of quantum chemical theory (B3LYP/cc-pVDZ, MP2/aug-cc-pVTZ and CCSD(T)-F12/aug-cc-pVTZ-F12). The performance of the models is then examined by considering energy and force learning curves, harmonic frequencies and IR spectra from finite-temperature molecular dynamics (MD) simulations.
The data sets contain different geometries of the H2CO molecule generated using the normal mode sampling approach [6] performed at different temperatures. Four data sets are deposited:
i) "h2co_B3LYP_cc-pVDZ_4001.npz": 4001 geometries of H2CO generated using normal mode sampling and calculated using ORCA [7] (B3LYP/cc-pVDZ).
ii) "h2co_mp2_avtz_4001.npz": 4001 geometries of H2CO generated using normal mode sampling and calculated using MOLPRO 2019 [8] (MP2/aug-cc-pVTZ).
iii) "h2co_ccsdt_avtz_4001.npz": 4001 geometries of H2CO generated using normal mode sampling and calculated using MOLPRO 2019 [8] (CCSD(T)-F12/aug-cc-pVTZ-F12).
iv) "h2co_ccsdt_avtz_2500_extrapol.npz": 2500 geometries of H2CO generated using normal mode sampling and calculated using MOLPRO 2019 [8] (CCSD(T)-F12/aug-cc-pVTZ-F12).
This sampling was carried out at higher temperature (5000 K ...
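The deposited files are NumPy .npz archives. This record does not list the array names stored inside them, so the following is only a minimal sketch that inspects an archive without assuming any particular keys. The function name summarize_npz is hypothetical and is not part of the deposit.

```python
import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype_string)} for every array in an .npz archive.

    No key names are assumed; whatever the archive contains is reported.
    """
    summary = {}
    # NpzFile is a context manager; arrays are read lazily on item access.
    with np.load(path) as archive:
        for name in archive.files:
            array = archive[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary
```

After downloading a file from the Zenodo record, calling summarize_npz("h2co_B3LYP_cc-pVDZ_4001.npz") would show which arrays are present and whether their leading dimension matches the 4001 geometries stated above.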
format Other/Unknown Material
genre Orca
id ftzenodo:oai:zenodo.org:3923823
institution Open Polar
language English
op_collection_id ftzenodo
op_doi https://doi.org/10.5281/zenodo.3923823
10.5281/zenodo.3923822
op_relation https://arxiv.org/abs/arXiv:2006.16752
https://doi.org/10.5281/zenodo.3923822
https://doi.org/10.5281/zenodo.3923823
oai:zenodo.org:3923823
op_rights info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
publishDate 2020
publisher Zenodo
record_format openpolar
spelling ftzenodo:oai:zenodo.org:3923823 2025-01-17T00:10:32+00:00 H2CO Dataset Käser, Silvan Koner, Debasish Christensen, Anders S. von Lilienfeld, O. Anatole Meuwly, Markus 2020-06-30 https://doi.org/10.5281/zenodo.3923823 eng eng Zenodo https://arxiv.org/abs/arXiv:2006.16752 https://doi.org/10.5281/zenodo.3923822 https://doi.org/10.5281/zenodo.3923823 oai:zenodo.org:3923823 info:eu-repo/semantics/openAccess Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode Machine Learning Formaldehyde Neural Network Quantum Chemistry Potential Energy Surface info:eu-repo/semantics/other 2020 ftzenodo https://doi.org/10.5281/zenodo.3923823 10.5281/zenodo.3923822 2024-12-06T05:34:54Z Other/Unknown Material Orca Zenodo
title H2CO Dataset
topic Machine Learning
Formaldehyde
Neural Network
Quantum Chemistry
Potential Energy Surface
url https://doi.org/10.5281/zenodo.3923823