H2CO Dataset
The deposited data sets were used to compare three state-of-the-art machine learning (ML) approaches to obtain representations of potential energy surfaces (PESs). The comparison is meant to be representative as it examines a purely kernel-based approach (reproducing kernel Hilbert space plus forces...
Main Authors: | Käser, Silvan; Koner, Debasish; Christensen, Anders S.; von Lilienfeld, O. Anatole; Meuwly, Markus |
---|---|
Format: | Other/Unknown Material |
Language: | English |
Published: | Zenodo 2020 |
Subjects: | Machine Learning; Formaldehyde; Neural Network; Quantum Chemistry; Potential Energy Surface |
Online Access: | https://doi.org/10.5281/zenodo.3923823 |
_version_ | 1821678164928102400 |
---|---|
author | Käser, Silvan; Koner, Debasish; Christensen, Anders S.; von Lilienfeld, O. Anatole; Meuwly, Markus |
author_facet | Käser, Silvan; Koner, Debasish; Christensen, Anders S.; von Lilienfeld, O. Anatole; Meuwly, Markus |
author_sort | Käser, Silvan |
collection | Zenodo |
description | The deposited data sets were used to compare three state-of-the-art machine learning (ML) approaches to obtain representations of potential energy surfaces (PESs). The comparison is meant to be representative as it examines a purely kernel-based approach (reproducing kernel Hilbert space plus forces (RKHS+F)) [1], a purely neural-network-based approach (PhysNet) [2], and the FCHL representation [3] within kernel ridge regression. Formaldehyde, H2CO, is used as a benchmark system. H2CO is a small molecule for which PESs can be calculated at different levels of theory and is thus suitable for an in-depth theoretical study. In addition, very high-level calculations have already been presented (see e.g. Ref. [4]), and experimental reference data are available for comparison [5]. ML models are trained with each of the ML methods on reference data calculated at three levels of quantum chemical theory (B3LYP/cc-pVDZ, MP2/aug-cc-pVTZ and CCSD(T)-F12/aug-cc-pVTZ-F12). The performance of the models is then examined by considering energy and force learning curves, harmonic frequencies and IR spectra from finite-temperature molecular dynamics (MD) simulations. The data sets contain different geometries of the H2CO molecule generated using the normal mode sampling approach [6] performed at different temperatures. Four data sets are deposited: i) "h2co_B3LYP_cc-pVDZ_4001.npz": 4001 geometries of H2CO generated using normal mode sampling and calculated using ORCA [7] (B3LYP/cc-pVDZ). ii) "h2co_mp2_avtz_4001.npz": 4001 geometries of H2CO generated using normal mode sampling and calculated using MOLPRO 2019 [8] (MP2/aug-cc-pVTZ). iii) "h2co_ccsdt_avtz_4001.npz": 4001 geometries of H2CO generated using normal mode sampling and calculated using MOLPRO 2019 [8] (CCSD(T)-F12/aug-cc-pVTZ-F12).
iv) "h2co_ccsdt_avtz_2500_extrapol.npz": 2500 geometries of H2CO generated using normal mode sampling and calculated using MOLPRO 2019 [8] (CCSD(T)-F12/aug-cc-pVTZ-F12). This sampling was carried out at higher temperature (5000 K ... |
format | Other/Unknown Material |
genre | Orca |
genre_facet | Orca |
id | ftzenodo:oai:zenodo.org:3923823 |
institution | Open Polar |
language | English |
op_collection_id | ftzenodo |
op_doi | https://doi.org/10.5281/zenodo.3923823 https://doi.org/10.5281/zenodo.3923822 |
op_relation | https://arxiv.org/abs/arXiv:2006.16752 https://doi.org/10.5281/zenodo.3923822 https://doi.org/10.5281/zenodo.3923823 oai:zenodo.org:3923823 |
op_rights | info:eu-repo/semantics/openAccess Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode |
publishDate | 2020 |
publisher | Zenodo |
record_format | openpolar |
spelling | ftzenodo:oai:zenodo.org:3923823 2025-01-17T00:10:32+00:00 H2CO Dataset Käser, Silvan Koner, Debasish Christensen, Anders S. von Lilienfeld, O. Anatole Meuwly, Markus 2020-06-30 https://doi.org/10.5281/zenodo.3923823 eng eng Zenodo https://arxiv.org/abs/arXiv:2006.16752 https://doi.org/10.5281/zenodo.3923822 https://doi.org/10.5281/zenodo.3923823 oai:zenodo.org:3923823 info:eu-repo/semantics/openAccess Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode Machine Learning Formaldehyde Neural Network Quantum Chemistry Potential Energy Surface info:eu-repo/semantics/other 2020 ftzenodo https://doi.org/10.5281/zenodo.3923823 https://doi.org/10.5281/zenodo.3923822 2024-12-06T05:34:54Z Other/Unknown Material Orca Zenodo |
spellingShingle | Machine Learning; Formaldehyde; Neural Network; Quantum Chemistry; Potential Energy Surface; Käser, Silvan; Koner, Debasish; Christensen, Anders S.; von Lilienfeld, O. Anatole; Meuwly, Markus; H2CO Dataset |
title | H2CO Dataset |
title_full | H2CO Dataset |
title_fullStr | H2CO Dataset |
title_full_unstemmed | H2CO Dataset |
title_short | H2CO Dataset |
title_sort | h2co dataset |
topic | Machine Learning; Formaldehyde; Neural Network; Quantum Chemistry; Potential Energy Surface |
topic_facet | Machine Learning; Formaldehyde; Neural Network; Quantum Chemistry; Potential Energy Surface |
url | https://doi.org/10.5281/zenodo.3923823 |
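
The record describes the deposited files as NumPy `.npz` archives but does not list the arrays they contain. The following is a minimal sketch, not part of the deposit, of how one might inspect such an archive before using it. The helper name `summarize_npz` is hypothetical, the file is assumed to have been downloaded into the working directory, and no array names are assumed: the code reports whatever names the archive itself declares.

```python
import os

import numpy as np


def summarize_npz(path):
    """Return {array_name: (shape, dtype string)} for every array in an .npz archive.

    Hypothetical helper for illustration; it makes no assumption about
    which array names the archive holds.
    """
    summary = {}
    # np.load reads .npz members lazily; the context manager closes the file.
    with np.load(path) as archive:
        for name in archive.files:
            array = archive[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary


if __name__ == "__main__":
    # File name as given in the record; assumed to sit in the current directory.
    path = "h2co_B3LYP_cc-pVDZ_4001.npz"
    if os.path.exists(path):
        for name, (shape, dtype) in summarize_npz(path).items():
            print(f"{name}: shape={shape}, dtype={dtype}")
    else:
        print(f"{path} not found; download it from the Zenodo record first.")
```

Listing the shapes first is a cheap way to check that the leading dimension matches the number of geometries stated in the description (4001 or 2500) before any model training. Note that `np.load` refuses pickled object arrays by default, so an archive containing them would raise an error when that member is accessed.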