Wailord: Parsers and Reproducibility for Quantum Chemistry

Much of the scientific Python ecosystem deals with problems at the level where their structure is already present in memory. However, the generation of input files for driving existing codes, as well as the parsing of results, is not typically covered in great detail. This presentation bridges the gap...


Bibliographic Details
Main Author: Rohit Goswami
Format: Conference Object
Language: unknown
Published: Zenodo 2022
Subjects: parsers; computational-chemistry; python; reproducible reports; quantum chemistry
Online Access: https://doi.org/10.5281/zenodo.7325038
id ftzenodo:oai:zenodo.org:7325038
record_format openpolar
spelling ftzenodo:oai:zenodo.org:7325038 2024-09-15T18:29:00+00:00 Wailord: Parsers and Reproducibility for Quantum Chemistry Rohit Goswami 2022-11-15 https://doi.org/10.5281/zenodo.7325038 unknown Zenodo https://doi.org/10.5281/zenodo.7325037 https://doi.org/10.5281/zenodo.7325038 oai:zenodo.org:7325038 info:eu-repo/semantics/openAccess Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode SciPy 2022, Austin, Texas, USA, 11-17 July 2022 parsers computational-chemistry python reproducible reports quantum chemistry info:eu-repo/semantics/conferencePoster 2022 ftzenodo https://doi.org/10.5281/zenodo.7325038 10.5281/zenodo.7325037 2024-07-26T19:34:29Z Much of the scientific Python ecosystem deals with problems at the level where their structure is already present in memory. However, the generation of input files for driving existing codes, as well as the parsing of results, is not typically covered in great detail. This presentation bridges the gap between external programs and data structures, demonstrating, via a practical example, the utility of code generation and parsing expression grammar parsers for reproducible results in quantum chemistry. More details at: https://rgoswami.me/posts/scipycon-2022-meta The concept of a crisis of reproducibility in scientific research needs no introduction. Although there are several tooling approaches one can take to reduce the cognitive load of keeping track of the various steps of an analysis pipeline [1], there remains an almost linguistic gap when it comes to interfacing with domain-specific tools. We demonstrate the role of parsers in the reproducibility workflow. By focusing on the generation of input files and the structured extraction of output data, we aim to plug a gap in the generation of reproducible reports, namely, interfacing (via file I/O) with existing software.
The file I/O interface justifiably has many detractors; especially on an HPC (high performance computing) cluster, I/O can be a bottleneck. However, when faced with an opaque binary which outputs freeform results, powered by an input file which has little to no structure beyond a 1500-page manual of keyword arguments, a domain-specific parser can pay off immensely. In our quest to translate domain intuition into computational input constraints, we will work in a reduced grammar, an intermediate representation (IR). Such an IR can be generated for multiple program specifications, so extensions to other software are not difficult either. As a concrete realization of an abstract concept, we will discuss Wailord [2], which uses parsimonious [3] and cookiecutter [4] to interface with ORCA [5], a popular free (but not open source) ... Conference Object Orca Zenodo
institution Open Polar
collection Zenodo
op_collection_id ftzenodo
language unknown
topic parsers
computational-chemistry
python
reproducible reports
quantum chemistry
spellingShingle parsers
computational-chemistry
python
reproducible reports
quantum chemistry
Rohit Goswami
Wailord: Parsers and Reproducibility for Quantum Chemistry
topic_facet parsers
computational-chemistry
python
reproducible reports
quantum chemistry
description Much of the scientific Python ecosystem deals with problems at the level where their structure is already present in memory. However, the generation of input files for driving existing codes, as well as the parsing of results, is not typically covered in great detail. This presentation bridges the gap between external programs and data structures, demonstrating, via a practical example, the utility of code generation and parsing expression grammar parsers for reproducible results in quantum chemistry. More details at: https://rgoswami.me/posts/scipycon-2022-meta The concept of a crisis of reproducibility in scientific research needs no introduction. Although there are several tooling approaches one can take to reduce the cognitive load of keeping track of the various steps of an analysis pipeline [1], there remains an almost linguistic gap when it comes to interfacing with domain-specific tools. We demonstrate the role of parsers in the reproducibility workflow. By focusing on the generation of input files and the structured extraction of output data, we aim to plug a gap in the generation of reproducible reports, namely, interfacing (via file I/O) with existing software. The file I/O interface justifiably has many detractors; especially on an HPC (high performance computing) cluster, I/O can be a bottleneck. However, when faced with an opaque binary which outputs freeform results, powered by an input file which has little to no structure beyond a 1500-page manual of keyword arguments, a domain-specific parser can pay off immensely. In our quest to translate domain intuition into computational input constraints, we will work in a reduced grammar, an intermediate representation (IR). Such an IR can be generated for multiple program specifications, so extensions to other software are not difficult either.
As a concrete realization of an abstract concept, we will discuss Wailord [2], which uses parsimonious [3] and cookiecutter [4] to interface with ORCA [5], a popular free (but not open source) ...
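The round trip described in this abstract (an intermediate representation rendered to an input file, then structured data extracted from freeform output) can be sketched roughly as follows. This is not Wailord's API: it is a minimal illustration that swaps in the standard library's string.Template for cookiecutter and a regular expression for a parsimonious grammar. The IR layout, the function names, and the sample output text (including its energy value, which is a placeholder and not a computed result) are all assumptions made for the example; only the general shape of an ORCA simple-input line, an xyz coordinate block, and the "FINAL SINGLE POINT ENERGY" output line follow ORCA's conventions.

```python
import re
from string import Template

# Hypothetical IR for a single calculation; the keys are illustrative only.
ir = {
    "method": "HF",
    "basis": "def2-SVP",
    "charge": 0,
    "multiplicity": 1,
    # Approximate water geometry in Angstrom, for illustration.
    "atoms": [
        ("O", 0.000, 0.000, 0.000),
        ("H", 0.000, 0.757, 0.587),
        ("H", 0.000, -0.757, 0.587),
    ],
}

# Template for an ORCA-style input: keyword line, then an xyz block.
ORCA_TEMPLATE = Template(
    "! $method $basis\n"
    "* xyz $charge $multiplicity\n"
    "$coords\n"
    "*\n"
)


def render_orca_input(ir):
    """Render the IR into ORCA-style input text."""
    coords = "\n".join(
        f"{el} {x:.6f} {y:.6f} {z:.6f}" for el, x, y, z in ir["atoms"]
    )
    return ORCA_TEMPLATE.substitute(
        method=ir["method"],
        basis=ir["basis"],
        charge=ir["charge"],
        multiplicity=ir["multiplicity"],
        coords=coords,
    )


# Matches the final-energy line that ORCA prints in its text output.
ENERGY_RE = re.compile(
    r"^FINAL SINGLE POINT ENERGY\s+(-?\d+\.\d+)\s*$", re.MULTILINE
)


def parse_final_energy(output_text):
    """Return the last reported single point energy as a float."""
    matches = ENERGY_RE.findall(output_text)
    if not matches:
        raise ValueError("no FINAL SINGLE POINT ENERGY line found")
    return float(matches[-1])


if __name__ == "__main__":
    print(render_orca_input(ir))
    # Placeholder output fragment; the number is not a real result.
    sample = "some preamble\nFINAL SINGLE POINT ENERGY      -1.234567\n"
    print(parse_final_energy(sample))
```

A regular expression is enough for a single scalar; a grammar-based parser of the kind the abstract describes becomes worthwhile once nested or repeated output sections have to be captured as structured data.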
format Conference Object
author Rohit Goswami
author_facet Rohit Goswami
author_sort Rohit Goswami
title Wailord: Parsers and Reproducibility for Quantum Chemistry
title_short Wailord: Parsers and Reproducibility for Quantum Chemistry
title_full Wailord: Parsers and Reproducibility for Quantum Chemistry
title_fullStr Wailord: Parsers and Reproducibility for Quantum Chemistry
title_full_unstemmed Wailord: Parsers and Reproducibility for Quantum Chemistry
title_sort wailord: parsers and reproducibility for quantum chemistry
publisher Zenodo
publishDate 2022
url https://doi.org/10.5281/zenodo.7325038
genre Orca
genre_facet Orca
op_source SciPy 2022, Austin, Texas, USA, 11-17 July 2022
op_relation https://doi.org/10.5281/zenodo.7325037
https://doi.org/10.5281/zenodo.7325038
oai:zenodo.org:7325038
op_rights info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
op_doi https://doi.org/10.5281/zenodo.7325038
10.5281/zenodo.7325037
_version_ 1810470419070713856