More than open data: towards a FAIR data and simulation infrastructure with AiiDA and Materials Cloud

Bibliographic Details
Main Author: Pizzi, Giovanni
Format: Conference Object
Language: English
Published: 2023
Subjects:
Online Access: https://zenodo.org/record/8388969
https://doi.org/10.5281/zenodo.8388969
Description
Summary: High-throughput computational materials discovery studies can generate vast amounts of interconnected data. Making such data open and FAIR is only possible with proper workflow tools that not only automate the simulations, but also deal appropriately with data management. I will discuss how we address these challenges with the open-source high-throughput infrastructure AiiDA [1], an automated and scalable solution for workflow management, data provenance storage and reproducibility, supported by a broad community of developers (over 120 different simulation engines are supported by AiiDA, see the AiiDA plugin registry [2]). After introducing AiiDA, I will discuss how we can achieve FAIR data sharing when combining AiiDA with our Materials Cloud platform [3]. I will then focus on how we can go beyond "just" open FAIR data, toward making simulations FAIR as well. We target accessibility of advanced HPC workflows with the development of AiiDA common workflow (ACWF) interfaces, to perform routine materials-science tasks with a single input/output interface [4]. We demonstrate how any user can reproducibly perform tasks such as structural relaxations via a single common interface with 11 different codes (Abinit, BigDFT, CASTEP, CP2K, FLEUR, Gaussian, NWChem, ORCA, Quantum ESPRESSO, Siesta, VASP). These workflows only require the user to provide an input crystal or molecular structure, and all other numerical inputs are automatically selected appropriately for each code. Accessibility is then further increased by combining AiiDA's workflows with the AiiDAlab graphical interface [5].
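The common-interface idea described above can be illustrated with a minimal, library-free sketch. It is not the ACWF or AiiDA API: the names `Structure`, `REGISTRY`, `build_relax_inputs`, the two code labels and all numerical defaults are hypothetical placeholders, chosen only to show one entry point that takes a structure and lets each code pick its own inputs.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Structure:
    """Hypothetical stand-in for a crystal or molecular structure."""
    formula: str
    periodic: bool


# A generator maps a structure to code-specific inputs (hypothetical type).
InputGenerator = Callable[[Structure], Dict[str, object]]


def plane_wave_inputs(structure: Structure) -> Dict[str, object]:
    # Placeholder defaults for an imaginary plane-wave code.
    kpoints = "automatic mesh" if structure.periodic else "gamma only"
    return {"basis": "plane waves", "cutoff": 60.0, "kpoints": kpoints}


def local_orbital_inputs(structure: Structure) -> Dict[str, object]:
    # Placeholder defaults for an imaginary local-orbital code.
    return {"basis": "local orbitals", "basis_size": "medium"}


# Registry of supported codes; the labels are illustrative, not real plugins.
REGISTRY: Dict[str, InputGenerator] = {
    "plane_wave_code": plane_wave_inputs,
    "local_orbital_code": local_orbital_inputs,
}


def build_relax_inputs(structure: Structure, code: str) -> Dict[str, object]:
    """Single entry point: same call for every code, only a structure needed."""
    if code not in REGISTRY:
        raise ValueError(f"unsupported code: {code!r}")
    inputs = dict(REGISTRY[code](structure))
    inputs["task"] = "relax"  # the task is common to all codes
    return inputs


if __name__ == "__main__":
    silicon = Structure(formula="Si2", periodic=True)
    for code in REGISTRY:
        print(code, build_relax_inputs(silicon, code))
```

The user-facing call is identical for every code, while the code-specific choices stay inside each generator; this is the design property the summary attributes to the common workflow interfaces.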
The value of the ACWF approach is demonstrated by our recent collaborative effort [6] to test the precision and transferability of DFT methods across chemistries (all elements from hydrogen to curium, in 10 prototypical crystal structures), resulting in a reference dataset of 960 equations of state cross-checked between two all-electron codes, which was then used to verify and improve 9 other pseudopotential methods. I will conclude by discussing efforts to ...
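The dataset size quoted above follows from the stated ranges: curium has atomic number 96, so hydrogen to curium covers 96 elements, and 96 elements in 10 structures give 960 combinations. A short check, in which the prototype labels are placeholders because the summary does not name the structures:

```python
from itertools import product

# Atomic numbers from hydrogen (Z = 1) to curium (Z = 96).
atomic_numbers = range(1, 97)

# Placeholder labels: the summary gives only the count of prototypes (10).
prototypes = [f"prototype_{i}" for i in range(1, 11)]

# One equation of state per (element, prototype) pair.
combinations = list(product(atomic_numbers, prototypes))
n_eos = len(combinations)

print(n_eos)  # 960
```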