Sami-Trop: 12-lead ECG traces with age and mortality annotations

Bibliographic Details
Main Authors: Ribeiro, Antonio Luiz P., Ribeiro, Antônio H., Paixao, Gabriela M.M., Lima, Emilly M., Horta Ribeiro, Manoel, Pinto Filho, Marcelo M., Gomes, Paulo R., Oliveira, Derick M., Meira Jr, Wagner, Schön, Thomas B, Sabino, Ester C
Format: Dataset
Language: unknown
Published: Zenodo 2021
Online Access: https://dx.doi.org/10.5281/zenodo.4905618
https://zenodo.org/record/4905618
Description
Summary: SaMi-Trop is an NIH-funded prospective cohort of 1959 patients with chronic Chagas cardiomyopathy, set up to evaluate whether a clinical prediction rule based on ECG, brain natriuretic peptide (BNP) levels, and other biomarkers can be useful in clinical practice. A subset of the SaMi-Trop dataset, with annotations of age and mortality and the corresponding ECG traces, is openly available here. It contains two files, `exams.csv` and `exams.hdf5`, which hold information about the first ECG exam taken by 1631 patients.

"exams.csv": a comma-separated values (csv) file containing the columns:
"exam_id": identifier used internally;
"age": patient age in years at the time of the exam;
"is_male": true if the patient is male, false if the patient is female;
"normal_ecg": true if the patient has a normal ECG;
"death": true if the patient died during the follow-up time;
"timey": if the patient died, the time until the patient's death; otherwise, the follow-up time.

"exams.hdf5": an HDF5 file containing a single dataset named `tracings`. This dataset is a `(1631, 4096, 12)` tensor. The first dimension corresponds to the 1631 different exams; the second dimension corresponds to the 4096 signal samples; the third dimension corresponds to the 12 leads of the ECG exams, in the following order: `{DI, DII, DIII, AVR, AVL, AVF, V1, V2, V3, V4, V5, V6}`.

The signals are sampled at 400 Hz. Some signals originally have a duration of 10 seconds (10 * 400 = 4000 samples) and others of 7 seconds (7 * 400 = 2800 samples). To give them all the same size (4096 samples), they are filled with zeros on both sides. For instance, for a 7-second ECG signal with 2800 samples, 648 zero samples are added at the beginning and 648 at the end, yielding the 4096 samples that are then saved in the hdf5 dataset; by the same rule, a 10-second signal gets 48 zero samples on each side.
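The zero-padding scheme described above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic data, not the authors' preprocessing code: the helper names `pad_symmetric` and `strip_padding` are hypothetical, and the sketch assumes the padding is split exactly evenly between the two sides, as the 2800-sample example suggests.

```python
import numpy as np

TARGET_LEN = 4096  # samples per exam in the `tracings` dataset
FS = 400           # sampling frequency in Hz
LEADS = ["DI", "DII", "DIII", "AVR", "AVL", "AVF",
         "V1", "V2", "V3", "V4", "V5", "V6"]


def pad_symmetric(signal, target_len=TARGET_LEN):
    """Zero-pad a (n_samples, n_leads) array equally on both sides.

    Hypothetical helper: assumes the number of missing samples is even.
    """
    missing = target_len - signal.shape[0]
    if missing < 0 or missing % 2:
        raise ValueError("signal cannot be padded symmetrically")
    side = missing // 2
    # Pad only the time axis; the lead axis is left untouched.
    return np.pad(signal, ((side, side), (0, 0)))


def strip_padding(padded, duration_s, fs=FS):
    """Recover the original samples, given the known duration in seconds."""
    n_samples = duration_s * fs
    side = (padded.shape[0] - n_samples) // 2
    return padded[side:side + n_samples]


# Synthetic stand-in for one 7-second, 12-lead exam (not real ECG data).
rng = np.random.default_rng(0)
ecg_7s = rng.standard_normal((7 * FS, len(LEADS)))

padded = pad_symmetric(ecg_7s)
restored = strip_padding(padded, duration_s=7)
print(padded.shape, restored.shape)
```

With the real files, the `tracings` dataset could presumably be read with the h5py package, for example `h5py.File("exams.hdf5", "r")["tracings"]`, and `exams.csv` with any csv reader. The description does not list a column giving each exam's original duration, so `duration_s` would have to come from elsewhere or be inferred from the leading and trailing zeros.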
The relation between neural-network-predicted age and mortality is established in:
"Deep neural network estimated electrocardiographic-age as a mortality predictor". Emilly M Lima, Antônio H Ribeiro, Gabriela MM Paixão, Manoel Horta Ribeiro, Marcelo M Pinto Filho, Paulo R Gomes, Derick M Oliveira, Ester C Sabino, Bruce B Duncan, Luana Giatti, Sandhi M Barreto, Wagner Meira Jr, Thomas B Schön, Antonio Luiz P Ribeiro. medRxiv (2021). https://www.doi.org/10.1101/2021.02.19.21251232

The companion code can be found at: https://github.com/antonior92/ecg-age-prediction

The SaMi-Trop dataset is described in:
"Longitudinal study of patients with chronic Chagas cardiomyopathy in Brazil (SaMi-Trop project): a cohort profile". Clareci Silva Cardoso, Ester Cerdeira Sabino, Claudia Di Lorenzo Oliveira, Lea Campos de Oliveira, Ariela Mota Ferreira, Edécio Cunha-Neto, Ana Luiza Bierrenbach, João Eduardo Ferreira, Desirée Sant'Ana Haikal, Arthur L Reingold, Antonio Luiz P Ribeiro. BMJ Open (2016);6:e011181. doi: 10.1136/bmjopen-2016-011181