COVID-19 and the epistemology of epidemiological models at the dawn of AI


Bibliographic Details
Published in:Annals of Human Biology
Main Author: Ellison, George
Format: Article in Journal/Newspaper
Language:English
Published: Taylor and Francis 2020
Subjects:
Online Access:http://clok.uclan.ac.uk/35068/
http://clok.uclan.ac.uk/35068/1/200805%20COVID-19%20Commentary%20-%20Manuscript%20-%20PrePrints.org.pdf
http://clok.uclan.ac.uk/35068/2/200804%20COVID-19%20Commentary%20-%20Supplementary%20Material%20-%20PrePrints.org.pdf
https://doi.org/10.1080/03014460.2020.1839132
id ftunivclancas:oai:clok.uclan.ac.uk:35068
record_format openpolar
institution Open Polar
collection University of Central Lancashire: CLOK - Central Lancashire Online Knowledge
op_collection_id ftunivclancas
language English
topic Public health engineering
Machine learning
description The models used to estimate disease transmission, susceptibility and severity determine what epidemiology can (and cannot) tell us about COVID-19. These include: ‘model organisms’ chosen for their phylogenetic/aetiological similarities; multivariable statistical models to estimate the strength/direction of (potentially causal) relationships between variables (through ‘causal inference’), and the (past/future) value of unmeasured variables (through ‘classification/prediction’); and a range of modelling techniques to predict beyond the available data (through ‘extrapolation’), compare different hypothetical scenarios (through ‘simulation’), and estimate key features of dynamic processes (through ‘projection’). Each of these models addresses different questions using different techniques; involves assumptions that require careful assessment; and is vulnerable to generic and specific biases that can undermine the validity and interpretation of its findings. It is therefore necessary that the models used can actually address the questions posed, and that they have been competently applied. In this regard, it is important to stress that extrapolation, simulation and projection cannot offer accurate predictions of future events when the underlying mechanisms (and the contexts involved) are poorly understood and subject to change. Given the importance of understanding such mechanisms/contexts, and the limited opportunity for experimentation during outbreaks of novel diseases, the use of multivariable statistical models to estimate the strength/direction of potentially causal relationships between two variables (and the biases incurred through their misapplication/misinterpretation) warrants particular attention. 
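As a minimal illustration of the ‘projection’ of dynamic processes mentioned above (and not the article's own model), a deterministic SIR compartmental model can be stepped forward in time to sketch an outbreak's trajectory. All parameter values here (transmission rate, recovery rate, population size) are hypothetical placeholders:

```python
def sir_projection(beta=0.3, gamma=0.1, n=1000.0, i0=1.0, days=200):
    """Project a hypothetical outbreak with a deterministic SIR model,
    stepped forward with simple one-day Euler updates."""
    s, i, r = n - i0, i0, 0.0
    history = []
    for _ in range(days):
        new_infections = beta * s * i / n   # transmission term
        new_recoveries = gamma * i          # recovery term
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history

traj = sir_projection()
peak_infected = max(i for _, i, _ in traj)
```

Such a projection is only as good as its assumptions: if the mechanisms behind `beta` and `gamma` are poorly understood or change over time (as the abstract stresses), the projected trajectory can be badly wrong even though the arithmetic is exact.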
Such models must be carefully designed to address: ‘selection-collider bias’, ‘unadjusted confounding bias’ and ‘inferential mediator adjustment bias’ – all of which can introduce effects capable of enhancing, masking or reversing the estimated (true) causal relationship between the two variables examined. Selection-collider bias occurs when these two variables independently cause a third (the ‘collider’), and when this collider determines/reflects the basis for selection in the analysis. It is likely to affect all incompletely representative samples, although its effects will be most pronounced wherever selection is constrained (e.g. analyses focusing on infected/hospitalised individuals). Unadjusted confounding bias disrupts the estimated (true) causal relationship between two variables when: these share one (or more) common cause(s); and when the effects of these causes have not been adjusted for in the analyses (e.g. whenever confounders are unknown/unmeasured). Inferentially similar biases can occur when: one (or more) variable(s) (or ‘mediators’) fall on the causal path between the two variables examined (i.e. when such mediators are caused by one of the variables and are causes of the other); and when these mediators are adjusted for in the analysis. Such adjustment is commonplace when: mediators are mistaken for confounders; prediction models are mistakenly repurposed for causal inference; or mediator adjustment is used to estimate direct and indirect causal relationships (in a mistaken attempt at ‘mediation analysis’). These three biases are central to ongoing and unresolved epistemological tensions within epidemiology. All have substantive implications for our understanding of COVID-19, and the future application of artificial intelligence to ‘data-driven’ modelling of similar phenomena. 
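The selection-collider mechanism described above can be demonstrated with a small simulation (a hedged sketch with hypothetical variables, not an analysis from the article): two independent causes of a collider become spuriously associated once the analysis is restricted to cases selected on that collider, e.g. hospitalised individuals:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two independent causes: there is no true association between them.
x = rng.standard_normal(n)
y = rng.standard_normal(n)

# A collider caused by both (e.g. 'being hospitalised'), plus noise.
collider = x + y + 0.5 * rng.standard_normal(n)

# Restricting the analysis to units selected on the collider
# induces a spurious (here negative) association between x and y.
selected = collider > 1.0
r_full = np.corrcoef(x, y)[0, 1]              # near zero
r_selected = np.corrcoef(x[selected], y[selected])[0, 1]  # clearly negative
```

The induced association can enhance, mask or even reverse a true effect, which is why the abstract warns that any incompletely representative sample (not only hospital-based ones) is vulnerable to this bias.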
Nonetheless, competently applied and carefully interpreted, multivariable statistical models may yet provide sufficient insight into mechanisms and contexts to permit more accurate projections of future disease outbreaks.

1. These biases, and the terminology involved, may be challenging to readers who are unfamiliar with the use of causal path diagrams (such as Directed Acyclic Graphs; DAGs), which have been instrumental in identifying the different roles that variables can play in causal processes (whether as ‘exposures’, ‘outcomes’, ‘confounders’, ‘mediators’, ‘colliders’, ‘competing exposures’ or ‘consequences of the outcome’) and in revealing hitherto under-acknowledged sources of bias in analyses designed to support causal inference. For what we hope are accessible introductions to DAGs (and how [not] to use these), please see: Ellison (2020); and Tennant et al. (2019). For more technical detail on ‘collider bias’, ‘unadjusted confounding bias’ and ‘inferential mediator adjustment bias’ (and its related concern, the ‘Table 2 fallacy’), please refer to: Cook and Ranstam (2017); Munafò et al. (2018); Tennant et al. (2017); VanderWeele and Arah (2011); and Westreich and Greenland (2013).
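The mediator-adjustment bias discussed above can likewise be sketched numerically (all coefficients and variable names here are hypothetical): when a mediator on the causal path is adjusted for, the regression recovers only the direct effect, so it underestimates the total causal effect it is often mistaken for:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Causal chain x -> m -> y, plus a direct path x -> y.
x = rng.standard_normal(n)
m = 0.8 * x + rng.standard_normal(n)            # mediator caused by x
y = 0.5 * x + 0.6 * m + rng.standard_normal(n)  # total effect of x: 0.5 + 0.8*0.6 = 0.98

def ols(design, target):
    """Least-squares coefficients with an intercept column."""
    a = np.column_stack([np.ones(len(target))] + design)
    return np.linalg.lstsq(a, target, rcond=None)[0]

total = ols([x], y)[1]      # ~0.98: the unadjusted model recovers the total effect
direct = ols([x, m], y)[1]  # ~0.50: adjusting for the mediator removes the indirect path
```

This is the mechanism behind the ‘Table 2 fallacy’ the footnote cites: coefficients from a single mutually adjusted model do not all estimate the same kind of (total) effect, so reporting them side by side invites misinterpretation.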
format Article in Journal/Newspaper
author Ellison, George
title COVID-19 and the epistemology of epidemiological models at the dawn of AI
publisher Taylor and Francis
publishDate 2020
url http://clok.uclan.ac.uk/35068/
http://clok.uclan.ac.uk/35068/1/200805%20COVID-19%20Commentary%20-%20Manuscript%20-%20PrePrints.org.pdf
http://clok.uclan.ac.uk/35068/2/200804%20COVID-19%20Commentary%20-%20Supplementary%20Material%20-%20PrePrints.org.pdf
https://doi.org/10.1080/03014460.2020.1839132
op_relation http://clok.uclan.ac.uk/35068/1/200805%20COVID-19%20Commentary%20-%20Manuscript%20-%20PrePrints.org.pdf
http://clok.uclan.ac.uk/35068/2/200804%20COVID-19%20Commentary%20-%20Supplementary%20Material%20-%20PrePrints.org.pdf
Ellison, George orcid:0000-0001-8914-6812 (2020) COVID-19 and the epistemology of epidemiological models at the dawn of AI. Annals of Human Biology, 47 (6). pp. 506-513. ISSN 0301-4460
doi:10.1080/03014460.2020.1839132
op_rights cc_by_nc_nd_4
op_rightsnorm CC-BY-NC-ND
op_doi https://doi.org/10.1080/03014460.2020.1839132
container_title Annals of Human Biology
container_volume 47
container_issue 6
container_start_page 506
op_container_end_page 513