Using deep learning to identify recent positive selection in malaria parasite sequence data
Abstract. Background: Malaria, caused by Plasmodium parasites, is a major global public health problem. To assist an understanding of malaria pathogenesis, including drug resistance, there is a need for the timely detection of underlying genetic mutations and their spread. With the increasing use of w...
Published in: | Malaria Journal |
---|---|
Main Authors: | Wouter Deelder, Ernest Diez Benavente, Jody Phelan, Emilia Manko, Susana Campino, Luigi Palla, Taane G. Clark |
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | BMC 2021 |
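Subjects: | Plasmodium falciparum; Plasmodium vivax; Population genomics; Drug resistance; Machine learning; Positive selection; Arctic medicine. Tropical medicine RC955-962; Infectious and parasitic diseases RC109-216 |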
Online Access: | https://doi.org/10.1186/s12936-021-03788-x https://doaj.org/article/932d1e5320274ee9b76ff0a36efc77a6 |
id |
ftdoajarticles:oai:doaj.org/article:932d1e5320274ee9b76ff0a36efc77a6 |
---|---|
record_format |
openpolar |
spelling |
ftdoajarticles:oai:doaj.org/article:932d1e5320274ee9b76ff0a36efc77a6 2023-05-15T15:12:31+02:00 Using deep learning to identify recent positive selection in malaria parasite sequence data Wouter Deelder Ernest Diez Benavente Jody Phelan Emilia Manko Susana Campino Luigi Palla Taane G. Clark 2021-06-01T00:00:00Z https://doi.org/10.1186/s12936-021-03788-x https://doaj.org/article/932d1e5320274ee9b76ff0a36efc77a6 EN eng BMC https://doi.org/10.1186/s12936-021-03788-x https://doaj.org/toc/1475-2875 doi:10.1186/s12936-021-03788-x 1475-2875 https://doaj.org/article/932d1e5320274ee9b76ff0a36efc77a6 Malaria Journal, Vol 20, Iss 1, Pp 1-9 (2021) Plasmodium falciparum Plasmodium vivax Population genomics Drug resistance Machine learning Positive selection Arctic medicine. Tropical medicine RC955-962 Infectious and parasitic diseases RC109-216 article 2021 ftdoajarticles https://doi.org/10.1186/s12936-021-03788-x 2022-12-31T12:16:38Z Abstract Background Malaria, caused by Plasmodium parasites, is a major global public health problem. To assist an understanding of malaria pathogenesis, including drug resistance, there is a need for the timely detection of underlying genetic mutations and their spread. With the increasing use of whole-genome sequencing (WGS) of Plasmodium DNA, the potential of deep learning models to detect loci under recent positive selection, historically signals of drug resistance, was evaluated. Methods A deep learning-based approach (called “DeepSweep”) was developed, which can be trained on haplotypic images from genetic regions with known sweeps, to identify loci under positive selection. DeepSweep software is available from https://github.com/WDee/Deepsweep . Results Using simulated genomic data, DeepSweep could detect recent sweeps with high predictive accuracy (areas under ROC curve > 0.95). 
DeepSweep was applied to Plasmodium falciparum (n = 1125; genome size 23 Mbp) and Plasmodium vivax (n = 368; genome size 29 Mbp) WGS data, and the genes identified overlapped with two established extended haplotype homozygosity methods (within-population iHS, across-population Rsb) (~ 60–75% overlap of hits at P < 0.0001). DeepSweep hits included regions proximal to known drug resistance loci for both P. falciparum (e.g. pfcrt, pfdhps and pfmdr1) and P. vivax (e.g. pvmrp1). Conclusion The deep learning approach can detect positive selection signatures in malaria parasite WGS data. Further, as the approach is generalizable, it may be trained to detect other types of selection. With the ability to rapidly generate WGS data at low cost, machine learning approaches (e.g. DeepSweep) have the potential to assist parasite genome-based surveillance and inform malaria control decision-making. Article in Journal/Newspaper Arctic Directory of Open Access Journals: DOAJ Articles Arctic Malaria Journal 20 1 |
institution |
Open Polar |
collection |
Directory of Open Access Journals: DOAJ Articles |
op_collection_id |
ftdoajarticles |
language |
English |
topic |
Plasmodium falciparum; Plasmodium vivax; Population genomics; Drug resistance; Machine learning; Positive selection; Arctic medicine. Tropical medicine RC955-962; Infectious and parasitic diseases RC109-216 |
description |
Abstract. Background: Malaria, caused by Plasmodium parasites, is a major global public health problem. To assist an understanding of malaria pathogenesis, including drug resistance, there is a need for the timely detection of underlying genetic mutations and their spread. With the increasing use of whole-genome sequencing (WGS) of Plasmodium DNA, the potential of deep learning models to detect loci under recent positive selection, historically signals of drug resistance, was evaluated. Methods: A deep learning-based approach (called “DeepSweep”) was developed, which can be trained on haplotypic images from genetic regions with known sweeps, to identify loci under positive selection. DeepSweep software is available from https://github.com/WDee/Deepsweep. Results: Using simulated genomic data, DeepSweep could detect recent sweeps with high predictive accuracy (areas under ROC curve > 0.95). DeepSweep was applied to Plasmodium falciparum (n = 1125; genome size 23 Mbp) and Plasmodium vivax (n = 368; genome size 29 Mbp) WGS data, and the genes identified overlapped with two established extended haplotype homozygosity methods (within-population iHS, across-population Rsb) (~60–75% overlap of hits at P < 0.0001). DeepSweep hits included regions proximal to known drug resistance loci for both P. falciparum (e.g. pfcrt, pfdhps and pfmdr1) and P. vivax (e.g. pvmrp1). Conclusion: The deep learning approach can detect positive selection signatures in malaria parasite WGS data. Further, as the approach is generalizable, it may be trained to detect other types of selection. With the ability to rapidly generate WGS data at low cost, machine learning approaches (e.g. DeepSweep) have the potential to assist parasite genome-based surveillance and inform malaria control decision-making. |
format |
Article in Journal/Newspaper |
author |
Wouter Deelder, Ernest Diez Benavente, Jody Phelan, Emilia Manko, Susana Campino, Luigi Palla, Taane G. Clark |
author_sort |
Wouter Deelder |
title |
Using deep learning to identify recent positive selection in malaria parasite sequence data |
publisher |
BMC |
publishDate |
2021 |
url |
https://doi.org/10.1186/s12936-021-03788-x https://doaj.org/article/932d1e5320274ee9b76ff0a36efc77a6 |
geographic |
Arctic |
genre |
Arctic |
op_source |
Malaria Journal, Vol 20, Iss 1, Pp 1-9 (2021) |
op_relation |
https://doi.org/10.1186/s12936-021-03788-x https://doaj.org/toc/1475-2875 doi:10.1186/s12936-021-03788-x 1475-2875 https://doaj.org/article/932d1e5320274ee9b76ff0a36efc77a6 |
op_doi |
https://doi.org/10.1186/s12936-021-03788-x |
container_title |
Malaria Journal |
container_volume |
20 |
container_issue |
1 |
_version_ |
1766343196495839232 |