Using mid-infrared spectroscopy and supervised machine-learning to identify vertebrate blood meals in the malaria vector, Anopheles arabiensis

Bibliographic Details
Published in: Malaria Journal
Main Authors: Emmanuel P. Mwanga, Salum A. Mapua, Doreen J. Siria, Halfan S. Ngowo, Francis Nangacha, Joseph Mgando, Francesco Baldini, Mario González Jiménez, Heather M. Ferguson, Klaas Wynne, Prashanth Selvaraj, Simon A. Babayan, Fredros O. Okumu
Format: Article in Journal/Newspaper
Language: English
Published: BMC 2019
Online Access: https://doi.org/10.1186/s12936-019-2822-y
https://doaj.org/article/14ee3ff3f4a74832afb0b458dbd58b30
Description
Summary: Abstract
Background: The propensity of different Anopheles mosquitoes to bite humans instead of other vertebrates influences their capacity to transmit pathogens to humans. Unfortunately, determining the proportion of mosquitoes that have fed on humans, i.e. the Human Blood Index (HBI), currently requires expensive and time-consuming laboratory procedures involving enzyme-linked immunosorbent assays (ELISA) or polymerase chain reactions (PCR). Here, mid-infrared (MIR) spectroscopy and supervised machine learning are used to accurately distinguish between vertebrate blood meals in the guts of malaria mosquitoes, without any molecular techniques.
Methods: Laboratory-reared Anopheles arabiensis females were fed on humans, chickens, goats or bovines, then held for 6 to 8 h, after which they were killed and preserved in silica. The sample size was 2000 mosquitoes (500 per host species). Five individuals of each host species were enrolled to ensure genotype variability, and 100 mosquitoes were fed on each. Dried mosquito abdomens were individually scanned using an attenuated total reflection-Fourier transform infrared (ATR-FTIR) spectrometer to obtain high-resolution MIR spectra (4000 cm⁻¹ to 400 cm⁻¹). The spectral data were cleaned to compensate for atmospheric water and CO₂ interference bands using Bruker-OPUS software, then transferred to Python™ for supervised machine learning to predict host species. Seven classification algorithms were trained using 90% of the spectra through several combinations of 75–25% data splits. The best-performing model was used to predict the identities of the remaining 10% validation spectra, which had not been used for model training or testing.
Results: The logistic regression (LR) model achieved the highest accuracy, correctly predicting true vertebrate blood meal sources with an overall accuracy of 98.4%. The model correctly identified 96% of goat blood meals, 97% of bovine blood meals, 100% of chicken blood meals and 100% of human blood meals.
Three percent of bovine blood meals were misclassified as goat, and 2% ...
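
Illustrative sketch (not part of the original record): the evaluation protocol described under Methods, in which 10% of the spectra are held out, candidate classifiers are compared on repeated 75–25% splits of the remaining 90%, and the best candidate is then scored on the held-out 10%, might look roughly like the following in plain Python. Everything here is an assumption made for illustration: the spectra are synthetic toy vectors, the two candidate models (`NearestCentroid`, `MajorityClass`) are simple stand-ins for the seven algorithms the authors compared, and the names `make_synthetic_spectra`, `split` and `select_and_validate` are hypothetical. Refitting the winner on the full 90% before validation is likewise an assumption; the abstract does not say how the final model was fitted.

```python
import random
from collections import Counter, defaultdict
from statistics import mean

HOSTS = ["human", "chicken", "goat", "bovine"]

def make_synthetic_spectra(n_per_host=50, n_features=20, seed=0):
    """Toy stand-in for MIR spectra: one noisy feature vector per mosquito."""
    rng = random.Random(seed)
    data = []
    for k, host in enumerate(HOSTS):
        for _ in range(n_per_host):
            # Each host gets a different mean level so the classes are separable.
            data.append(([rng.gauss(k, 0.5) for _ in range(n_features)], host))
    rng.shuffle(data)
    return data

def split(data, test_fraction, rng):
    """Shuffle a copy of `data` and return (train, test)."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

class NearestCentroid:
    """Assign each sample to the class whose mean vector is closest."""
    def fit(self, train):
        groups = defaultdict(list)
        for x, y in train:
            groups[y].append(x)
        self.centroids = {y: [mean(col) for col in zip(*xs)]
                          for y, xs in groups.items()}
        return self

    def predict(self, x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=sq_dist)

class MajorityClass:
    """Baseline: always predict the most common training label."""
    def fit(self, train):
        self.label = Counter(y for _, y in train).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.label

def accuracy(model, data):
    return sum(model.predict(x) == y for x, y in data) / len(data)

def select_and_validate(data, candidates, n_splits=5, seed=1):
    rng = random.Random(seed)
    # 90% is used for model comparison; 10% is held out for final validation.
    develop, validation = split(data, 0.10, rng)
    scores = {name: [] for name in candidates}
    for _ in range(n_splits):
        # Repeated 75-25% train/test splits of the 90% portion.
        train, test = split(develop, 0.25, rng)
        for name, make_model in candidates.items():
            scores[name].append(accuracy(make_model().fit(train), test))
    mean_scores = {name: mean(s) for name, s in scores.items()}
    best = max(mean_scores, key=mean_scores.get)
    # Assumption: refit the winner on all of the 90% before validating.
    final_model = candidates[best]().fit(develop)
    return best, mean_scores, accuracy(final_model, validation)

candidates = {"nearest_centroid": NearestCentroid,
              "majority_class": MajorityClass}
best_name, cv_scores, validation_accuracy = select_and_validate(
    make_synthetic_spectra(), candidates)
print(best_name, cv_scores, validation_accuracy)
```

The point of the sketch is the separation of roles: the held-out 10% never influences which model is chosen, so its accuracy is an estimate of performance on unseen spectra. The numbers it prints describe only the toy data and say nothing about the study's results.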