Prediction of dengue incidence using search query surveillance.

Background: The use of internet search data has been demonstrated to be effective at predicting influenza incidence. This approach may be more successful for dengue, which has large variation in annual incidence and a more distinctive clinical presentation and mode of transmission. Methods: We gathered...

Bibliographic Details
Published in: PLoS Neglected Tropical Diseases
Main Authors: Benjamin M Althouse, Yih Yng Ng, Derek A T Cummings
Format: Article in Journal/Newspaper
Language: English
Published: Public Library of Science (PLoS) 2011
Subjects: Arctic medicine. Tropical medicine; Public aspects of medicine
Online Access: https://doi.org/10.1371/journal.pntd.0001258
https://doaj.org/article/9a90bbf1c9cb4598894b346a7cc837f8
id ftdoajarticles:oai:doaj.org/article:9a90bbf1c9cb4598894b346a7cc837f8
record_format openpolar
institution Open Polar
collection Directory of Open Access Journals: DOAJ Articles
op_collection_id ftdoajarticles
language English
topic Arctic medicine. Tropical medicine
RC955-962
Public aspects of medicine
RA1-1270
description Background: The use of internet search data has been demonstrated to be effective at predicting influenza incidence. This approach may be more successful for dengue, which has large variation in annual incidence and a more distinctive clinical presentation and mode of transmission.
Methods: We gathered freely available dengue incidence data from Singapore (weekly incidence, 2004-2011) and Bangkok (monthly incidence, 2004-2011). Internet search data for the same period were downloaded from Google Insights for Search. Search terms were chosen to reflect three categories of dengue-related search: nomenclature, signs/symptoms, and treatment. We compared three models to predict incidence: a step-down linear regression, generalized boosted regression, and negative binomial regression. Logistic regression and Support Vector Machine (SVM) models were used to predict a binary outcome defined by whether dengue incidence exceeded a chosen threshold. Incidence prediction models were assessed using r² and the Pearson correlation between predicted and observed dengue incidence. Logistic and SVM model performance was assessed by the area under the receiver operating characteristic curve (AUC). Models were validated using multiple cross-validation techniques.
Results: The linear model selected by AIC step-down was superior to the other models considered. In Bangkok, the model has r² = 0.943 and a correlation of 0.869 between fitted and observed values. In Singapore, the model has r² = 0.948 and a correlation of 0.931. In both Singapore and Bangkok, SVM models outperformed logistic regression in predicting periods of high incidence. The AUC for the SVM models using the 75th percentile cutoff is 0.906 in Singapore and 0.960 in Bangkok.
Conclusions: Internet search terms predict incidence and periods of large incidence of dengue with high accuracy and may prove useful in areas with underdeveloped surveillance systems. The methods presented here use freely available data and analysis tools and can be readily adapted to other settings.
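The three assessment measures named in the abstract (r², Pearson correlation, and AUC for a binary "high incidence" outcome at the 75th percentile) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the incidence and prediction values are invented for illustration, the function names are hypothetical, r² is assumed here to be the coefficient of determination between observed and predicted values, and the AUC is computed directly on the predicted values rather than on SVM or logistic regression scores.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (an assumed definition)."""
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count half)."""
    pos = [s for flag, s in zip(labels, scores) if flag]
    neg = [s for flag, s in zip(labels, scores) if not flag]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example data: observed case counts and model predictions per period.
observed = [12, 15, 30, 55, 80, 60, 25, 14]
predicted = [10, 18, 28, 50, 85, 58, 30, 12]

# Binary outcome: does observed incidence exceed its 75th percentile?
threshold = statistics.quantiles(observed, n=4)[2]
high = [o > threshold for o in observed]

print("r^2:", round(r_squared(observed, predicted), 3))
print("Pearson r:", round(pearson(observed, predicted), 3))
print("AUC at 75th percentile cutoff:", auc(high, predicted))
```

In practice the classifier scores would come from a fitted model evaluated under cross-validation, as the abstract describes; the sketch only shows how the reported quantities are defined.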
format Article in Journal/Newspaper
author Benjamin M Althouse
Yih Yng Ng
Derek A T Cummings
title Prediction of dengue incidence using search query surveillance.
publisher Public Library of Science (PLoS)
publishDate 2011
url https://doi.org/10.1371/journal.pntd.0001258
https://doaj.org/article/9a90bbf1c9cb4598894b346a7cc837f8
geographic Arctic
genre Arctic
op_source PLoS Neglected Tropical Diseases, Vol 5, Iss 8, p e1258 (2011)
op_relation https://www.ncbi.nlm.nih.gov/pmc/articles/pmid/21829744/?tool=EBI
https://doaj.org/toc/1935-2727
https://doaj.org/toc/1935-2735
1935-2727
1935-2735
doi:10.1371/journal.pntd.0001258
https://doaj.org/article/9a90bbf1c9cb4598894b346a7cc837f8
op_doi https://doi.org/10.1371/journal.pntd.0001258
container_title PLoS Neglected Tropical Diseases
container_volume 5
container_issue 8
container_start_page e1258
_version_ 1766347116897107968