Validation of a province-wide commercial food store dataset in a heterogeneous predominantly rural food environment
Published in: | Public Health Nutrition |
---|---|
Main Authors: | , , |
Format: | Text |
Language: | English |
Published: | Cambridge University Press 2020 |
Online Access: | http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10200544/ http://www.ncbi.nlm.nih.gov/pubmed/32295655 https://doi.org/10.1017/S1368980019004506 |
Summary: | OBJECTIVE: Commercially available business (CAB) datasets for food environments have been investigated for error in large urban contexts and some rural areas, but relatively little literature reports error across regions of variable rurality. The objective of the current study was to assess the validity of a CAB dataset using a government dataset at the provincial scale. DESIGN: A ground-truthed dataset provided by the government of Newfoundland and Labrador (NL) was used to assess a popular commercial dataset. Concordance, sensitivity, positive-predictive value (PPV) and geocoding errors were calculated. Measures were stratified by store type and rurality to investigate any association between these variables and database accuracy. SETTING: NL, Canada. PARTICIPANTS: The current analysis used store-level (ecological) data. RESULTS: Of 1125 stores, 380 existed in both datasets and were considered true positives. The mean positional error between a ground-truthed and test point was 17·72 km. When compared with the provincial dataset of businesses, grocery stores had the greatest agreement (sensitivity = 0·64, PPV = 0·60 and concordance = 0·45). Gas stations had the least agreement (sensitivity = 0·26, PPV = 0·32 and concordance = 0·17). Only 4 % of commercial data points in rural areas matched every criterion examined. CONCLUSIONS: The commercial dataset exhibits a low level of agreement with the ground-truthed provincial data. Retailers in rural areas or in the gas station category were particularly affected by misclassification and/or geocoding errors. Taken together, the commercial dataset is differentially representative of the ground-truthed reality depending on store type and rurality/urbanity. |
---|
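The agreement measures named in the summary can be illustrated with a minimal Python sketch. It assumes the definitions commonly used in food-environment validation work: sensitivity = TP / (TP + FN), PPV = TP / (TP + FP) and concordance = TP / (TP + FP + FN), where a true positive (TP) is a store found in both datasets. The record itself does not state these formulas, so they are an assumption here, and the names `MatchCounts` and `concordance_from_rates` are hypothetical. The counts in the usage lines are illustrative only and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchCounts:
    """Counts from comparing a test dataset against a reference dataset."""
    true_positive: int   # store present in both datasets
    false_positive: int  # store present only in the commercial (test) dataset
    false_negative: int  # store present only in the reference dataset


def sensitivity(c: MatchCounts) -> float:
    """Share of reference stores that the test dataset also lists."""
    return c.true_positive / (c.true_positive + c.false_negative)


def ppv(c: MatchCounts) -> float:
    """Share of test-dataset stores confirmed by the reference dataset."""
    return c.true_positive / (c.true_positive + c.false_positive)


def concordance(c: MatchCounts) -> float:
    """Share of all stores seen in either dataset that appear in both."""
    total = c.true_positive + c.false_positive + c.false_negative
    return c.true_positive / total


def concordance_from_rates(sens: float, ppv_value: float) -> float:
    """Concordance implied by sensitivity and PPV under the assumed formulas.

    Follows from 1/concordance = 1/sensitivity + 1/PPV - 1.
    """
    return 1.0 / (1.0 / sens + 1.0 / ppv_value - 1.0)


# Illustrative counts only (not from the study).
example = MatchCounts(true_positive=50, false_positive=30, false_negative=20)
print(round(sensitivity(example), 3), round(ppv(example), 3), concordance(example))

# Consistency check against the rounded values reported in the summary.
print(round(concordance_from_rates(0.64, 0.60), 2))  # grocery stores
print(round(concordance_from_rates(0.26, 0.32), 2))  # gas stations
```

Under these assumed definitions, the sensitivity and PPV reported for grocery stores (0·64, 0·60) and gas stations (0·26, 0·32) imply concordance values of about 0·45 and 0·17, which agrees with the figures in the summary.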