Discrimination of fish populations using parasites: Random Forests on a ‘predictable’ host-parasite system
Published in: | Parasitology |
---|---|
Main Authors: | , , , , , |
Format: | Article in Journal/Newspaper |
Language: | English |
Published: | 2010 |
Subjects: | |
Online Access: | https://doi.org/10.1017/S0031182010000739 http://hdl.handle.net/11104/0192704 |
Summary: | We examine how spatial scale and temporal variation affect model generality when building predictive models for fish assignment by applying a new data mining approach, Random Forests (RF), to variable biological markers (parasite community data). Models were implemented for a fish host-parasite system sampled along the Mediterranean and Atlantic coasts of Spain. The main results are that (i) RF are well suited to multiclass population assignment using parasite communities in non-migratory fish; (ii) RF provide an efficient means of model cross-validation on the baseline data, which allows sample size limitations in parasite tag studies to be tackled effectively; (iii) the performance of RF depends on the complexity and spatial extent/configuration of the problem; and (iv) the development of predictive models is strongly influenced by seasonal change, which stresses the importance of both temporal replication and model validation in parasite tagging studies. |