Quantifying similarity in animal vocal sequences : which metric performs best?

E.C.G. is supported by a Newton International Fellowship. Part of this work was conducted while E.C.G. was supported by a National Research Council (National Academy of Sciences) Postdoctoral Fellowship at the National Marine Mammal Laboratory, AFSC, NMFS, NOAA. 1. Many animals communicate using sequ...

Bibliographic Details
Published in: Methods in Ecology and Evolution
Main Authors: Kershenbaum, Arik; Garland, Ellen Clare
Other Authors: The Royal Society; University of St Andrews. School of Biology; University of St Andrews. Sea Mammal Research Unit; University of St Andrews. Centre for Social Learning & Cognitive Evolution; University of St Andrews. Centre for Biological Diversity
Format: Article in Journal/Newspaper
Language: English
Published: 2016
Subjects: Sequence; Animal communication; Vocal; Edit distance; Markov; Stochastic processes; QH301 Biology
Online Access: https://hdl.handle.net/10023/9266
https://doi.org/10.1111/2041-210X.12433
http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12433/suppinfo
id ftstandrewserep:oai:research-repository.st-andrews.ac.uk:10023/9266
record_format openpolar
institution Open Polar
collection University of St Andrews: Digital Research Repository
op_collection_id ftstandrewserep
language English
topic Sequence
Animal communication
Vocal
Edit distance
Markov
Stochastic processes
QH301 Biology
QH301
description E.C.G. is supported by a Newton International Fellowship. Part of this work was conducted while E.C.G. was supported by a National Research Council (National Academy of Sciences) Postdoctoral Fellowship at the National Marine Mammal Laboratory, AFSC, NMFS, NOAA.
1. Many animals communicate using sequences of discrete acoustic elements which can be complex, vary in their degree of stereotypy, and are potentially open-ended. Variation in sequences can provide important ecological, behavioural, or evolutionary information about the structure and connectivity of populations, mechanisms for vocal cultural evolution, and the underlying drivers responsible for these processes. Various mathematical techniques have been used to form a realistic approximation of sequence similarity for such tasks.
2. Here, we use both simulated and empirical datasets from animal vocal sequences (rock hyrax, Procavia capensis; humpback whale, Megaptera novaeangliae; bottlenose dolphin, Tursiops truncatus; and Carolina chickadee, Poecile carolinensis) to test which of eight sequence analysis metrics are more likely to reconstruct the information encoded in the sequences, and to test the fidelity of estimation of model parameters, when the sequences are assumed to conform to particular statistical models.
3. Results from the simulated data indicated that multiple metrics were equally successful in reconstructing the information encoded in the sequences of simulated individuals (Markov chains, n-gram models, repeat distribution, and edit distance), and data generated by different stochastic processes (entropy rate and n-grams). However, the string edit (Levenshtein) distance performed consistently and significantly better than all other tested metrics (including entropy, Markov chains, n-grams, mutual information) for all empirical datasets, despite being less commonly used in the field of animal acoustic communication.
4. The Levenshtein distance metric provides a robust analytical approach that should be considered in the comparison of ...
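For readers unfamiliar with the string edit (Levenshtein) distance named in the abstract, the following is a minimal Python sketch of the standard dynamic-programming definition applied to sequences of discrete element labels. It is an illustration only, not the authors' code: the function names, the example label sequences `song_a` and `song_b`, and the length normalisation in `normalized_levenshtein` are all assumptions introduced here and are not taken from the article.

```python
from typing import Hashable, Sequence


def levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the shorter sequence in the inner loop
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        current = [i]  # turning the first i elements of a into an empty sequence
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            current.append(min(
                previous[j] + 1,         # delete x
                current[j - 1] + 1,      # insert y
                previous[j - 1] + cost,  # substitute x with y (free if equal)
            ))
        previous = current
    return previous[-1]


def normalized_levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Edit distance divided by the longer length (an assumed normalisation)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


# Hypothetical sequences of labelled acoustic elements (not data from the study).
song_a = ["A", "B", "B", "C", "D"]
song_b = ["A", "B", "C", "C", "D", "E"]

print(levenshtein(song_a, song_b))             # one substitution plus one insertion
print(normalized_levenshtein(song_a, song_b))
```

Because the function compares elements only for equality, it works on any sequence of hashable labels (strings of characters, lists of unit names, or integer codes), which is the form in which vocal sequences are usually encoded.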
author2 The Royal Society
University of St Andrews. School of Biology
University of St Andrews. Sea Mammal Research Unit
University of St Andrews. Centre for Social Learning & Cognitive Evolution
University of St Andrews. Centre for Biological Diversity
format Article in Journal/Newspaper
author Kershenbaum, Arik
Garland, Ellen Clare
title Quantifying similarity in animal vocal sequences : which metric performs best?
publishDate 2016
url https://hdl.handle.net/10023/9266
https://doi.org/10.1111/2041-210X.12433
http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12433/suppinfo
genre Humpback Whale
Megaptera novaeangliae
op_relation Methods in Ecology and Evolution
198207159
1ba9eb03-f810-4194-b759-5219a42d9bc7
84958876893
000368517700009
Kershenbaum , A & Garland , E C 2015 , ' Quantifying similarity in animal vocal sequences : which metric performs best? ' , Methods in Ecology and Evolution , vol. 6 , no. 12 , pp. 1452-1461 . https://doi.org/10.1111/2041-210X.12433
2041-210X
ORCID: /0000-0002-8240-1267/work/49580217
https://hdl.handle.net/10023/9266
doi:10.1111/2041-210X.12433
http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12433/suppinfo
NF140667
op_doi https://doi.org/10.1111/2041-210X.12433
container_title Methods in Ecology and Evolution
container_volume 6
container_issue 12
container_start_page 1452
op_container_end_page 1461
_version_ 1797584378234142720