The Orchive: A system for semi-automatic annotation and analysis of a large collection of bioacoustic recordings


Bibliographic Details
Main Author: Ness, Steven
Other Authors: Tzanetakis, George
Format: Thesis
Language: English
Published: 2013
Subjects:
Online Access: http://hdl.handle.net/1828/5109
Description
Summary: Advances in computer technology have enabled the collection, digitization and automated processing of huge archives of bioacoustic sound. Many of the tools previously used in bioacoustics work well with small to medium-sized audio collections, but are challenged when processing large collections of tens of terabytes to petabyte size. This thesis presents a system that helps researchers listen to, view, and annotate these audio recordings and run advanced audio feature extraction and machine learning algorithms on them. The system is designed to scale to petabyte size. In addition, it allows citizen scientists to participate in annotating these large archives using a casual game metaphor. The thesis evaluates the use of this system to annotate a large audio archive called the Orchive, which contains over 20,000 hours of orca vocalizations collected over the course of 30 years and represents one of the largest continuous collections of bioacoustic recordings in the world. It also evaluates the effectiveness of this semi-automatic approach for deriving knowledge from these recordings and presents results showing the utility of the system.