ORCA-CLEAN: A deep denoising toolkit for killer whale communication

Bibliographic Details
Published in: Interspeech 2020
Main Authors: Bergler, Christian; Smeele, Simeon; Schmitt, Manuel; Maier, Andreas; Barth, Volker; Nöth, Elmar
Format: Article in Journal/Newspaper
Language: English
Published: 2020
Online Access: https://pure.au.dk/portal/da/publications/orcaclean(be2a5cfe-09b8-4fc5-ac6c-ed83a71b92f5).html
https://doi.org/10.21437/Interspeech.2020-1316
http://www.scopus.com/inward/record.url?scp=85098169356&partnerID=8YFLogxK
Description
Summary: In bioacoustics, passive acoustic monitoring of animals living in the wild, both on land and underwater, leads to large data archives characterized by a strong imbalance between recorded animal sounds and ambient noise. Bioacoustic datasets suffer severely from this wide variety of noise, caused by a multitude of external influences and environmental conditions that change over the years. This leads to significant problems in the analysis and interpretation of animal vocalizations by biologists and machine-learning algorithms. To counteract such noise diversity, it is essential to develop a denoising procedure that enables automated, efficient, and robust data enhancement. However, a fundamental problem is the lack of clean (denoised) ground-truth samples. The current work is the first to present a fully automated deep denoising approach for bioacoustics that does not require any clean ground truth, together with one of the largest data archives recorded on killer whales (Orcinus orca), the Orchive. To this end, an approach originally developed for image restoration, known as Noise2Noise (N2N), was transferred to the field of bioacoustics and extended by using automatic, machine-generated binary masks as an additional network attention mechanism. Besides a significant cross-domain signal enhancement, our previous results on supervised orca/noise segmentation and orca call type identification were outperformed by applying ORCA-CLEAN as an additional data preprocessing/enhancement step.