Spatial, Temporal and Spectral Multiresolution Analysis for the INTERSPEECH 2019 ComParE Challenge

Bibliographic Details
Published in: Interspeech 2019
Main Authors: Caraty, Marie-José, Montacié, Claude
Other Authors: Équipe Linguistique computationnelle (STIH-LC), Sens, Texte, Informatique, Histoire (STIH), Sorbonne Université (SU)-Sorbonne Université (SU)
Format: Conference Object
Language: English
Published: HAL CCSD 2019
Online Access: https://hal.science/hal-03946087
https://doi.org/10.21437/Interspeech.2019-1693
Description
Summary: The INTERSPEECH 2019 Orca Activity Challenge consists of detecting orca sounds in underwater audio signals. Orcas produce a wide variety of sounds, categorized as clicks, whistles, and pulsed calls. Clicks are used for echolocation; whistles and pulsed calls serve as social signals. Experiments were conducted on the DeepAL Fieldwork Data (DLFD). Underwater sounds were recorded in northern British Columbia by a hydrophone array, and the recordings were labeled by marine biologists as Orca sounds or Noise. We investigated multiresolution analysis at the three main relevant acoustic levels: spatial, temporal, and spectral. For this purpose, we studied beamforming array analysis, multitemporal resolution, and multilevel wavelet decomposition. At the spatial level, a beamforming algorithm was used to denoise the underwater audio signal. At the temporal level, two sets of multitemporal three-level features were extracted using a pyramidal representation. At the spectral level, wavelet analysis was computed with various wavelet families in order to detect transient sounds. Finally, an Orca Activity detector was designed by combining the ComParE feature set with the multitemporal and multilevel wavelet features. Experiments on the Test set showed a significant improvement of 0.051 over the Challenge baseline performance (0.866).
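
Illustration: the summary names two multiresolution ideas that a short sketch can make concrete, a multilevel wavelet decomposition and a three-level temporal pyramid. The Python below is only a rough illustration under stated assumptions, not the authors' implementation: the Haar wavelet is chosen for simplicity (the record does not say which wavelet families were used), mean pooling stands in for whatever statistics the paper computes, and the function names `haar_step`, `wavelet_energies`, and `temporal_pyramid` are hypothetical.

```python
import math


def haar_step(x):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(x) % 2:  # pad odd-length input by repeating the last sample
        x = x + [x[-1]]
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def wavelet_energies(signal, levels=3):
    """Detail-coefficient energy per level (finest first), then the energy
    of the final approximation. Assumption: Haar wavelet, energy features."""
    approx = list(signal)
    energies = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies


def temporal_pyramid(frames, levels=3):
    """Mean of a frame-level feature over 1, 2, 4, ... equal segments.
    Assumption: mean pooling over a dyadic split of the recording."""
    n = len(frames)
    feats = []
    for level in range(levels):
        parts = 2 ** level
        for p in range(parts):
            seg = frames[p * n // parts:(p + 1) * n // parts]
            feats.append(sum(seg) / len(seg) if seg else 0.0)
    return feats


if __name__ == "__main__":
    # A short transient (click-like) burst shows up in the finest detail level.
    click = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 0.0, 0.0]
    print(wavelet_energies(click))
    # Three pyramid levels yield 1 + 2 + 4 = 7 pooled values.
    print(temporal_pyramid([1.0, 2.0, 3.0, 4.0]))
```

In this sketch a transient concentrates its energy in the fine detail levels while a steady signal leaves it in the approximation, which is the intuition behind using wavelet analysis to detect transient sounds. The pyramid describes the same recording at coarse and fine time scales at once.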