Summary: | With growing concern over the extinction of marine species, considerable effort has been devoted to conservation, extinction prevention, and the search for sustainable solutions. However, data labeling is labor-intensive and time-consuming, so annotated acoustic data are limited. Moreover, the majority of labeled acoustic data are background noise. Together, these two issues raise the question of how to train a reliable classification model effectively. We simulate datasets of varying composition to study the impact of data scarcity and class imbalance on North Atlantic Right Whale (NARW) acoustic data. We also explore two types of supervised deep learning approaches: metric-based classifiers and cross-entropy-based classifiers. The empirical results show that our classifiers, trained with less NARW acoustic data, achieve performance comparable to state-of-the-art classifiers trained with a larger amount of acoustic data.
|