Spectrograms of baleen whale records synthesized from Autoencoder architectures: CAE, VAE and CAE-LSTM

Bibliographic Details
Published in:	Elektron
Main Authors:	Cabedio, María Celeste, Carnaghi, Marco
Format:	Article in Journal/Newspaper
Language:	Spanish
Published:	FIUBA 2022
Subjects:	Convolutional autoencoders; recursive layers; spectrograms; underwater sound; synthesis; Baja; VAE; baleen whales
Online Access:	http://elektron.fi.uba.ar/index.php/elektron/article/view/167 https://doi.org/10.37537/rev.elektron.6.2.167.2022

Description
Summary:	This paper analyzes several simple convolutional network architectures for generating synthetic spectrograms of baleen whale audio recordings. Model simplicity matters when this type of network is deployed on embedded systems, and the scarcity of available data calls for efficient models. To that end, simple Autoencoder architectures with a low number of associated parameters are presented and trained. Suitable metrics are then obtained and the architecture alternatives are compared. The results show that the most straightforward architecture is also the most convenient. Finally, synthetic spectrograms are generated from these models using few data samples, employing a low-complexity architecture and assuming a normal distribution of the latent space vectors obtained from the training data.
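The final step in the summary, sampling under an assumed normal distribution of the latent vectors, can be illustrated with a minimal sketch. This is not the authors' implementation: the latent dimension, the sample count, and the random stand-in for encoder outputs are all assumptions, and the decoder that would turn each sampled vector into a spectrogram is not shown.

```python
import numpy as np


def fit_latent_gaussian(latents, ridge=1e-6):
    """Estimate the mean and covariance of encoded training vectors.

    `latents` has one row per training sample. A small ridge term is added
    to the diagonal so the covariance stays positive definite.
    """
    latents = np.asarray(latents, dtype=float)
    mean = latents.mean(axis=0)
    cov = np.atleast_2d(np.cov(latents, rowvar=False))
    cov = cov + ridge * np.eye(cov.shape[0])
    return mean, cov


def sample_latents(mean, cov, n_samples, seed=0):
    """Draw new latent vectors from the fitted normal distribution."""
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_samples)


# Hypothetical stand-in for encoder outputs: 20 training samples, 8 latent dims.
rng = np.random.default_rng(42)
train_latents = rng.normal(loc=1.0, scale=0.5, size=(20, 8))

mean, cov = fit_latent_gaussian(train_latents)
new_latents = sample_latents(mean, cov, n_samples=5)
# Each row of new_latents would then be passed to a trained decoder (not shown)
# to produce one synthetic spectrogram.
```

Fitting a full covariance matrix keeps correlations between latent dimensions. With very few training samples, a diagonal covariance may be a more stable choice.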

Similar Items