Characterization of Sonic Events Present in Natural-Urban Hybrid Habitats Using UMAP and SEDnet: The Case of the Urban Wetlands

Poblete, Víctor; Espejo, Diego; Vargas, Víctor; Otondo, Felipe; Huijse, Pablo

Keywords: soundscape, urban wetlands, feature visualization, sonic event detection


We investigated whether technological tools can effectively help in handling the increasing volume of audio data available through long field recordings. We also explored whether these recordings and tools can support audio data analysis, feature extraction, and the identification of predominant patterns in the data. Similarly, we explored whether we can visualize feature clusters in the data and automatically detect sonic events. Our focus was primarily on enhancing the importance of natural-urban hybrid habitats within cities, which benefit communities in various ways, specifically through the natural soundscapes of these habitats that evoke memories and reinforce a sense of belonging for inhabitants. The loss of sonic heritage can be a precursor to the extinction of biodiversity within these habitats. By quantifying changes in the soundscape of these habitats over long periods of time, we can collect relevant information linked to this eventual loss. In this respect, we developed two approaches. The first was the comparison among habitats that progressively changed from natural to urban. The second was the optimization of the field recordings' labeling process. This was performed with labels corresponding to the annotations of classes of sonic events and their respective start and end times, including events temporarily superimposed on one another. We compared three habitats over time by using their sonic characteristics collected in field conditions. Comparisons of sonic similarity or dissimilarity among patches were made based on the Jaccard coefficient and uniform manifold approximation and projection (UMAP). Our SEDnet model achieves an F1-score of 0.79, an error rate of 0.377, and an area under the PSD-ROC curve of 71.0. In terms of computational efficiency, the model detects sound events in an audio file in 14.49 s. With these results, we confirm the usefulness of the methods used in this work for the process of labeling field recordings.
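
As an illustration of the Jaccard coefficient mentioned in the abstract, the following is a minimal sketch, assuming each habitat is summarized as the set of sonic event classes annotated in its recordings. The function name `jaccard` and the class labels are hypothetical and are not taken from the paper, which may compute the coefficient over different features.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| between two collections of labels."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Convention assumed here: two empty sets are treated as identical.
        return 1.0
    return len(a & b) / len(union)


# Hypothetical event classes for two habitats (illustrative only).
natural_patch = {"bird", "frog", "wind", "rain"}
urban_patch = {"car", "dog", "bird", "siren"}

similarity = jaccard(natural_patch, urban_patch)
print(f"Jaccard similarity: {similarity:.3f}")
```

A value near 1 indicates that the two habitats share most of their event classes, and a value near 0 indicates that they share almost none.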

More information

Journal title: APPLIED SCIENCES-BASEL
Volume: 11
Issue: 17
Publication date: 2021
Start page: 1
End page: 17
Language: English

Notes: Web of Science Core Collection