Dual Reconstructive Autoencoder for Crowd Localization and Estimation in Density and FIDT Maps
Abstract
This paper proposes crowd estimation technology to help authorities make the right decisions in times of crisis. Deep learning models have addressed this task with excellent results, and single-column Fully Convolutional Networks (FCNs) have become increasingly common in recent years. A typical architecture with these characteristics is the autoencoder. However, this model presents an intrinsic difficulty: finding the optimal dimensionality of the latent space. To alleviate this difficulty, we propose a dual architecture consisting of two cascaded autoencoders. The first autoencoder performs a masked reconstruction of the original images, whereas the second obtains crowd maps from the outputs of the first. In this way, our architecture improves the localization of people and crowds in Focal Inverse Distance Transform (FIDT) maps, yielding more accurate count estimates than a single-autoencoder architecture.
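For readers unfamiliar with FIDT maps, the following is a minimal sketch of how such a map can be built from annotated head positions and how people can be located and counted from its peaks. This record does not state the formula. The form used here, 1 / (d^(alpha*d + beta) + C) with alpha = 0.02, beta = 0.75 and C = 1, is the one commonly given in the FIDT literature and is an assumption with respect to this paper. The function names `fidt_map` and `find_peaks`, the brute-force distance search, and the peak threshold are illustrative choices, not the authors' implementation.

```python
import math


def fidt_map(height, width, heads, alpha=0.02, beta=0.75, c=1.0):
    """Build a FIDT map as a list of rows.

    heads: list of (row, col) head annotations.
    alpha, beta, c: assumed constants from the FIDT literature.
    """
    if not heads:
        # No annotations: the distance is unbounded, so the map tends to 0.
        return [[0.0] * width for _ in range(height)]
    rows = []
    for r in range(height):
        row = []
        for col in range(width):
            # Euclidean distance to the nearest annotated head (brute force).
            d = min(math.hypot(r - hr, col - hc) for hr, hc in heads)
            # Focal inverse distance: 1.0 at a head, decaying quickly with d.
            row.append(1.0 / (d ** (alpha * d + beta) + c))
        rows.append(row)
    return rows


def find_peaks(fmap, threshold=0.9):
    """Return (row, col) positions that are 3x3 local maxima above threshold."""
    height, width = len(fmap), len(fmap[0])
    peaks = []
    for r in range(height):
        for col in range(width):
            value = fmap[r][col]
            if value < threshold:
                continue
            neighbourhood = [
                fmap[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, col - 1), min(width, col + 2))
            ]
            if value >= max(neighbourhood):
                peaks.append((r, col))
    return peaks


if __name__ == "__main__":
    heads = [(1, 1), (5, 6)]
    fmap = fidt_map(8, 8, heads)
    peaks = find_peaks(fmap)
    print("located heads:", peaks)
    print("estimated count:", len(peaks))
```

In this sketch the count estimate is simply the number of detected peaks, which is why sharper localization in the map tends to translate into a more accurate count. A predicted map from a network would be noisier than this ideal one, so the threshold would need tuning.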
More information
Title according to WOS: | WOS ID: 000886178800001 Not found in local WOS DB |
Journal title: | IEEE ACCESS |
Volume: | 10 |
Publisher: | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Publication date: | 2022 |
Start page: | 117399 |
End page: | 117410 |
DOI: | 10.1109/ACCESS.2022.3219839 |
Notes: | ISI |