Unbiased evacuation processes using a reinforcement learning approach
Abstract
Simulations of collective phenomena require the modeling of individual choices. In evacuations, handpicked policies may produce biased results. Here, we remove that bias using reinforcement learning. This technique yields solutions that may be non-optimal for each individual agent's trajectory but improve the performance of the ensemble as a whole. Our analysis reveals that evacuation time can decrease by up to 12% compared to the more standard strategy of following the shortest path. Our simulations also show that the reinforcement learning algorithm causes the agents to distribute themselves in a more homogeneous way while advancing towards the exits, resulting in fewer collisions. Moreover, we found, as expected, that collisions and evacuation time are strongly correlated, and discovered that this relationship is policy-independent. Our work opens up new research avenues for the study of evacuations and leverages the potential of new machine-learning techniques to study collective phenomena.
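As an illustrative aside, the reinforcement learning setup described in the abstract can be sketched with a minimal tabular Q-learning example: a single agent learns a route to an exit on a small grid. This is not the authors' implementation; the grid size, reward scheme, and hyperparameters below are all illustrative assumptions.

```python
import random

SIZE = 5                 # illustrative 5x5 room
EXIT = (0, 0)            # exit at the top-left corner
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, action):
    """Apply an action; walls keep the agent in place.
    Reward is -1 per move (so shorter evacuations score higher), 0 on exit."""
    r, c = state
    dr, dc = action
    nxt = (max(0, min(SIZE - 1, r + dr)), max(0, min(SIZE - 1, c + dc)))
    done = nxt == EXIT
    return nxt, (0.0 if done else -1.0), done

def train(episodes=2000, alpha=0.5, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy exploration policy."""
    rng = random.Random(seed)
    Q = {(r, c): [0.0] * len(ACTIONS) for r in range(SIZE) for c in range(SIZE)}
    for _ in range(episodes):
        s = (SIZE - 1, SIZE - 1)  # start at the corner farthest from the exit
        for _ in range(100):
            if rng.random() < eps:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
            s2, rwd, done = step(s, ACTIONS[a])
            # Standard Q-learning update rule
            Q[s][a] += alpha * (rwd + gamma * max(Q[s2]) - Q[s][a])
            s = s2
            if done:
                break
    return Q

def greedy_path_length(Q, start=(SIZE - 1, SIZE - 1)):
    """Follow the learned greedy policy and count steps to the exit."""
    s, steps = start, 0
    while s != EXIT and steps < 100:
        a = max(range(len(ACTIONS)), key=lambda i: Q[s][i])
        s, _, _ = step(s, ACTIONS[a])
        steps += 1
    return steps

Q = train()
# After convergence the greedy policy matches the shortest corner-to-corner
# path on this grid (Manhattan distance of 8 steps).
print(greedy_path_length(Q))
```

In the paper's multi-agent setting, the learned policies additionally account for congestion and collisions among agents, which is what allows the ensemble to outperform individually shortest paths; this single-agent sketch only illustrates the underlying learning mechanism.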
More information
Title according to WOS: | WOS ID: 001393821300001 |
Title according to SCOPUS: | SCOPUS ID: 85212592147 |
Journal title: | CHAOS SOLITONS & FRACTALS |
Volume: | 191 |
Publisher: | PERGAMON-ELSEVIER SCIENCE LTD |
Publication date: | 2025 |
Start page: | 115924 |
End page: | 115924 |
DOI: | 10.1016/J.CHAOS.2024.115924 |
Notes: | ISI, SCOPUS |