Dynamic Environment Adaptation in Biped Robots: A Reinforcement Learning Approach

Soto, M.; Ormeño-Arriagada, P.; Vásquez, J.; Taramasco, C.; Vasconez, J.P.; Gatica, G.

Keywords: training, stability analysis, visualization, reinforcement learning, robots, legged locomotion, real-time systems, robot sensing systems, Central Pattern Generator, Variational Autoencoder, aerospace electronics, robot kinematics, bipedal robot, biped locomotion

Abstract

This study explores the integration of rhythmic control mechanisms and latent visual encoding to enhance bipedal locomotion in complex environments. While model-free reinforcement learning has shown effectiveness in acquiring locomotion skills, the high dimensionality of visual inputs often leads to increased sample complexity and prolonged training times. To mitigate this issue, we propose a novel approach that combines rhythmic motor priors with latent-space perception, thereby reducing the complexity of control inputs. Specifically, we introduce a semi-coupled Central Pattern Generator architecture to efficiently coordinate movement while leveraging compact visual embeddings. This approach compresses high-dimensional observations into a lower-dimensional representation, thereby accelerating policy training and enhancing generalisation across varied environments. We validated our method in the Webots simulator using a four-level curriculum with progressively complex obstacles. In contrast to previous approaches that rely on 80 to 245 input variables, our framework reduces the observation space to just 40 dimensions. This results in stable bipedal walking in fewer than 10 million simulation steps and yields a 16% higher success rate in environments with dense obstacles. Moreover, the semi-coupled model achieves a walking speed of 21.08 cm/s, substantially outperforming the previously reported 1.79 cm/s on the same robot using reinforcement learning and inverse kinematics. An analysis of training time shows that the model converges more rapidly in simpler environments, while tasks with greater complexity require extended adaptation periods and therefore longer overall training durations. Nevertheless, the improvements in success rate, walking speed, and generalisation justify the additional training time. These gains are statistically validated through formal t-tests, confirming significant improvements in success rate, stability, and velocity (p < 0.05). Collectively, the results demonstrate that integrating rhythmic motor priors with latent-space perception provides a scalable and efficient approach to vision-based bipedal locomotion. © 2013 IEEE.
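
As an illustration of what a rhythmic motor prior can look like, the following minimal sketch integrates two coupled phase oscillators, one per leg, that settle into an antiphase gait rhythm. It is a generic textbook-style phase-oscillator model, not the semi-coupled architecture of the paper, whose equations the abstract does not give. All names and values (`simulate_cpg`, `joint_targets`, `coupling`, `amplitude`, the 1 Hz frequency) are assumptions chosen for illustration.

```python
import math


def simulate_cpg(steps=5000, dt=0.001, freq_hz=1.0, coupling=5.0,
                 phase_offset=math.pi, init_phases=(0.0, 0.5)):
    """Euler-integrate two coupled phase oscillators (left leg, right leg).

    Hypothetical parameters: the coupling term pulls the phase difference
    (right - left) toward `phase_offset`; pi corresponds to alternating legs.
    """
    omega = 2.0 * math.pi * freq_hz  # intrinsic angular frequency (rad/s)
    left, right = init_phases
    for _ in range(steps):
        # Each oscillator advances at omega and is nudged by its neighbour.
        d_left = omega + coupling * math.sin(right - left - phase_offset)
        d_right = omega + coupling * math.sin(left - right + phase_offset)
        left += d_left * dt
        right += d_right * dt
    return left, right


def joint_targets(left, right, amplitude=0.3):
    """Map oscillator phases to illustrative joint-angle targets (radians)."""
    return amplitude * math.sin(left), amplitude * math.sin(right)


if __name__ == "__main__":
    left, right = simulate_cpg()
    diff = (right - left) % (2.0 * math.pi)
    print(f"phase difference: {diff:.4f} rad")
    print("joint targets:", joint_targets(left, right))
```

In a setup of this kind, a learned policy would typically adjust a few oscillator parameters, such as frequency or amplitude, rather than commanding every joint directly, which is one way a rhythmic prior can shrink the control problem.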

More information

Title according to WOS: Dynamic Environment Adaptation in Biped Robots: A Reinforcement Learning Approach
Title according to SCOPUS: Dynamic Environment Adaptation in Biped Robots: A Reinforcement Learning Approach
Journal title: IEEE Access
Volume: 13
Publisher: Institute of Electrical and Electronics Engineers Inc.
Publication date: 2025
First page: 159157
Last page: 159177
Language: English
DOI: 10.1109/ACCESS.2025.3604236

Notes: ISI, SCOPUS