Voice-based human-robot social integration

Néstor Becerra Yoma


Cooperation is a key concept in multi-robot systems that target complex tasks. Swarm robotics adopts self-organized cooperation, in which robots with limited intelligence cooperate and interact locally to build up the desired global behavior. In this context, the robot communication scenario plays a critical role. However, several applications in defense, hostile environments, mining, industry, forestry, education and natural disasters will require some integration and collaboration between humans and robots. Human-robot interaction (HRI) is relevant in situations where robots are not fully autonomous and require human instructions for decision-making. Robots can run into difficulties when accomplishing a given task and, in many cases, input from a human operator or user is enough to solve the problem. If the robot is able to discuss the situation with the human operator, a better solution can be found. If the robot knows not only its capacities and limitations but also what humans expect from it, human-robot interaction and collaboration should have a significant effect on task accomplishment. This applies to defense, education, home, health services, industry, rescue and museum applications where robots need to interact with children and adults. Human-like communication between a person and a robot is also essential for successful HRI and human-robot collaborative symbiosis. Speech is the most straightforward and easiest way for humans to communicate, and it should be the most natural way to enable collaborative human-robot symbiosis. In this project, state-of-the-art research on robust automatic speech recognition (ASR) for HRI will be carried out by making use of an available collaborative robot swarm and human-robot social integration test bed.
Text-to-speech (TTS) will be employed in human-robot and robot-robot communication to allow human operators to acquire the context of robots. Image processing methods will also be evaluated as a means of communicating to humans the situation and environment that the robots are experiencing. Images and data sent by the robotic system can be displayed on portable devices. TTS and image processing technology play key roles in enabling friendly and robust HRI. In particular, TTS allows humans to listen in on robot-robot interaction (RRI). During the first year, the application tasks to be addressed in the project will be defined and the experimental robot swarm test bed will be set up. This includes programming the robots; integrating ASR and TTS technologies on a client-server basis to increase the computational capabilities of the robotic platform; and recording databases to carry out off-line experiments. Collaborative tasks that include scene understanding and manipulation of objects can be considered. In the second year, the first ASR results will be obtained. The relevance of context-dependent information will be evaluated through off-line ASR experiments. Papers will also be submitted, and demos will be performed to show the advantages and potential of the proposed approach. In the third year, the problem of context acquisition by human users will be addressed by using image processing methods to communicate to humans the situation and environment that the robots are experiencing. Images and data sent by the robotic systems can be displayed on portable devices. Finally, ASR robustness will also be tackled.

More information

Publication date: 2017
Start/end year: 2017-2019
Funding/Sponsor: ONR Global