Assessing the effect of visual servoing on the performance of linear microphone arrays in moving human-robot interaction scenarios

Diaz, Alejandro; Mahu, Rodrigo; Novoa, Jose; Wuth, Jorge; Datta, Jayanta; Yoma, Nestor Becerra

Abstract

Social robotics is becoming a reality, and voice-based human-robot interaction (HRI) is essential for a successful collaborative symbiosis between humans and robots. The main objective of this paper is to assess the effect of visual servoing on the performance of a linear microphone array for distant automatic speech recognition (ASR) in a mobile, dynamic and non-stationary robotic testbed that can be representative of real HRI scenarios. Visual servoing and image target tracking are different tasks, and this paper focuses on an effect that is rarely addressed in the literature: the dependence of the beamforming directivity on the look direction. The datasets required to carry out the study reported here did not exist and had to be generated. A state-of-the-art mobile robotic testbed had to be set up with target speech and noise sources. A linear microphone array was chosen as a case study and its response was measured. Standard beamforming methods were evaluated with respect to visual servoing: delay-and-sum combined with image tracking; weighted delay-and-sum; and minimum variance distortionless response (MVDR), also combined with image tracking. The results presented here show that the performance of beamforming methods is dramatically degraded in moving and non-stationary conditions. In this context, visual servoing in HRI can significantly improve the performance of a linear microphone array in terms of ASR accuracy. The average reduction in word error rate (WER) achieved when the robot head was steered toward the target speech source was as high as 28.2%. Finally, it is worth highlighting that the methodology adopted here is applicable to any microphone array, linear or not. (C) 2020 Elsevier Ltd. All rights reserved.
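
The dependence of beamforming directivity on look direction mentioned in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' implementation. It assumes an idealized uniform linear array under a far-field plane-wave model, and the array geometry (4 microphones, 5 cm spacing), the 2 kHz test frequency, the 343 m/s speed of sound, and the function names `das_response` and `half_power_beamwidth` are all hypothetical choices made for illustration. It computes the delay-and-sum response magnitude and estimates the half-power width of the main lobe for a given look direction, with angles measured from broadside.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def das_response(theta_deg, look_deg, n_mics=4, spacing=0.05, freq=2000.0):
    """Normalized delay-and-sum response magnitude of a uniform linear array.

    theta_deg: arrival angle of a far-field plane wave, from broadside.
    look_deg:  angle the beamformer is steered to, from broadside.
    The geometry and frequency defaults are illustrative assumptions.
    """
    wavenumber = 2.0 * math.pi * freq / SPEED_OF_SOUND
    # Residual phase shift between adjacent microphones after steering.
    phase_step = wavenumber * spacing * (
        math.sin(math.radians(theta_deg)) - math.sin(math.radians(look_deg))
    )
    total = sum(cmath.exp(1j * m * phase_step) for m in range(n_mics))
    return abs(total) / n_mics


def half_power_beamwidth(look_deg, step=0.1, **array_kwargs):
    """Angular width (degrees) of the region around look_deg where the
    response stays at or above 1/sqrt(2), restricted to [-90, 90]."""
    threshold = 1.0 / math.sqrt(2.0)
    upper = look_deg
    while (upper + step <= 90.0
           and das_response(upper + step, look_deg, **array_kwargs) >= threshold):
        upper += step
    lower = look_deg
    while (lower - step >= -90.0
           and das_response(lower - step, look_deg, **array_kwargs) >= threshold):
        lower -= step
    return upper - lower


if __name__ == "__main__":
    for look in (0.0, 30.0, 60.0):
        width = half_power_beamwidth(look)
        print(f"look direction {look:5.1f} deg -> main lobe width {width:5.1f} deg")
```

Under these assumptions the main lobe is narrowest at broadside and widens as the look direction moves toward endfire, so an idealized linear array is most selective when the source lies in front of it. This is consistent with the motivation for steering the robot head toward the speaker, although the measured response of a real array can differ from this model.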

More information

Title according to WOS: Assessing the effect of visual servoing on the performance of linear microphone arrays in moving human-robot interaction scenarios
Journal title: COMPUTER SPEECH AND LANGUAGE
Volume: 65
Publisher: ACADEMIC PRESS LTD- ELSEVIER SCIENCE LTD
Publication date: 2021
DOI: 10.1016/j.csl.2020.101136

Notes: ISI