Improving Bag-of-Visual-Words model using visual n-grams for human action classification

Hernandez-Garcia, Ruber; Ramos-Cozar, Julian; Guil, Nicolas; Garcia-Reyes, Edel; Sahli, Hichem

Abstract

The Bag-of-Visual-Words model has emerged as an effective approach to representing local video features for human action classification. However, one of the major challenges in this model is the generation of the visual vocabulary. In human action recognition, the loss of spatio-temporal relationships is one of the main causes of the low descriptive power of classic visual words. In this work we propose a three-level approach to construct visual n-grams for human action classification. First, in order to reduce the number of non-descriptive words generated by K-means clustering of the spatio-temporal interest points, we propose to apply a variant of the classical Leader-Follower clustering algorithm to create an optimal vocabulary from a pre-established number of visual words. Second, with the aim of incorporating spatial and temporal constraints into the Bag-of-Visual-Words model, we exploit the spatio-temporal relationships between interest points to build a graph-based representation of the video. Frequent subgraphs are extracted for each action class, and a visual vocabulary of n-grams is constructed from the labels (descriptors) of the selected subgraphs. Finally, we build a histogram from the frequency of each n-gram in the graph representing a human action video. The proposed approach combines the representational power of graphs with the efficiency of the Bag-of-Visual-Words model. Extensive validation on five challenging human action datasets demonstrates the effectiveness of the proposed model compared to state-of-the-art methods. (C) 2017 Elsevier Ltd. All rights reserved.
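
The first and last steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the clustering below is the textbook Leader-Follower rule rather than the paper's variant (which the abstract does not specify), and the n-grams are simplified to 2-grams read off graph edges instead of mined frequent subgraphs. All function names, parameters (`threshold`, `eta`) and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def leader_follower(points, threshold, eta=0.1, init_centers=None):
    """Textbook Leader-Follower clustering (assumed form, not the paper's variant).

    Each point either pulls its nearest center toward itself (if it lies
    within `threshold`) or becomes a new center.
    """
    centers = [list(c) for c in (init_centers or [])]
    for x in points:
        if centers:
            dists = [math.dist(x, c) for c in centers]
            j = min(range(len(centers)), key=dists.__getitem__)
        if not centers or dists[j] > threshold:
            centers.append(list(x))  # point is too far: start a new visual word
        else:
            # move the winning center a fraction eta toward the point
            centers[j] = [c + eta * (xi - c) for c, xi in zip(centers[j], x)]
    return centers


def assign_words(points, centers):
    """Label each descriptor with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))
            for x in points]


def bigram_histogram(words, edges, vocabulary):
    """Count visual 2-grams: unordered word pairs on the edges of a graph."""
    counts = Counter(tuple(sorted((words[u], words[v]))) for u, v in edges)
    return [counts[g] for g in vocabulary]


# Toy descriptors standing in for spatio-temporal interest point features.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
centers = leader_follower(points, threshold=1.0, eta=0.5)
words = assign_words(points, centers)

# Hypothetical graph linking interest points that are close in space and time.
edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
vocabulary = [(0, 0), (0, 1), (1, 1)]
hist = bigram_histogram(words, edges, vocabulary)
```

In this toy setting the histogram `hist` plays the role of the final video descriptor; in the paper the vocabulary entries would come from frequent subgraphs extracted per action class rather than from single edges.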

More information

Title according to WOS: ID WOS:000414107100016 Not found in local WOS DB
Journal title: EXPERT SYSTEMS WITH APPLICATIONS
Volume: 92
Publisher: PERGAMON-ELSEVIER SCIENCE LTD
Publication date: 2018
Start page: 182
End page: 191
DOI: 10.1016/j.eswa.2017.09.016

Notes: ISI