Human Action Classification Using N-Grams Visual Vocabulary

Hernandez-Garcia, Ruber; Garcia-Reyes, Edel; Ramos-Cozar, Julian; Guil, Nicolas; Bayro-Corrochano, E; Hancock, E

Abstract

Human action classification is an important task in computer vision. The Bag-of-Words model is a widely used representation in action classification techniques. In this work we propose an approach based on a mid-level feature representation for human action description. First, an optimal vocabulary is created without fixing the number of visual words in advance, which avoids a known limitation of the K-means method. We then introduce a graph-based video representation that uses the relationships between interest points in order to take the spatial and temporal layout into account. Finally, a second visual vocabulary based on n-grams is used for classification. This combines the representational power of graphs with the efficiency of the bag-of-words representation. The representation method was tested on the KTH dataset using STIP and MoSIFT descriptors and a multi-class SVM with a chi-square kernel. The experimental results show that our approach with the STIP descriptor outperforms the best state-of-the-art results, while the results with the MoSIFT descriptor are comparable to them.
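
The pipeline outlined in the abstract (visual words linked in a graph, n-grams counted over that graph, histograms compared with a chi-square kernel) can be illustrated with a minimal sketch. This is not the authors' implementation. The graph encoding (`labels`, `edges`), the choice of simple directed paths as n-grams, the function names, and the exponential form of the chi-square kernel with parameter `gamma` are all assumptions made for illustration.

```python
from collections import Counter
from math import exp


def ngram_histogram(labels, edges, n=2):
    """Count visual-word n-grams along simple directed paths of n nodes.

    labels: dict node -> visual-word id (assumed first-level vocabulary)
    edges:  dict node -> list of successor nodes (assumed spatio-temporal links)
    """
    counts = Counter()

    def walk(path):
        if len(path) == n:
            counts[tuple(labels[v] for v in path)] += 1
            return
        for nxt in edges.get(path[-1], []):
            if nxt not in path:  # keep paths simple: no node visited twice
                walk(path + [nxt])

    for node in labels:
        walk([node])
    return counts


def normalize(counts):
    """Turn raw n-gram counts into a histogram that sums to 1."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def chi2_kernel(h1, h2, gamma=1.0):
    """Exponential chi-square kernel: exp(-gamma * sum((a - b)^2 / (a + b)))."""
    dist = 0.0
    for key in set(h1) | set(h2):
        a, b = h1.get(key, 0.0), h2.get(key, 0.0)
        if a + b > 0:
            dist += (a - b) ** 2 / (a + b)
    return exp(-gamma * dist)


# Toy video: four interest points labelled with visual words 3 and 7,
# linked in a chain 0 -> 1 -> 2 -> 3 (hypothetical data).
labels = {0: 3, 1: 7, 2: 3, 3: 7}
edges = {0: [1], 1: [2], 2: [3]}
hist = normalize(ngram_histogram(labels, edges, n=2))
similarity = chi2_kernel(hist, hist)  # identical histograms give the maximum value
```

In a full system, a kernel matrix built this way could be passed to an SVM that accepts precomputed kernels. The sparse dictionary histograms avoid allocating a bin for every possible n-gram.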

More information

Title according to WOS: ID WOS:000346407400039 Not found in local WOS DB
Journal title: LEARNING AND INTELLIGENT OPTIMIZATION, LION 15
Volume: 8827
Publisher: SPRINGER INTERNATIONAL PUBLISHING AG
Publication date: 2014
First page: 319
Last page: 326
Notes: ISI