Abstract

Human action classification is an important task in computer vision. The Bag-of-Words model is a widely used representation method in action classification techniques. In this work we propose an approach based on a mid-level feature representation for human action description. First, an optimal vocabulary is built without fixing the number of visual words in advance, a known limitation of the K-means method. We then introduce a graph-based video representation that uses the relationships between interest points in order to take the spatial and temporal layout into account. Finally, a second visual vocabulary based on n-grams is used for classification. This combines the representational power of graphs with the efficiency of the bag-of-words representation. The representation method was tested on the KTH dataset using STIP and MoSIFT descriptors and a multi-class SVM with a chi-square kernel. The experimental results show that our approach outperforms the best state-of-the-art results when using the STIP descriptor, while its results with the MoSIFT descriptor are comparable to them.
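The abstract names two generic building blocks: a histogram over n-grams of visual words, and a chi-square kernel for comparing such histograms. The following is a minimal sketch of both in plain Python, not the authors' implementation. It assumes that each video has already been reduced to sequences of visual-word labels (for example, paths in the interest-point graph); how those sequences are extracted is not described in this record. The function names `ngram_histogram` and `chi2_kernel`, the `gamma` parameter, and the toy data are all hypothetical.

```python
from collections import Counter
from math import exp


def ngram_histogram(word_sequences, n, vocabulary):
    """Count n-grams of visual-word labels and return an L1-normalized histogram.

    word_sequences: list of label sequences (assumed input, e.g. graph paths).
    vocabulary: ordered list of n-gram tuples that define the histogram bins.
    """
    counts = Counter()
    for seq in word_sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    total = sum(counts[gram] for gram in vocabulary)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[gram] / total for gram in vocabulary]


def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel: exp(-gamma * sum((x_i - y_i)^2 / (x_i + y_i)))."""
    distance = 0.0
    for a, b in zip(x, y):
        if a + b > 0:  # skip bins that are empty in both histograms
            distance += (a - b) ** 2 / (a + b)
    return exp(-gamma * distance)


# Toy data: two "videos", each given as sequences of visual-word labels.
video_a = [[0, 1, 2], [1, 2, 0]]
video_b = [[0, 1, 2], [2, 0, 1]]
bigram_vocabulary = [(0, 1), (1, 2), (2, 0)]

hist_a = ngram_histogram(video_a, 2, bigram_vocabulary)
hist_b = ngram_histogram(video_b, 2, bigram_vocabulary)
similarity = chi2_kernel(hist_a, hist_b)
```

A Gram matrix built from such pairwise kernel values could be passed to an SVM that accepts precomputed kernels; the multi-class strategy used in the paper is not specified here.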

More information

Title according to WOS:	ID WOS:000346407400039 Not found in local WOS DB
Journal title:	COMPUTER VISION - ECCV 2024, PT LXXX
Volume:	8827
Publisher:	SPRINGER INTERNATIONAL PUBLISHING AG
Publication date:	2014
Start page:	319
End page:	326
Notes:	ISI

Human Action Classification Using N-Grams Visual Vocabulary