Collective annotation patterns in learning from crowds

Mena, Francisco; Nanculef, Ricardo; Valle, Carlos

Abstract

The lack of annotated data is one of the major barriers facing machine learning applications today. Learning from crowds, i.e., collecting ground-truth data from multiple inexpensive annotators, has become a common method to cope with this issue. It has recently been shown that modeling the varying quality of the annotations obtained in this way is fundamental to achieving satisfactory performance in tasks where inexpert annotators may represent the majority but not the most trusted group. Unfortunately, existing techniques represent annotation patterns for each annotator individually, making the models difficult to estimate in large-scale scenarios.

In this paper, we present two models to address these problems. Both methods are based on the hypothesis that collective annotation patterns can be learned by introducing confusion matrices that involve groups of data points or annotators. The first approach clusters data points with a common annotation pattern, regardless of the annotators from which the labels have been obtained. Implicitly, this method attributes annotation mistakes to the complexity of the data itself rather than to the variable behavior of the annotators. The second approach explicitly maps annotators to latent groups that are collectively parametrized to learn a common annotation pattern. Our experimental results show that, compared with other methods for learning from crowds, both models have advantages in scenarios with a large number of annotators and a small number of annotations per annotator.
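To make the idea of group-level confusion matrices concrete, the following is a minimal, self-contained sketch, not the authors' implementation, of the kind of mechanism the second approach describes: annotators are hard-assigned to one of a few latent groups, and each group shares a single confusion matrix that is re-estimated together with the label posterior in an EM-style loop. The simulated data, the number of groups, and the hard-assignment scheme are assumptions made purely for illustration.

```python
# Illustrative sketch only: group-level confusion matrices for learning from
# crowds, estimated with a hard-EM loop. All names and the simulation setup
# are assumptions, not the paper's actual model or code.
import numpy as np

rng = np.random.default_rng(0)

# ---- Simulate a small crowdsourcing scenario (assumed setup) ----------------
N, M, C, G_true = 300, 40, 3, 2          # items, annotators, classes, true groups
true_y = rng.integers(C, size=N)         # unobserved ground-truth labels
true_group = rng.integers(G_true, size=M)

# One confusion matrix per true group: rows = true class, cols = reported class
pi_true = np.stack([
    0.8 * np.eye(C) + 0.2 / C,           # a fairly reliable group
    np.full((C, C), 1.0 / C),            # a spammer-like group
])
pi_true /= pi_true.sum(axis=2, keepdims=True)

# Each annotator labels a random 20% of the items
annotations = []                         # (item, annotator, reported label)
for j in range(M):
    for i in rng.choice(N, size=N // 5, replace=False):
        lab = rng.choice(C, p=pi_true[true_group[j], true_y[i]])
        annotations.append((i, j, lab))
annotations = np.array(annotations)

# ---- Hard-EM with G latent annotator groups (illustrative) ------------------
G = 2
q = np.full((N, C), 1e-2)                # label posterior, initialized from votes
for i, j, lab in annotations:
    q[i, lab] += 1.0
q /= q.sum(axis=1, keepdims=True)
group = rng.integers(G, size=M)          # random initial group assignments

for _ in range(30):
    # M-step: class prior and one confusion matrix per group (with smoothing)
    prior = q.mean(axis=0)
    pi = np.full((G, C, C), 1e-2)
    for i, j, lab in annotations:
        pi[group[j], :, lab] += q[i]
    pi /= pi.sum(axis=2, keepdims=True)

    # Group step: assign each annotator to its best-fitting group
    score = np.zeros((M, G))
    for i, j, lab in annotations:
        score[j] += q[i] @ np.log(pi[:, :, lab]).T
    group = score.argmax(axis=1)

    # E-step: update the label posterior item by item
    logq = np.tile(np.log(prior), (N, 1))
    for i, j, lab in annotations:
        logq[i] += np.log(pi[group[j], :, lab])
    logq -= logq.max(axis=1, keepdims=True)
    q = np.exp(logq)
    q /= q.sum(axis=1, keepdims=True)

print("label accuracy vs. simulated ground truth:",
      (q.argmax(axis=1) == true_y).mean())
```

The design point this sketch tries to capture is the parameter count: with only G group-level confusion matrices instead of one per annotator, the model stays estimable even when annotators are numerous and each provides only a handful of labels, which is the scenario the abstract highlights.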

More information

Title according to WOS: Collective annotation patterns in learning from crowds
Journal: INTELLIGENT DATA ANALYSIS
Volume: 24
Publisher: IOS Press
Publication date: 2020
Start page: S63
End page: S86
DOI: 10.3233/IDA-200009

Notes: ISI