Probability distribution in a quantitative linguistic problem

Calderon, F.; Curilef, S.; Ladron de Guevara, M. L.

Keywords: probability distribution, Zipf law, Binder cumulants

Abstract

In this contribution, we propose a way to analyze the distribution of words in a given text. Our study focuses on properties observed in Spanish texts by Latin-American writers. We start by analyzing the distribution of word frequencies of occurrence from the Zipf perspective, and we identify two regions of behavior separated by a special point. To define this point properly, we go beyond the Zipf law and define another probability distribution, one that takes into account the frequency of repetition of a particular word among other, different words. This moves the linguistic problem to a statistical level. We then characterize the point of separation between the two regions via the fourth-order Binder cumulant, as is done in the characterization of critical points in phase transitions of physical systems.
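
The two ingredients named in the abstract, a Zipf-style rank-frequency table and a fourth-order Binder cumulant, can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not specify the second probability distribution or the variable to which the cumulant is applied, so the function names `rank_frequency` and `binder_cumulant`, the tokenization rule, and the choice of input sample are all assumptions made for illustration. The cumulant uses the common definition U4 = 1 - <x^4> / (3 <x^2>^2).

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first.

    Assumption: a word is a maximal run of Unicode letters, lowercased.
    The paper's own tokenization rule is not given in the abstract.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    # most_common() sorts by descending count; ties keep first-seen order.
    return [
        (rank, word, n)
        for rank, (word, n) in enumerate(counts.most_common(), start=1)
    ]


def binder_cumulant(values):
    """Fourth-order Binder cumulant U4 = 1 - <x^4> / (3 <x^2>^2).

    Assumption: `values` is any numeric sample; which observable the
    authors feed into the cumulant is not stated in the abstract.
    """
    values = list(values)
    if not values:
        raise ValueError("empty sample")
    n = len(values)
    m2 = sum(v ** 2 for v in values) / n  # second raw moment
    m4 = sum(v ** 4 for v in values) / n  # fourth raw moment
    if m2 == 0:
        raise ValueError("second moment is zero; cumulant undefined")
    return 1.0 - m4 / (3.0 * m2 ** 2)


if __name__ == "__main__":
    sample = "la casa de la playa de la"
    for rank, word, n in rank_frequency(sample):
        print(rank, word, n)
    # Cumulant of the word counts, purely as a hypothetical observable.
    print(binder_cumulant([n for _, _, n in rank_frequency(sample)]))
```

Under the Zipf law the count at rank r is expected to fall off roughly as a power of r, so plotting count against rank on log-log axes is the usual way to look for the two regions of behavior the abstract describes.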

More information

Journal title: BRAZILIAN JOURNAL OF PHYSICS
Volume: 39
Issue: 2a
Publisher: Springer
Publication date: 2009
Start page: 500
End page: 502
Language: English
Notes: ISI