VOCHITEXT: Vocabulary Corpus of Chilean Children’s Textbooks
Keywords: linguistics, Spanish language, vocabulary, elementary school
Abstract
VOCHITEXT is a specialized corpus of 23,297 Chilean Spanish words, curated to represent the core vocabulary used in Chilean primary education. It was derived from 58 complete texts (student books) collected in 2022 and covers vocabulary from the official science, history, language, social science, and mathematics textbooks for preschool through 4th grade. For preschool, it includes words from reading materials (children's tales) recommended by the Chilean Ministry of Education. The corpus was originally developed for the LEXIKON App project (ANID Fondef, IT21I0078, University of Concepcion) to assess children's lexical development, and it provides a representative sample of school vocabulary. It is intended for researchers, educators, and developers working on Chilean primary education, vocabulary acquisition, or the linguistic analysis of educational materials, and it documents the lexical content that Chilean students encounter across subjects and grade levels.
More information
Journal title: | Mendeley Data |
Volume: | V1 |
Publication date: | 2024 |
URL: | https://data.mendeley.com/datasets/ttb8rk5t53/1 |
DOI: | 10.17632/ttb8rk5t53.1 |
Notes: | WOS |