Learning Sentence-Level Representations with Predictive Coding

Araujo, Vladimir; Moens, Marie-Francine; Soto, Alvaro

Abstract

Learning sentence representations is an essential and challenging topic in the deep learning and natural language processing communities. Recent methods pre-train large models on massive text corpora, focusing mainly on learning representations of contextualized words. As a result, these models cannot generate informative sentence embeddings, since they do not explicitly exploit the structure and discourse relationships that exist between contiguous sentences. Drawing inspiration from human language processing, this work explores how to improve the sentence-level representations of pre-trained models by borrowing ideas from predictive coding theory. Specifically, we extend BERT-style models with bottom-up and top-down computation to predict future sentences in latent space at each intermediate layer of the network. We conduct extensive experiments on benchmarks for English and Spanish designed to assess sentence- and discourse-level representations as well as pragmatics. Our results show that our approach consistently improves sentence representations in both languages, and the experiments indicate that our models capture discourse and pragmatics knowledge. In addition, to validate the proposed method, we carried out an ablation study and a qualitative study, which verify that the predictive mechanism helps to improve the quality of the representations.
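The core mechanism described above, predicting the next sentence's latent representation at each intermediate layer and scoring the guess against the actual one, can be sketched in plain Python. The linear predictor and cosine-distance loss below are illustrative assumptions for a single layer, not the paper's exact architecture or objective.

```python
import math

def cosine(u, v):
    # Cosine similarity between two latent vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def predict_next(hidden, weights):
    # Hypothetical linear predictor: from the current sentence's
    # hidden state at this layer, guess the next sentence's latent.
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]

def layer_prediction_loss(hidden_t, hidden_next, weights):
    # Prediction error at one layer: distance between the predicted
    # and the actual latent of the following sentence.
    pred = predict_next(hidden_t, weights)
    return 1.0 - cosine(pred, hidden_next)
```

In a BERT-style model this loss would be computed at every intermediate layer and added to the pre-training objective, so that each layer is encouraged to encode information predictive of upcoming sentences.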

More information

WOS ID: WOS:000959706300001
Journal: MACHINE LEARNING AND KNOWLEDGE EXTRACTION
Volume: 5
Issue: 1
Publisher: MDPI
Publication date: 2023
First page: 59
Last page: 77
DOI: 10.3390/make5010005

Notes: ISI