LLM-as-a-Judge Approaches as Proxies for Mathematical Coherence in Narrative Extraction
Keywords: narrative extraction, large language models, narrative evaluation, coherence metrics, LLM-as-a-judge
Abstract
Evaluating the coherence of narrative sequences extracted from large document collections is crucial for applications in information retrieval and knowledge discovery. While mathematical coherence metrics based on embedding similarities provide objective measures, they require substantial computational resources and domain expertise to interpret. We propose using large language models (LLMs) as judges to evaluate narrative coherence and demonstrate that their assessments correlate with mathematical coherence metrics. Through experiments on two data sets (news articles about Cuban protests and scientific papers from visualization conferences), we show that LLM judges achieve Pearson correlations of up to 0.65 with mathematical coherence while maintaining high inter-rater reliability (ICC > 0.92). The simplest evaluation approach performs comparably to the more complex ones: it outperforms them on the focused data set and retains over 90% of their performance on the more diverse data set, while using fewer computational resources. Our findings indicate that LLM-as-a-judge approaches are an effective proxy for mathematical coherence in narrative extraction evaluation.
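To illustrate the kind of comparison the abstract describes, the sketch below computes an embedding-based coherence score for each narrative (here assumed to be the mean cosine similarity of consecutive document embeddings, which may differ from the paper's exact metric) and correlates it with LLM-judge ratings via Pearson's r. The data, the coherence_score helper, and the judge ratings are all hypothetical stand-ins, not the authors' implementation.

```python
# Illustrative sketch (not the paper's exact method): score each narrative
# by the mean cosine similarity of consecutive document embeddings, then
# correlate those scores with LLM-judge ratings using Pearson's r.
import numpy as np
from scipy.stats import pearsonr

def coherence_score(embeddings: np.ndarray) -> float:
    """Mean cosine similarity between consecutive items in a sequence."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = np.sum(normed[:-1] * normed[1:], axis=1)  # consecutive cosines
    return float(sims.mean())

# Hypothetical data: one embedding matrix per extracted narrative,
# and one averaged LLM-judge rating per narrative.
rng = np.random.default_rng(0)
narratives = [rng.normal(size=(5, 384)) for _ in range(20)]
math_coherence = [coherence_score(e) for e in narratives]
judge_ratings = rng.uniform(1, 5, size=20)  # stand-in for real LLM scores

r, p = pearsonr(math_coherence, judge_ratings)
print(f"Pearson r = {r:.2f} (p = {p:.3f})")
```

With real judge ratings in place of the random stand-ins, a correlation near the reported 0.65 would indicate that the cheaper LLM-based assessment tracks the embedding-based metric.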
More information
Title (WOS): LLM-as-a-Judge Approaches as Proxies for Mathematical Coherence in Narrative Extraction
Journal: ELECTRONICS
Volume: 14
Issue: 13
Publisher: MDPI
Publication date: 2025
Language: English
DOI: 10.3390/electronics14132735
Notes: ISI