CodeContrast: A Contrastive Learning Approach for Generating Coherent Programming Exercises

Abstract

Generating high-quality programming exercises with well-aligned problem descriptions, test cases, and code solutions is crucial for computer science education. However, current methods often lack coherence among these components, reducing their educational value. We present CodeContrast, a novel generative model that uses contrastive learning to map programming problems, test cases, and solutions into a shared feature space. By minimizing the distance between matched components and maximizing it for non-matched ones, CodeContrast learns the intricate relationships necessary to generate coherent programming exercises. Our model architecture includes three encoder networks, one each for problem descriptions, test cases, and solutions. During training, CodeContrast processes positive triplets (matching problem, test case, solution) and negative triplets (non-matching combinations) and uses a contrastive loss to position positive triplets close in the feature space while separating negative ones. Comprehensive evaluations of CodeContrast through automatic metrics, expert ratings, and student studies demonstrate its effectiveness. Results show high code correctness (92.3% of test cases passed), strong problem-solution alignment (BLEU score up to 0.826), and robust test case coverage (85.7% statement coverage). Expert feedback and student performance further support the pedagogical value of these generated exercises, with students performing comparably to those using manually curated content. CodeContrast advances the automated generation of high-quality programming exercises, capturing relationships among programming components to enhance educational content and improve the learning experience for students and instructors.
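The training objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact loss, so everything here is an assumption made for illustration: a classic margin-based contrastive loss, Euclidean distance, embeddings represented as plain lists of floats (standing in for the outputs of the three encoders), and the hypothetical names `mean_pairwise_distance`, `contrastive_loss`, and `margin`.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean_pairwise_distance(problem, tests, solution):
    """Average distance over the three pairs in a (problem, tests, solution) triplet."""
    pairs = list(combinations((problem, tests, solution), 2))
    return sum(euclidean(u, v) for u, v in pairs) / len(pairs)


def contrastive_loss(triplet, is_match, margin=1.0):
    """Margin-based contrastive loss for one triplet (illustrative assumption).

    Matching triplets are penalized by their squared distance, which pulls
    them together. Non-matching triplets are penalized only while they sit
    closer than `margin`, which pushes them apart.
    """
    d = mean_pairwise_distance(*triplet)
    if is_match:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Hypothetical 2-D embeddings, as if produced by the three encoders.
positive = ([0.1, 0.9], [0.1, 0.9], [0.1, 0.9])        # matching, already aligned
negative_close = ([0.0, 0.0], [0.0, 0.0], [0.0, 0.0])  # non-matching but collapsed
negative_far = ([0.0, 0.0], [3.0, 4.0], [6.0, 8.0])    # non-matching, well separated

loss_positive = contrastive_loss(positive, is_match=True)
loss_negative_close = contrastive_loss(negative_close, is_match=False)
loss_negative_far = contrastive_loss(negative_far, is_match=False)
```

In this sketch an aligned positive triplet and a well-separated negative triplet both incur zero loss, while a negative triplet whose embeddings coincide incurs the full `margin ** 2` penalty. A real implementation would compute the embeddings with trainable encoders and average the loss over a batch.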

More information

Title according to WOS: ID WOS:001405140100001 Not found in local WOS DB
Journal title: EDUCATION SCIENCES
Volume: 15
Issue: 1
Publisher: MDPI
Publication date: 2025
DOI: 10.3390/educsci15010080

Notes: ISI