A Multiobjective Evolutionary Algorithm for Colon Cancer Biomarkers Identification on Gene Expression Data

Cubillos-Chaparro, Jorge; Dorn, Marcio; Villalobos-Cid, Manuel; Inostroza-Ponta, Mario

Abstract

This paper proposes a multiobjective approach for identifying colon cancer biomarkers by selecting gene features from expression datasets. The method simultaneously minimizes the number of selected features and maximizes class discrimination capacity. The proposal uses a two-stage hybrid pipeline, consisting of a filter stage and a wrapper stage, and a modified NSGA-II with a novel Zoom operator. We ran computational experiments on eight curated colon cancer datasets. The results show that the proposed method achieves an AUC-ROC of 0.989 with 489 genes, while the Zoom operator achieves an AUC-ROC of 1 with 1884 genes. Further biological validation on the DisGeNET platform revealed promising associations, uncovering probable biomarkers not previously linked to colon cancer and highlighting their diverse nature, including genes, non-coding sequences, pseudogenes, and discontinued genes. Additionally, a biological enrichment process based on Gene Ontology identified statistically significant terms related to immune response and glucuronosyltransferase activity, whose association with colon cancer is supported in the literature.
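The trade-off between the two objectives in the abstract can be illustrated with a minimal Pareto-dominance sketch. This is not the authors' NSGA-II implementation or their Zoom operator; it only shows why the two reported results (489 genes at AUC-ROC 0.989, and 1884 genes at AUC-ROC 1) can both be kept as non-dominated solutions. The names `dominates` and `pareto_front`, and the third candidate `(2000, 0.95)`, are hypothetical and added purely for illustration.

```python
from typing import List, Tuple

# A candidate solution: (number of selected genes, AUC-ROC).
# Objective 1 is minimized, objective 2 is maximized.
Solution = Tuple[int, float]


def dominates(a: Solution, b: Solution) -> bool:
    """Return True if a is no worse than b on both objectives
    and strictly better on at least one."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep only the solutions that no other solution dominates."""
    return [
        s for s in solutions
        if not any(dominates(other, s) for other in solutions if other != s)
    ]


candidates = [
    (489, 0.989),   # reported in the abstract
    (1884, 1.0),    # reported in the abstract
    (2000, 0.95),   # hypothetical point, added only to show a dominated case
]
front = pareto_front(candidates)
```

Under these assumptions, neither reported result dominates the other (one uses fewer genes, the other has a higher AUC-ROC), while the hypothetical third candidate is dominated and dropped from the front.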

More information

Title according to SCOPUS: ID SCOPUS_ID:85207524173 Not found in local SCOPUS DB
Publication date: 2024
DOI: 10.1109/CIBCB58642.2024.10702102

Notes: SCOPUS