Exploring Machine Learning Algorithms and Numerical Representations Strategies to Develop Sequence-Based Predictive Models for Protein Networks
Keywords: Machine learning algorithms, Protein language models, Protein networks, Deep learning architectures, Protein discovery
Abstract
Predicting the affinity between two proteins is one of the most relevant challenges in bioinformatics and one of the most useful for biotechnological and pharmaceutical applications. Current prediction methods rely on the structural information of the interaction complexes. However, predicting protein structures is computationally very expensive. Machine learning methods have emerged as an alternative for this challenge. Predictive methods for protein affinity based on structural information already exist. For linear (sequence) information, however, there are no guidelines for developing predictive models, so several alternatives for processing the data and building the models must be explored. This work explores different options for building predictive protein interaction models with deep learning architectures and classical machine learning algorithms, evaluating numerical representation methods and transformation techniques that represent structural complexes using linear information. We explored six types of predictive tasks related to affinity and mutational variant evaluation, and their effect on the interaction complex. We show that classical machine learning and convolutional network-based methods perform better than graph convolutional network methods for studying mutational variants. In contrast, graph-based methods perform better on affinity or association-constant problems, using only the linear information of the protein sequences. Finally, we present an illustrative use case, explain how to use the developed models, discuss the limitations of the explored methods, and comment on future development strategies for improving the studied processes.
More information
Title according to SCOPUS: | ID SCOPUS_ID:85164963833 Not found in local SCOPUS DB |
Journal title: | Lecture Notes in Computer Science |
Volume: | 13956 LNCS |
Publisher: | Springer, Cham |
Publication date: | 2023 |
First page: | 231 |
Last page: | 244 |
DOI: | 10.1007/978-3-031-36805-9_16 |
Notes: | SCOPUS |