Classification of conformational stability of protein mutants from 2D graph representation of protein sequences using support vector machines

Fernandez, M.; Caballero, J.; Fernández, L.; Abreau, J.I.; Acostas, G.

Abstract

Euclidean distance counts derived from protein 2D graphs were used to encode protein structural information. A total of 35 amino acid 2D distance count (AA2DC) descriptors were calculated from the Euclidean distance matrices (EDM) derived from the 2D graphs, at distances ranging from 0.05 to 1.8 units with a lag of 0.05 units. The AA2DC descriptors were tested for building a predictive classification model of the sign of the change in thermal unfolding Gibbs free energy change (ΔΔG) for a large data set of 2048 single point mutations on 64 proteins. A support vector machine (SVM) classifier with a radial basis function kernel was implemented to classify the conformational stability of protein mutants. The temperature and pH of the experimental ΔΔG measurements were also used for SVM training, in addition to the calculated AA2DC descriptors. The optimum SVM model correctly predicted about 72% of ΔΔG signs in cross-validation, both for the whole dataset and for stable and unstable mutants separately. To the best of our knowledge, this level of accuracy for stable mutant recognition is the highest ever reported for a predictor using sequence information. Furthermore, the classifier adequately recognized unstable mutants of human prion protein and human transthyretin associated with diseases.
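The descriptor idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the 2D graph coordinates are built, how the counts are binned or normalized, or which SVM settings were used. The sketch assumes that node coordinates are already available, and that the 35 descriptors are raw pair counts in the 35 intervals between consecutive edges 0.05, 0.10, ..., 1.80. The function names (distance_counts, feature_vector, rbf_kernel) and the example values are hypothetical.

```python
import math

def distance_counts(coords, start=0.05, stop=1.80, lag=0.05):
    """Count node pairs whose Euclidean distance falls in each interval
    (edge_k, edge_k+1]. The binning scheme is an assumption, chosen so that
    the range 0.05 to 1.80 with a lag of 0.05 yields 35 descriptors."""
    n_edges = round((stop - start) / lag) + 1          # 36 edges
    edges = [round(start + k * lag, 10) for k in range(n_edges)]
    counts = [0] * (n_edges - 1)                       # 35 intervals
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])        # one entry of the EDM
            for k in range(len(counts)):
                if edges[k] < d <= edges[k + 1]:
                    counts[k] += 1
                    break                              # pairs outside the range are ignored
    return counts

def feature_vector(coords, temperature, ph):
    """Append the experimental temperature and pH to the distance counts,
    mirroring the extra SVM inputs mentioned in the abstract."""
    return distance_counts(coords) + [temperature, ph]

def rbf_kernel(x, y, gamma=0.1):
    """Standard radial basis function kernel: exp(-gamma * ||x - y||^2).
    The gamma value is a placeholder, not a value from the paper."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

# Hypothetical 2D coordinates for three graph nodes.
toy_coords = [(0.0, 0.0), (0.12, 0.0), (0.0, 0.33)]
toy_counts = distance_counts(toy_coords)
toy_features = feature_vector(toy_coords, temperature=25.0, ph=7.0)
```

In practice the feature vectors would be passed to an SVM library rather than to a hand-written kernel; rbf_kernel is shown only to make the similarity measure explicit.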

More information

Title according to WOS: Classification of conformational stability of protein mutants from 2D graph representation of protein sequences using support vector machines
Title according to SCOPUS: Classification of conformational stability of protein mutants from 2D graph representation of protein sequences using support vector machines
Journal title: MOLECULAR SIMULATION
Volume: 33
Issue: 11
Publisher: TAYLOR & FRANCIS LTD
Publication date: 2007
Start page: 889
End page: 896
Language: English
URL: http://www.tandfonline.com/doi/abs/10.1080/08927020701377070
DOI: 10.1080/08927020701377070

Notes: ISI, SCOPUS