Classification of conformational stability of protein mutants from 3D pseudo-folding graph representation of protein sequences using support vector machines
Abstract
This work reports a novel 3D pseudo-folding graph representation of protein sequences for modeling purposes. Amino acid Euclidean distance matrices (EDMs) encode primary structural information. Amino Acid Pseudo-Folding 3D Distances Count (AAp3DC) descriptors, calculated from the EDMs of a large data set of 1363 single mutants of 64 proteins, were tested for building a classifier of the sign of the change in thermal unfolding Gibbs free energy (ΔΔG) upon single mutation. An optimum support vector machine (SVM) with a radial basis function (RBF) kernel recognized stable and unstable mutants with accuracies over 70% in cross-validation tests. To the best of our knowledge, this result for stable mutant recognition is the highest ever reported for a sequence-based predictor with more than 1000 mutants. Furthermore, the model adequately classified disease-associated mutations of human prion protein and human transthyretin. © 2007 Wiley-Liss, Inc.
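The two generic building blocks named in the abstract, a Euclidean distance matrix and an RBF kernel, can be sketched as follows. This is only a minimal illustration under stated assumptions: the toy coordinates, the function names `distance_matrix` and `rbf_kernel`, and the `gamma` value are hypothetical, and the sketch does not reproduce the paper's pseudo-folding coordinate assignment, its AAp3DC descriptors, or its trained SVM.

```python
import math
from itertools import combinations


def distance_matrix(coords):
    """Return the symmetric Euclidean distance matrix of a list of 3D points."""
    n = len(coords)
    edm = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = math.dist(coords[i], coords[j])  # Euclidean distance (Python 3.8+)
        edm[i][j] = edm[j][i] = d
    return edm


def rbf_kernel(x, y, gamma=0.5):
    """Standard RBF kernel exp(-gamma * ||x - y||^2); gamma is an assumed value."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical 3D positions for a four-residue toy sequence (not from the paper).
coords = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
edm = distance_matrix(coords)

for row in edm:
    print(["%.3f" % value for value in row])
```

In a full pipeline, descriptor vectors derived from such matrices would be the inputs compared by the kernel; how the paper derives those vectors is not specified in this record.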
More information
| Title according to WOS: | Classification of conformational stability of protein mutants from 3D pseudo-folding graph representation of protein sequences using support vector machines |
| Title according to SCOPUS: | Classification of conformational stability of protein mutants from 3D pseudo-folding graph representation of protein sequences using support vector machines |
| Journal title: | PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS |
| Volume: | 70 |
| Issue: | 1 |
| Publisher: | Wiley |
| Publication date: | 2008 |
| Start page: | 167 |
| End page: | 175 |
| Language: | English |
| URL: | http://doi.wiley.com/10.1002/prot.21524 |
| DOI: | 10.1002/prot.21524 |
| Notes: | ISI, SCOPUS |