RCDPeaks: memory-efficient density peaks clustering of long molecular dynamics
Abstract
Motivation: Density Peaks is a widely used clustering algorithm that has previously been applied to Molecular Dynamics (MD) simulations. Its conception of cluster centers as elements displaying both a high density of neighbors and a large distance to other elements of high density particularly fits the nature of a geometrically converged MD simulation. Despite its theoretical convenience, implementations of Density Peaks carry a quadratic memory complexity that only permits the analysis of relatively short trajectories.

Results: Here, we describe DP+, a novel exact implementation of Density Peaks that drastically reduces RAM consumption in comparison to the few available alternatives designed for MD. Based on DP+, we developed RCDPeaks, a refined variant of the original Density Peaks algorithm. Through the use of DP+, RCDPeaks was able to cluster a one-million-frame trajectory using less than 4.5 GB of RAM, a task that would have taken more than 2 TB and about 3× more time with the fastest and least memory-hungry alternative currently available. Other key features of RCDPeaks include the automatic selection of parameters, the screening of center candidates and the geometrical refining of returned clusters.
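To make the definition of cluster centers and the quadratic memory cost concrete, the following is a minimal sketch of the original Density Peaks quantities computed the naive way. It is not DP+ or RCDPeaks. The function names `density_peaks` and `condensed_matrix_bytes` are hypothetical, Euclidean distance stands in for the structural distance (such as RMSD) that an MD analysis would use, and the 4-byte value size is an assumption.

```python
import numpy as np


def density_peaks(points, cutoff):
    """Naive Density Peaks quantities: (rho, delta) for every point.

    rho[i]   = number of neighbors of i closer than `cutoff`.
    delta[i] = distance from i to the nearest point of higher rho
               (for the densest point, its largest distance to any point).
    """
    # Full pairwise distance matrix: this is the O(N^2) memory cost.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Count neighbors within the cutoff, excluding the point itself.
    rho = (dist < cutoff).sum(axis=1) - 1

    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta


def condensed_matrix_bytes(n_frames, bytes_per_value=4):
    """Bytes needed to store all N*(N-1)/2 pairwise distances."""
    return n_frames * (n_frames - 1) // 2 * bytes_per_value


# Toy data: two separated groups, each with one central point.
points = np.array([
    [0, 0], [1, 0], [-1, 0], [0, 1], [0, -1],   # group A, center at index 0
    [10, 10], [11, 10], [9, 10], [10, 11],      # group B, center at index 5
], dtype=float)
rho, delta = density_peaks(points, cutoff=1.2)

# Center candidates combine high rho with high delta.
centers = sorted(np.argsort(delta)[-2:].tolist())
```

In this toy set the two central points are the only ones with both a high neighbor count and a large `delta`, which is the property the abstract describes. Under the stated 4-byte assumption, `condensed_matrix_bytes(1_000_000)` gives roughly 2 TB for the distances alone, the same order of magnitude as the figure quoted for the alternative implementation.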
More information
| Title according to WOS: | RCDPeaks: memory-efficient density peaks clustering of long molecular dynamics |
| Title according to SCOPUS: | RCDPeaks: Memory-efficient density peaks clustering of long molecular dynamics |
| Journal title: | Bioinformatics |
| Volume: | 38 |
| Issue: | 7 |
| Publisher: | Oxford University Press |
| Publication date: | 2022 |
| Last page: | 1869 |
| Language: | English |
| DOI: | 10.1093/bioinformatics/btac021 |
| Notes: | ISI, SCOPUS |