GSER (a Genome Size Estimator using R): A pipeline for quality assessment of sequenced genome libraries through genome size estimation
Keywords: genome assembly; genome library; genome size estimation; k-mer; quality control
Abstract
The first step in any genome research project, once the read data have been obtained, is quality control of the sequenced reads. In a de novo genome assembly project, the second step is to estimate two important features, the genome size and the 'best k-mer', before starting assembly tests with different de novo assembly software and parameters. However, quality control of the sequenced genome libraries as a whole, rather than of the reads alone, is frequently overlooked and is recognized as important only when the assembly tests fail to deliver the expected results. We have developed GSER (a Genome Size Estimator using R), a pipeline that evaluates the relationship between k-mers and genome size as a means of quality assessment of the sequenced genome libraries. GSER generates a set of charts that allow the analyst to evaluate the library datasets before starting the assembly. The script that runs the pipeline can be downloaded from http://www.mobilomics.org/GSER/downloads or http://github.com/mobilomics/GSER.
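The relationship between k-mers and genome size mentioned above can be illustrated with the widely used k-mer depth heuristic: genome size is approximately the total number of k-mer occurrences divided by the depth at the main peak of the k-mer frequency histogram. The sketch below is a minimal illustration of that general idea in Python. It is not GSER's code (GSER is an R-based pipeline whose internals are not described in this record), and the function names and the `min_depth` error cutoff are assumptions made for the example.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-length substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def kmer_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


def estimate_genome_size(hist, min_depth=2):
    """Return (estimated size, peak depth) from a k-mer histogram.

    Assumption: k-mers seen fewer than min_depth times are sequencing
    errors and are ignored. The peak is the remaining multiplicity with
    the most distinct k-mers (ties go to the higher depth).
    """
    solid = {d: n for d, n in hist.items() if d >= min_depth}
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(solid, key=lambda d: (solid[d], d))
    total_kmers = sum(d * n for d, n in solid.items())
    return total_kmers / peak_depth, peak_depth


# Hypothetical histogram: 50 error k-mers at depth 1, a main peak of
# 100 k-mers at depth 10, and 5 repeated k-mers at depth 20.
toy_hist = {1: 50, 10: 100, 20: 5}
size, depth = estimate_genome_size(toy_hist)
print(size, depth)  # 110.0 10
```

A library whose estimate shifts strongly as k changes, or that differs markedly from the other libraries of the same sample, is the kind of signal a library-level quality check is meant to surface before assembly starts.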
More information
| Title according to SCOPUS: | GSER (a Genome Size Estimator using R): A pipeline for quality assessment of sequenced genome libraries through genome size estimation |
| Journal title: | Interface Focus |
| Volume: | 11 |
| Issue: | 4 |
| Publisher: | Royal Society Publishing |
| Publication date: | 2021 |
| Language: | English |
| DOI: | 10.1098/rsfs.2020.0077 |
| Notes: | SCOPUS |