CyanoSeq: a new curated reference database of cyanobacterial 16S rRNA sequences
Start Date
23-5-2022 5:45 PM
End Date
23-5-2022 7:00 PM
Abstract
Since the advent of next-generation sequencing (NGS) methods, large-scale metagenomic studies using targeted genes have become the most common approach for studying microbial community diversity. NGS methods provide insights not captured by traditional methods, allowing for a better understanding of microbial communities. However, diversity assessments require reliable reference databases for accurate assignment of taxonomic composition. While curated databases exist for bacteria, the phylum Cyanobacteria remains poorly curated within them. The taxonomic rankings of Cyanobacteria provided within bacterial databases can be incorrect because the cyanobacterial taxonomy in these databases is not comprehensive, lacks many well-described cyanobacteria, and does not resolve polyphyletic genera. This combination of incomplete and incorrect taxonomy can lead to incorrect taxonomic assignment of metagenomic reads, causing misinterpretation of the cyanobacterial community. To ameliorate these issues, we propose “CyanoSeq”, a curated database of cyanobacterial 16S rRNA sequences for taxonomic assignment of metagenomic reads. CyanoSeq is assembled from 16S rRNA sequences found within NCBI, with their taxonomies curated from the cyanobacterial taxonomic literature as well as from a systematic assessment of uncharacterized cyanobacterial sequences. In addition to curated cyanobacterial sequences, CyanoSeq includes plastid, bacterial, and Melainabacteria sequences for more robust phylogenetic interpretations.