Original title: ganon2: up-to-date and scalable metagenomics analysis
Authors: Vitor C. Piro, Knut Reinert
In this article, the authors discuss how the rapid growth of public DNA sequence repositories affects metagenomics applications. The repositories are expanding quickly, but the computational resources needed to use them effectively are not keeping pace. As a result, current metagenomics analysis methods struggle to handle the volume of available data.
To address this issue, the authors introduce ganon2, a sequence classification method that improves performance and usability in metagenomics analysis. It indexes large datasets with a small memory footprint while keeping classification fast and accurate. By combining a novel data structure with various optimizations, the authors obtain indices that are significantly smaller than those of existing methods.
The effectiveness of ganon2 was demonstrated on simulated samples from various studies, including the CAMI 1+2 challenge. In taxonomic binning, ganon2 achieved higher median F1-Scores than the other methods. In taxonomic profiling, it also improved the median F1-Score while maintaining a balanced error in abundance estimation.
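For readers unfamiliar with the metric, the F1-Score is the harmonic mean of precision and recall, and the summary above reports its median across samples. The following Python sketch shows that arithmetic only. The function names and the per-sample counts are hypothetical and chosen for illustration; they are not results from the article, and the exact evaluation protocol used by the authors (for example, the taxonomic rank at which matches are counted) is not described here.

```python
from statistics import median


def f1_score(true_positives, false_positives, false_negatives):
    """Return the F1-Score (harmonic mean of precision and recall)."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    # Guard against division by zero when nothing was predicted or expected.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def median_f1(samples):
    """Median F1-Score over an iterable of (TP, FP, FN) tuples, one per sample."""
    return median(f1_score(tp, fp, fn) for tp, fp, fn in samples)


if __name__ == "__main__":
    # Hypothetical counts for three samples (illustration only).
    example_samples = [(8, 2, 2), (5, 5, 0), (0, 3, 3)]
    print(median_f1(example_samples))
```

Reporting the median rather than the mean keeps a single unusually easy or hard sample from dominating the comparison between tools.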
ganon2 is one of the fastest tools evaluated, and it enables researchers to use larger and more diverse reference sets in their daily microbiome analysis, leading to improved resolution of results. The code is open-source and available, together with documentation, through the GitHub link provided in the original article. The authors declare no competing interests.
Original article: https://www.biorxiv.org/content/10.1101/2023.12.07.570547v1