Releases: Ecogenomics/GTDBTk
2.4.0
Bug Fixes:
- (#576) When all genomes fail the prodigal step in the classify_wf, the bac120 summary file is still produced, with all the failed genomes listed as 'Unclassified'.
- (#573) When running the 3 classify steps independently, a genome can be filtered out in the align step but still be classified in the identify step. To avoid duplicated rows, the genome is classified with a warning.
- (#540) Empty files are skipped during the sketch step of Mash; they are then caught in the prodigal step and returned as 'Unclassified'.
- (#549) --force has been modified to deal with #540. Prodigal wasn't returning the empty files as failed genomes; it was only skipping them. These genomes are now returned in the summary file and flagged as Unclassified.
Major Changes:
- FastANI has been replaced by skani as the primary tool for computing Average Nucleotide Identity (ANI). Users may notice slight variations in the results compared to those obtained using FastANI.
- In the generated summary.tsv files, several columns have been renamed for clarity and consistency. The following columns have been affected:
  - "fastani_reference" column has been renamed to "closest_genome_reference".
  - "fastani_reference_radius" column has been renamed to "closest_genome_reference_radius".
  - "fastani_taxonomy" column has been renamed to "closest_genome_taxonomy".
  - "fastani_ani" column has been renamed to "closest_genome_ani".
  - "fastani_af" column has been renamed to "closest_genome_af".

  These changes have been implemented to improve the readability and understanding of the data within the summary.tsv files. Users should update their scripts or processes accordingly to reflect these renamed column headers.
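For scripts that must read summary files produced both before and after 2.4.0, one option is to map the old headers onto the new ones at read time. The following is a minimal sketch, not part of GTDB-Tk: the helper names (`normalise_header`, `read_summary`) and the two-column sample data are hypothetical, and only the five renamed column names come from the release notes.

```python
import csv
import io

# Old -> new column names, as listed in the 2.4.0 release notes.
RENAMED_COLUMNS = {
    "fastani_reference": "closest_genome_reference",
    "fastani_reference_radius": "closest_genome_reference_radius",
    "fastani_taxonomy": "closest_genome_taxonomy",
    "fastani_ani": "closest_genome_ani",
    "fastani_af": "closest_genome_af",
}


def normalise_header(fieldnames):
    """Return the header with pre-2.4.0 names mapped to their 2.4.0 names."""
    return [RENAMED_COLUMNS.get(name, name) for name in fieldnames]


def read_summary(handle):
    """Read a tab-separated summary file, yielding dicts keyed by 2.4.0 names."""
    reader = csv.reader(handle, delimiter="\t")
    header = normalise_header(next(reader))
    for row in reader:
        yield dict(zip(header, row))


# Hypothetical two-column excerpt standing in for a pre-2.4.0 summary file.
old_summary = io.StringIO("user_genome\tfastani_ani\ngenome_1\t98.7\n")
rows = list(read_summary(old_summary))
print(rows[0]["closest_genome_ani"])
```

Headers that are not in the mapping pass through unchanged, so the same reader also accepts files that already use the new names.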
2.3.2
2.3.1
2.3.0
Bug Fixes:
- (#508) (#509) If ALL genomes for a specific domain are either filtered out or classified with ANI, they are now reported in the summary file.
Minor changes:
- (#491) (#498) Allow GTDB-Tk to show --help and -v without GTDBTK_DATA_PATH being set.
  - WARNING: This is a breaking change if you are importing GTDB-Tk as a library and importing values from gtdbtk.config.config. Instead, you need to import as from gtdbtk.config.common import CONFIG, then access values via CONFIG.<var>
- (#508) Mash distance is changed from 0.1 to 0.15. This will increase the number of FastANI comparisons but will cover cases where genomes have a larger Mash distance but a small ANI.
- (#497) Add a convert_to_species function in GTDB-Tk to replace GCA/GCF ids with their GTDB species name
- Add --db_version flag to check_install to check the version of previous GTDB-Tk packages.
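A common way to let --help and -v run without GTDBTK_DATA_PATH is to defer reading the environment variable until a configuration value is actually used. The sketch below illustrates that general pattern only. The `LazyConfig` class, its `data_path` attribute and `show_help` are hypothetical and are not GTDB-Tk's actual code; only the variable name GTDBTK_DATA_PATH and the CONFIG.<var> access style come from the notes above.

```python
import os


class LazyConfig:
    """Hypothetical stand-in for a config object; not GTDB-Tk's real class."""

    @property
    def data_path(self):
        # The environment variable is read on attribute access, not at import.
        try:
            return os.environ["GTDBTK_DATA_PATH"]
        except KeyError:
            raise RuntimeError("GTDBTK_DATA_PATH is not set") from None


# Creating the object never touches the environment.
CONFIG = LazyConfig()


def show_help():
    # Does not access CONFIG, so it works without GTDBTK_DATA_PATH.
    return "usage: tool [--help] [-v]"
```

With module-level constants, the lookup would run at import time and fail before any argument parsing. With an object, the failure is postponed to the first access of CONFIG.<var>.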
2.2.6
Bug Fixes:
- (#493) Fix issue with --full-tree flag (related to skipping ANI steps)
Minor changes:
- Change URL for documentation to 'https://ecogenomics.github.io/GTDBTk/installing/index.html'
- Improve portability of the ANI_screen step by regenerating the paths of reference genomes in the current filesystem for mash_db.msh
2.2.5
Bug Fixes:
- gtdbtk.json is now reset when the pipeline is rerun and the status of ani_screen is not 'complete'
Minor changes:
- When using --genes, ANI steps are skipped and warnings are raised to inform the user that classification is less accurate.
- (#486) Environment variables can be used in GTDBTK_DATA_PATH
- is_consistent function in mash.py compares only the filenames, not the full paths
- Add cutoff arguments to PfamScan (Thanks @AroneyS for the contribution)
2.2.4
Bug Fixes:
- (#475) If all genomes are classified using ANI, GTDB-Tk will skip the identify and align steps
Minor changes:
- Add hidden '--skip_pplacer' flag to skip pplacer step (useful for debugging)
- Improve documentation
- Convert stage_logger to a Singleton class
- Use existing ANI results if available