Releases: raufs/skDER
v1.1.0
- Introduce the ability to specify the GTDB release, and make GTDB R220 the default when users request to auto-download and include all genomes from a particular genus/species.
- Remove the need to symlink genomes locally; fastx index files are now written in the same folder as the input genomes and deleted afterwards.
- N50 computation is now parallelized by splitting the genomes across the allocated CPUs, so at most X files are written at a time, where X is the number of CPUs. This addresses #4.
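The chunking idea in the last item can be sketched as follows. This is a minimal illustration, not skDER's actual code: the function names `n50` and `split_into_chunks` are hypothetical, and contig lengths are assumed to be already available as integers.

```python
from typing import List, Sequence


def n50(contig_lengths: Sequence[int]) -> int:
    """Return the N50: the length of the contig at which the running sum of
    contigs (sorted longest first) reaches at least half the total length."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def split_into_chunks(items: Sequence[str], n_chunks: int) -> List[List[str]]:
    """Split items into at most n_chunks groups (hypothetical helper).

    With one group per CPU, each worker handles one group and writes one
    output file, so no more than n_chunks files are open at once.
    """
    n_chunks = max(1, min(n_chunks, len(items)))
    # Round-robin assignment keeps group sizes within one item of each other.
    return [list(items[i::n_chunks]) for i in range(n_chunks)]
```

Each chunk could then be handed to a separate worker process; how skDER dispatches the work is not described in these notes.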
v1.0.10
- Minor change: added a new argument to use https://ftp.ncbi.nlm.nih.gov/genomes instead of https://ftp.ncbi.nih.gov/genomes in case there are issues with connecting to the latter. This gets passed to ncbi-genome-download's -u argument.
Full Changelog: v1.0.9...v1.0.10
v1.0.9
- Support for gzipped files added (#4)
- GTDB/NCBI downloaded genomes are now kept in gzip form
- FASTA files ending in *.fas now allowed (#4)
- If local input genomes are provided, default behavior is now to symlink files in the skDER results directory and do indexing for N50 calculation there.
- FASTA confirmation is now optional. It is currently iterative and can take a while if there are a lot of files; it might be parallelized in the future and turned back on as default.
Full Changelog: v1.0.8...v1.0.9
v1.0.8
- Fix broken GTDB-based downloading feature.
- Polish names for genomic assemblies downloaded based on GTDB species names.
Full Changelog: v1.0.7...v1.0.8
v1.0.7
- Corrected faulty usage of the -s option in skani triangle and now set it to the default value. This should now result in the more accurate ANI estimates being used for the dereplication methods as intended.
- Updated stats and runtime info for running dynamic/greedy approaches on the Wiki.
- Added new secondary clustering option, -n, which will report the relation/distance of all genomes in the input set to their nearest representative genome.
Full Changelog: v1.0.6...v1.0.7
v1.0.6
- Mostly just updates to the README & help function.
- Added missing library import statements in util.py.
Full Changelog: v1.0.5...v1.0.6
v1.0.5
v1.0.4
v1.0.2
updates for v.1.0.2
- KEY: Correct overflow issue in C++ code related to integer multiplication in computing scores for dynamic dereplication approach
- Introduce a greedy set cover dereplication approach as an alternate method
- Improve code + documentation organization
- Add test case
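For readers unfamiliar with the greedy set cover idea named in this release, here is a generic textbook sketch. It is not necessarily how skDER implements it: the input `covers` is a hypothetical mapping from each genome to the set of genomes it is similar enough to represent (assumed to include itself), and ties are broken alphabetically purely for determinism.

```python
from typing import Dict, List, Set


def greedy_set_cover(covers: Dict[str, Set[str]]) -> List[str]:
    """Pick representatives greedily until every genome is covered.

    covers[g] is the (hypothetical) set of genomes that g can represent;
    each set is assumed to contain g itself.
    """
    uncovered: Set[str] = set(covers)
    for members in covers.values():
        uncovered |= members

    chosen: List[str] = []
    while uncovered:
        # Choose the genome that covers the most still-uncovered genomes.
        best = max(sorted(covers), key=lambda g: len(covers[g] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            break  # remaining genomes cannot be covered by any candidate
        chosen.append(best)
        uncovered -= gained
    return chosen


if __name__ == "__main__":
    example = {
        "A": {"A", "B", "C"},
        "B": {"A", "B"},
        "C": {"A", "C"},
        "D": {"D"},
    }
    print(greedy_set_cover(example))
```

In this toy input, genome A covers three genomes and is picked first, after which only D remains uncovered.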
Full Changelog: v1.0.1...v1.0.2
v1.0.1
- Create directory with representative genomes in the output directory.
- Add version flag; change input for --genomes argument from accepting a directory to accepting multiple paths to genome files.
- Update Enterococcus dereplication showcasing.
Full Changelog: v1.0...v1.0.1