From 11941c828b13dc019df5e2e55e5b16a79c4e17c4 Mon Sep 17 00:00:00 2001
From: Derrick Wood
Date: Thu, 19 Oct 2017 22:11:19 -0400
Subject: [PATCH] More prep for 1.0

---
 docs/MANUAL.html     | 2 +-
 docs/MANUAL.markdown | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/MANUAL.html b/docs/MANUAL.html
index 13aa65b..4ac90d2 100644
--- a/docs/MANUAL.html
+++ b/docs/MANUAL.html
@@ -346,7 +346,7 @@

Kraken Environment Variables

will use /data/kraken_dbs/mainDB to classify sequences.fa.

Upgrading Databases to v0.10+

-The minimizer ordering in Kraken versions prior to v0.10.0-beta was a simple lexicographical ordering that provided a suboptimal distribution of k-mers within the bins. Ideally, the bin sizes would be uniform, but simple lexicographical ordering creates a bias toward low-complexity minimizers. To resolve this, the ordering is now "scrambled" by XORing all minimizers with a predefined constant to toggle half of each minimizer's bits before sorting. The more evenly distributed bins provide better caching performance, but databases created in this way are not compatible with earlier versions of Kraken. Kraken versions from v0.10.0-beta up to (but not including) v1.0 will support the use of the older databases, but we nonetheless recommend one of the two following options:

+The minimizer ordering in Kraken versions prior to v0.10.0-beta was a simple lexicographical ordering that provided a suboptimal distribution of k-mers within the bins. Ideally, the bin sizes would be uniform, but simple lexicographical ordering creates a bias toward low-complexity minimizers. To resolve this, the ordering is now "scrambled" by XORing all minimizers with a predefined constant to toggle half of each minimizer's bits before sorting. The more evenly distributed bins provide better caching performance, but databases created in this way are not compatible with earlier versions of Kraken. Kraken versions from v0.10.0-beta up to (and including) v1.0 will support the use of the older databases, but we nonetheless recommend one of the two following options:

  1. Build a new database. This is the preferred option, as a newly-created database will have the latest genomes and NCBI taxonomy information.

  2. Re-sort an existing database. If you have a custom database, you may want to simply reformat the database to provide you with Kraken's increased speed. To do so, you'll need to do the following:

diff --git a/docs/MANUAL.markdown b/docs/MANUAL.markdown
index 8b84a59..1980a3b 100644
--- a/docs/MANUAL.markdown
+++ b/docs/MANUAL.markdown
@@ -730,7 +730,7 @@ minimizers with a predefined constant to toggle half of each minimizer's
 bits before sorting. The more evenly distributed bins provide better
 caching performance, but databases created in this way are not compatible
 with earlier versions of Kraken. Kraken versions from v0.10.0-beta up to
-(but not including) v1.0 will support the use of the older databases, but
+(and including) v1.0 will support the use of the older databases, but
 we nonetheless recommend one of the two following options:
 
 1) Build a new database. This is the preferred option, as a newly-created