Add bwa-mem2-idx #6558
Open: duartetorreserick wants to merge 1 commit into galaxyproject:main from duartetorreserick:bwa-mem2-indexer
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
<tool id="bwa_mem2_idx" name="BWA-MEM2-INDEX" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | ||
<description>- creates indexes</description> | ||
<macros> | ||
<import>macros-index.xml</import> | ||
</macros> | ||
<expand macro="requirements"/> | ||
<command><![CDATA[ | ||
|
||
## Begin BWA-MEM command line | ||
echo "Index file running" > '$output' && | ||
mkdir '$output.files_path' && | ||
cd '$output.files_path' && | ||
bwa-mem2 index -p 'index' '${input_fasta}' | ||
]]></command> | ||
|
||
<inputs> | ||
<param name="input_fasta" type="data" label="Select a genome to index" help="Build an index for this FASTA sequence." format="fasta,fasta.gz"/> | ||
</inputs> | ||
|
||
<outputs> | ||
<data name="output" format="text" label="Test Testov"/> | ||
</outputs> | ||
|
||
<help><![CDATA[ | ||
**What it does** | ||
BWA-MEM2 is the new version of the bwa-mem algorithm in bwa. It produces alignments identical to bwa and is ~1.3-3.1x faster, depending on the use case, dataset, and machine. | ||
The algorithm is robust to sequencing errors and applicable to a wide range of sequence lengths from 70bp to a few megabases. | ||
|
||
The Galaxy implementation takes fastq files as input and produces output in BAM format, which can be further processed using various BAM utilities existing in Galaxy (BAMTools, SAMTools, Picard). | ||
|
||
----- | ||
|
||
**Indices: Selecting reference genomes for BWA** | ||
|
||
The Galaxy wrapper for BWA allows you to select between precomputed and user-defined indices for reference genomes using the **Will you select a reference genome from your history or use a built-in index?** flag. This flag has two options: | ||
|
||
1. **Use a built-in genome index** - when selected (this is the default), Galaxy provides the user with the **Select reference genome index** dropdown. Genomes listed in this dropdown have been pre-indexed with the bwa index utility and are ready to be mapped against. | ||
2. **Use a genome from the history and build index** - when selected, Galaxy provides the user with the **Select reference genome sequence** dropdown. This dropdown is populated by all FASTA formatted files listed in your current history. If your genome of interest is uploaded into the history, it will be shown there. Selecting a genome from this dropdown will cause Galaxy to first transparently index it using the `bwa index` command, and then run mapping with `bwa mem`. | ||
|
||
If your genome of interest is not listed here you have two choices: | ||
|
||
1. Contact the Galaxy team using the **Help->Support** link at the top of the interface and let us know that an index needs to be added | ||
2. Upload your genome of interest as a FASTA file to your Galaxy history and select the **Use a genome from the history and build index** option. | ||
|
||
----- | ||
|
||
**Galaxy-specific option** | ||
|
||
Galaxy allows three levels of control over bwa-mem options, provided by the **Select analysis mode** menu option. These are: | ||
|
||
1. *Simple Illumina mode*: The simplest possible bwa mem application, in which it aligns single or paired-end data to the reference using default parameters. It is equivalent to the following command: bwa mem <reference index> <fastq dataset1> [fastq dataset2] | ||
2. *PacBio mode*: The mode adjusted specifically for mapping of long PacBio subreads. Equivalent to the following command: bwa mem -k17 -W40 -r10 -A1 -B1 -O1 -E1 -L0 <reference index> <PacBio dataset in fastq format> | ||
3. *Full list of options*: Allows access to all options through Galaxy interface. | ||
|
||
----- | ||
|
||
**BAM sorting mode** | ||
|
||
The generated BAM files can be sorted according to three criteria: coordinates, names and input order. | ||
|
||
In coordinate sorted mode the reads are sorted by coordinates. It means that the reads from the beginning of the first chromosome are first in the file. | ||
|
||
When sorted by read name, the file is sorted by the read ID (i.e., the QNAME field). | ||
|
||
Finally, the *No sorted (sorted as input)* option yields a BAM file in which the records are sorted in an order corresponding to the order of the reads in the original input file. This option requires using a single thread to perform the conversion from SAM to BAM format, so the runtime is extended. | ||
|
||
|
||
@RG@ | ||
|
||
@info@ | ||
]]></help> | ||
</tool> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
<macros> | ||
<xml name="requirements"> | ||
<requirements> | ||
<requirement type="package" version="@TOOL_VERSION@">bwa-mem2</requirement> | ||
</requirements> | ||
</xml> | ||
|
||
|
||
|
||
<token name="@TOOL_VERSION@">2.2.1</token> | ||
<token name="@VERSION_SUFFIX@">0</token> | ||
<token name="@PROFILE@">21.05</token> | ||
</macros> |
Can you create a subclass of the directory datatype for this in https://github.com/galaxyproject/galaxy/blob/dev/lib/galaxy/config/sample/datatypes_conf.xml.sample?
Something like
Then you can move the index directory to `$output.files_path`, and in the same way the bwa_mem2 tool can consume the index directory from `$output.extra_files_path`.
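For illustration only, here is a rough sketch of how the suggestion above could look. It is not the snippet from the review comment. The extension name `bwa_mem2_index`, the `index` prefix inside the directory, and the output label are all assumptions made for this example, and the final datatype definition would need to be agreed on in the Galaxy repository.

```xml
<!-- Hypothetical entry for datatypes_conf.xml.sample.
     The extension name "bwa_mem2_index" is an assumption for this sketch. -->
<datatype extension="bwa_mem2_index" type="galaxy.datatypes.data:Directory" subclass="true" display_in_upload="false"/>

<!-- Hypothetical indexer tool fragment: write the index files into the
     output's extra files directory instead of a plain text dataset. -->
<command><![CDATA[
    mkdir -p '$output.extra_files_path' &&
    bwa-mem2 index -p '$output.extra_files_path/index' '$input_fasta'
]]></command>
<outputs>
    <!-- format refers to the assumed datatype extension declared above -->
    <data name="output" format="bwa_mem2_index" label="${tool.name} on ${on_string}: index"/>
</outputs>
```

Under these assumptions, a consuming tool with an input parameter named, say, `index` could pass `'$index.extra_files_path/index'` as the reference prefix to `bwa-mem2 mem`.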