Add biotools xref to MAFFT (#1342)
* Update macros.xml

* Update mafft.xml

* Update mafft-add.xml

* Increment version suffix

* Update mafft-add.xml

* Update mafft.xml

* fix bug in test3

* adapt test3 to multithreading
tuncK authored Oct 31, 2023
1 parent 7dca92b commit 1570f3a
Showing 3 changed files with 43 additions and 25 deletions.
8 changes: 7 additions & 1 deletion tools/mafft/macros.xml
@@ -1,7 +1,13 @@
<?xml version="1.0"?>
<macros>
<token name="@VERSION@">0</token>
<token name="@TOOL_VERSION@">7.508</token>
<token name="@VERSION_SUFFIX@">1</token>
<token name="@PROFILE@">22.01</token>
<xml name="biotools">
<xrefs>
<xref type="bio.tools">MAFFT</xref>
</xrefs>
</xml>
<xml name="requirements">
<requirements>
<requirement type="package" version="@TOOL_VERSION@">mafft</requirement>
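The version attribute in the tool files is assembled from the tokens defined in this hunk. As a rough illustration of how `@TOOL_VERSION@+galaxy@VERSION_SUFFIX@` would resolve, here is a minimal Python sketch. Galaxy performs this substitution itself when it loads the macros; the `expand_tokens` helper below is a hypothetical stand-in, not Galaxy's implementation.

```python
import re

# Token values copied from the macros.xml hunk above.
TOKENS = {
    "@TOOL_VERSION@": "7.508",
    "@VERSION_SUFFIX@": "1",
    "@PROFILE@": "22.01",
}


def expand_tokens(text, tokens):
    """Replace every @NAME@ token that has a known value; leave unknown ones untouched.

    Hypothetical helper for illustration only.
    """
    return re.sub(r"@[A-Z_]+@", lambda m: tokens.get(m.group(0), m.group(0)), text)


version = expand_tokens("@TOOL_VERSION@+galaxy@VERSION_SUFFIX@", TOKENS)
print(version)  # 7.508+galaxy1
```

Under these assumptions both tools would report version `7.508+galaxy1`, and `profile="@PROFILE@"` would resolve to `profile="22.01"`.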
5 changes: 3 additions & 2 deletions tools/mafft/mafft-add.xml
@@ -1,10 +1,11 @@
<?xml version="1.0" encoding="UTF-8"?>
<tool id="rbc_mafft_add" name="MAFFT add" version="@TOOL_VERSION@+galaxy@VERSION@">
<tool id="rbc_mafft_add" name="MAFFT add" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@">
<description>Align a sequence, alignment, or fragments to an existing alignment.</description>
<macros>
<import>macros.xml</import>
</macros>
<expand macro="requirements" />
<expand macro="biotools"/>
<expand macro="requirements"/>
<stdio>
<exit_code range="1:" level="fatal" description="Error occurred. Please check Tool Standard Error" />
<exit_code range=":-1" level="fatal" description="Error occurred. Please check Tool Standard Error" />
55 changes: 33 additions & 22 deletions tools/mafft/mafft.xml
@@ -1,9 +1,10 @@
<?xml version="1.0" encoding="UTF-8"?>
<tool id="rbc_mafft" name="MAFFT" version="@TOOL_VERSION@+galaxy@VERSION@">
<description>Multiple alignment program for amino acid or nucleotide sequences</description>
<tool id="rbc_mafft" name="MAFFT" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@">
<description>Multiple alignment program for amino acid or nucleotide sequences</description>
<macros>
<import>macros.xml</import>
</macros>
<expand macro="biotools"/>
<expand macro="requirements" />
<stdio>
<exit_code range="1:" level="fatal" description="Error occurred. Please check Tool Standard Error" />
@@ -229,32 +230,51 @@
<param name="outputFormat" value="--clustalout"/>
<output name="outputAlignment" ftype="clustal" file="mafft_nwns_result.aln" lines_diff="2" />
</test>
<!-- WARNING: the result of the following test depends on the number of threads.
The result seems deterministic for single-threaded execution, i.e. GALAXY_SLOTS=1 planemo test.
However, GitHub CI/CD uses 2 threads, and the results vary. -->
<test expect_num_outputs="1" >
<param name="inputSequences" value="sample.fa"/>
<param name="flavourType" value="custom"/>
<param name="matrix_condition" value="BLOSUM"/>
<conditional name="matrix_condition">
<param name="matrix" value="BLOSUM"/>
</conditional>
<param name="BLOSUM" value="62"/>
<param name="distance_method" value="--fastapair"/>
<param name="weighti" value="2.7"/>
<param name="iterations" value="1000"/>
<param name="outputFormat" value="--clustalout"/>
<output name="outputAlignment" ftype="clustal" file="mafft_custom_result.aln" lines_diff="2" />
<output name="outputAlignment" ftype="clustal" file="mafft_custom_result.aln" compare="sim_size">
<assert_contents>
<has_n_lines n="458" delta="0"/>
<has_text text="CLUSTAL format alignment by MAFFT F-INS-i"/>
<has_text text="NPIVYGISHPKY"/>
<has_text text="1=="/>
<has_text text="36=="/>
<has_line line="8=opsin, ------------------------------------------------------------"/>
</assert_contents>
</output>
</test>
</tests>
<help> <![CDATA[
**What it does**
MAFFT is a multiple sequence alignment program for unix-like operating systems.
It offers a range of multiple alignment methods, L-INS-i (accurate; for alignment of <∼200 sequences),
FFT-NS-2 (fast; for alignment of <∼30,000 sequences), etc.
From the MAFFT man page, an overview of the different predefined flavours of the tool.
From the MAFFT man page, an overview of the different predefined flavours of the tool is as follows:
**Accuracy-oriented methods:**
- L-INS-i (probably most accurate; recommended for <200 sequences; iterative refinement method incorporating local pairwise alignment information):
- mafft --localpair --maxiterate 1000 input [> output]
- G-INS-i (suitable for sequences of similar lengths; recommended for <200 sequences; iterative refinement method incorporating global pairwise alignment information):
- mafft --globalpair --maxiterate 1000 input [> output]
- E-INS-i (suitable for sequences containing large unalignable regions; recommended for <200 sequences):
- mafft --ep 0 --genafpair --maxiterate 1000 input [> output]. For E-INS-i, the --ep 0 option is recommended to allow large gaps.
**Speed-oriented methods:**
- FFT-NS-i (iterative refinement method; two cycles only):
- mafft --retree 2 --maxiterate 2 input [> output]
- FFT-NS-i (iterative refinement method; max. 1000 iterations):
@@ -271,23 +291,14 @@
- mafft --retree 1 --maxiterate 0 --nofft --parttree input [> output]
**Options:**
--auto
Automatically selects an appropriate strategy from L-INS-i, FFT-NS-i and FFT-NS-2, according to data size. Default: off (always FFT-NS-2)
--adjustdirection
Generate reverse complement sequences, as necessary, and align them together with the remaining sequences. In the case of protein alignment, these options are just ignored.
--op
Gap opening penalty, default: 1.53
--ep
Offset (works like gap extension penalty), default: 0.0
--maxiterate
Maximum number of iterative refinement, default: 0
--clustalout
Output: clustal format, default: fasta
--thread
Number of threads (if unsure, --thread -1)
--retree number
Guide tree is built number times in the progressive stage.
Valid with 6mer distance. Default: 2
- --auto Automatically selects an appropriate strategy from L-INS-i, FFT-NS-i and FFT-NS-2, according to data size. Default: off (always FFT-NS-2)
- --adjustdirection Generate reverse complement sequences, as necessary, and align them together with the remaining sequences. In the case of protein alignment, these options are just ignored.
- --op Gap opening penalty, default: 1.53
- --ep Offset (works like gap extension penalty), default: 0.0
- --maxiterate Maximum number of iterative refinement, default: 0
- --clustalout Output: clustal format, default: fasta
- --retree number Guide tree is built number times in the progressive stage. Valid with 6mer distance. Default: 2
]]>
</help>
<expand macro="citations" />
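The reworked third test replaces a near-exact file comparison with content assertions, so that thread-dependent differences in the alignment no longer fail the test. The sketch below is a simplified Python reading of what `has_n_lines`, `has_text`, and `has_line` check. The function bodies are assumptions written for illustration, not Galaxy's own code, and `sample` is a made-up toy string rather than real MAFFT output.

```python
def has_n_lines(text, n, delta=0):
    """True when the number of lines is within `delta` of `n`."""
    return abs(len(text.splitlines()) - n) <= delta


def has_text(text, needle):
    """True when `needle` occurs anywhere in the output."""
    return needle in text


def has_line(text, line):
    """True when some line of the output equals `line` exactly."""
    return line in text.splitlines()


# Hypothetical toy output, only to exercise the helpers above.
sample = "\n".join([
    "CLUSTAL format alignment by MAFFT F-INS-i",
    "",
    "1==seq_a  NPIVYGISHPKY",
    "36==seq_b NPIVYGISHPKY",
])

checks = [
    has_n_lines(sample, 4, delta=0),
    has_text(sample, "CLUSTAL format alignment by MAFFT F-INS-i"),
    has_text(sample, "NPIVYGISHPKY"),
    has_text(sample, "1=="),
    has_text(sample, "36=="),
    has_line(sample, "36==seq_b NPIVYGISHPKY"),
]
print(all(checks))  # True
```

Checks of this kind pin down the header, the line count, and a few stable fragments, while leaving room for the column-level variation the warning comment describes.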