Update falco to v1.2.4 #6377

Merged
merged 8 commits into from
Sep 27, 2024
131 changes: 95 additions & 36 deletions tools/falco/falco.xml
@@ -1,7 +1,7 @@
<tool id="falco" name="Falco" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="21.05">
<description>An alternative, more performant implementation of FastQC for high throughput sequence quality control</description>
<macros>
<token name="@TOOL_VERSION@">1.2.3</token>
<token name="@TOOL_VERSION@">1.2.4</token>
<token name="@VERSION_SUFFIX@">0</token>
</macros>
<xrefs>
@@ -74,28 +74,81 @@
</data>
</outputs>
<tests>
<!-- Test with fastq input -->
<test expect_num_outputs="2">
<param name="input_file" value="1000trimmed.fastq"/>
<output name="html_file" file="fastqc_report.html" ftype="html" lines_diff="2"/>
<output name="text_file" file="fastqc_data.txt" ftype="txt"/>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq - report.+"/>
</assert_contents>
</output>
<!-- two lines diff to allow for reported version to change -->
<output name="text_file" file="fastqc_data.txt" ftype="txt" lines_diff="2"/>
</test>
<!-- Test with fastq.gz input -->
<test expect_num_outputs="2">
<param name="input_file" value="1000trimmed.fastq.gz"/>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq_gz - report.+"/>
</assert_contents>
</output>
<!-- four lines diff to allow for reported version to change; two more to accommodate changed input file name -->
<output name="text_file" file="fastqc_data.txt" ftype="txt" lines_diff="4"/>
</test>
<!-- Test with BAM input -->
<test expect_num_outputs="2">
<param name="input_file" value="hisat_output_1.bam"/>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; hisat_output_1_bam - report.+"/>
</assert_contents>
</output>
<!-- four lines diff to allow for reported version to change; two more to accommodate changed input file name -->
<output name="text_file" file="fastqc_data_hisat.txt" ftype="txt" lines_diff="4"/>
</test>
<!-- Test summary file option -->
<test expect_num_outputs="3">
<param name="input_file" value="1000trimmed.fastq"/>
<param name="generate_summary" value="true"/>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq - report.+"/>
</assert_contents>
</output>
<output name="text_file" file="fastqc_data.txt" ftype="txt" lines_diff="2"/>
<output name="summary_file" file="fastqc_data_summary.txt" ftype="txt"/>
</test>
<test expect_num_outputs="2">
<param name="input_file" value="1000trimmed.fastq"/>
<param name="contaminants" value="contaminant_list.txt" ftype="tabular"/>
<output name="html_file" file="fastqc_report_contaminants.html" ftype="html" lines_diff="2"/>
<output name="text_file" file="fastqc_data_contaminants.txt" ftype="txt"/>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq - report.+"/>
</assert_contents>
</output>
<output name="text_file" file="fastqc_data_contaminants.txt" ftype="txt" lines_diff="2"/>
</test>
<test expect_num_outputs="2">
<param name="input_file" value="1000trimmed.fastq"/>
<param name="adapters" value="adapter_list.txt" ftype="tabular"/>
<output name="html_file" file="fastqc_report_adapters.html" ftype="html" lines_diff="2"/>
<output name="text_file" file="fastqc_data_adapters.txt" ftype="txt"/>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq - report.+"/>
</assert_contents>
</output>
<output name="text_file" file="fastqc_data_adapters.txt" ftype="txt" lines_diff="2"/>
</test>
<test expect_num_outputs="2">
<test expect_num_outputs="3">
<param name="input_file" value="1000trimmed.fastq"/>
<param name="limits" value="limits.txt" ftype="txt"/>
<output name="html_file" file="fastqc_report_customlimits.html" ftype="html" lines_diff="2"/>
<output name="text_file" file="fastqc_data_customlimits.txt" ftype="txt"/>
<param name="generate_summary" value="true"/>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq - report.+"/>
</assert_contents>
</output>
<output name="summary_file" file="fastqc_data_customlimits_summary.txt" ftype="txt"/>
</test>
<!-- ## The kmers param is ignored in Falco and always set to 7. If this ever gets reconsidered, this test could be uncommented.
<test expect_num_outputs="2">
@@ -116,40 +169,49 @@
<output name="html_file" file="fastqc_report_min_length.html" ftype="html" lines_diff="2"/>
<output name="text_file" file="fastqc_data_min_length.txt" ftype="txt"/>
</test> -->
<test expect_num_outputs="3">
<test expect_num_outputs="2">
<param name="input_file" value="1000trimmed.fastq" ftype="fastq"/>
<param name="nogroup" value="--nogroup"/>
<param name="generate_summary" value="true"/>
<output name="html_file" file="fastqc_report_nogroup.html" ftype="html" lines_diff="2"/>
<output name="text_file" file="fastqc_data_nogroup.txt" ftype="txt"/>
<output name="summary_file" file="fastqc_data_nogroup_summary.txt" ftype="txt"/>
<assert_command>
<has_text text="--nogroup"/>
</assert_command>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq - report.+"/>
</assert_contents>
</output>
<output name="text_file" file="fastqc_data_nogroup.txt" ftype="txt" lines_diff="2"/>
</test>
<test expect_num_outputs="3">
<param name="input_file" value="1000trimmed.fastq"/>
<param name="subsample" value="10"/>
<param name="generate_summary" value="true"/>
<output name="html_file" file="fastqc_report_subsample.html" ftype="html" lines_diff="2"/>
<output name="text_file" file="fastqc_report_subsample.txt" ftype="txt"/>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq - report.+"/>
</assert_contents>
</output>
<output name="text_file" file="fastqc_report_subsample.txt" ftype="txt" lines_diff="2"/>
<output name="summary_file" file="fastqc_report_subsample_summary.txt" ftype="txt"/>
</test>
<test expect_num_outputs="3">
<param name="input_file" value="1000trimmed.fastq"/>
<param name="bisulfite" value="-bisulfite"/>
<param name="generate_summary" value="true"/>
<output name="html_file" file="fastqc_report_bisulfite.html" ftype="html" lines_diff="2"/>
<output name="text_file" file="fastqc_report_bisulfite.txt" ftype="txt"/>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq - report.+"/>
</assert_contents>
</output>
<output name="text_file" file="fastqc_report_bisulfite.txt" ftype="txt" lines_diff="2"/>
<output name="summary_file" file="fastqc_report_bisulfite_summary.txt" ftype="txt"/>
</test>
<test expect_num_outputs="3">
<test expect_num_outputs="2">
<param name="input_file" value="1000trimmed.fastq"/>
<param name="reverse_complement" value="-reverse-complement"/>
<param name="generate_summary" value="true"/>
<output name="html_file" file="fastqc_report_reverse_complement.html" ftype="html" lines_diff="2"/>
<output name="text_file" file="fastqc_report_reverse_complement.txt" ftype="txt"/>
<output name="summary_file" file="fastqc_report_reverse_complement_summary.txt" ftype="txt"/>
<output name="html_file" ftype="html">
<assert_contents>
<has_line_matching expression="&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq - report.+"/>
</assert_contents>
</output>
<output name="text_file" file="fastqc_report_reverse_complement.txt" ftype="txt" lines_diff="2"/>
</test>
</tests>
<help><![CDATA[
@@ -159,15 +221,15 @@ Falco_ is a high-speed emulation of the popular FastQC software for quality cont

💚️ With its superior performance Falco saves computational resources and gives you back results faster than FastQC.

We recommend it for most use cases (but see below for exceptions). 💚️
We recommend it for most use cases (but see below for rare exceptions). 💚️

The main functions of Falco are very similar to those of FastQC:

- Import of data from BAM, SAM or FastQ/FastQ.gz files (any variant),
- Providing a quick overview to tell you in which areas there may be problems
- Summary graphs and tables to quickly assess your data
- Export of results to an HTML based permanent report
- Offline operation to allow automated generation of reports without running the interactive application
- Export of results to an HTML-based report


.. class:: infomark

@@ -181,13 +243,10 @@ The plain text report generated by Falco can be used as a "FastQC" report in Mul

Falco doesn't currently support fastq.bz2 as an input format, so Galaxy has to perform a relatively slow format conversion before running the tool, which makes the overall analysis slower than with FastQC.

- you are interested in PolyA and PolyG statistics in the Adapter Content section of the quality report

Falco doesn't currently calculate statistics for these "Adapters" by default.

- your input consists of *mapped* reads in SAM/BAM format
- you need the HTML report to be viewable offline

Due to a bug in the current version of Falco, reads mapped to the reverse strand of the reference genome are not handled correctly and reported metrics are wrong!
The current version of Falco relies on plotly to generate the graphs in the HTML report dynamically each time it's viewed.
MultiQC plots generated from Falco's raw data output are, of course, viewable offline just like the ones generated from FastQC output.

-----

Binary file removed tools/falco/test-data/1000trimmed.fastq.bz2
Binary file not shown.
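
The updated tests replace whole-file HTML comparisons with `has_line_matching` assertions on the report title. As a rough sketch of what that expression checks, the Python below unescapes the XML entities and tests the resulting regular expression against each line. It assumes the expression has to match a complete line; the helper `line_matches` and the sample HTML are hypothetical and made up for illustration, not taken from Galaxy or from Falco's actual output.

```python
import html
import re

# Expression copied from the tests in falco.xml, still XML-escaped.
ESCAPED = "&lt;html&gt;&lt;head&gt;.+&lt;title&gt; 1000trimmed_fastq - report.+"


def line_matches(expression: str, text: str) -> bool:
    """Return True if any single line of `text` fully matches `expression`.

    Hypothetical helper: it only approximates a line-matching assertion,
    assuming the expression must cover the whole line.
    """
    pattern = re.compile(expression)
    return any(pattern.fullmatch(line) for line in text.splitlines())


# Undo the XML escaping to get the regular expression itself.
pattern_text = html.unescape(ESCAPED)

# Made-up report content for illustration only (not real Falco output).
sample = (
    '<html><head><meta charset="utf-8">'
    "<title> 1000trimmed_fastq - report</title></head>\n"
    "<body></body></html>"
)

print(pattern_text)
print(line_matches(pattern_text, sample))
```

Because the title embeds the input file name, the same expression would not match a report generated from `1000trimmed.fastq.gz`, which is presumably why the gzip and BAM tests carry their own expressions.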