forked from s-andrews/nextflow_pipelines
-
Notifications
You must be signed in to change notification settings - Fork 2
/
nf_qc
executable file
·154 lines (111 loc) · 7.47 KB
/
nf_qc
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
#!/usr/bin/env nextflow
nextflow.enable.dsl=2
params.outdir = "."
params.genome = ""
params.verbose = false
params.single_end = false // default mode is auto-detect. NOTE: params are handed over automatically
params.fastqc_args = ''
params.fastq_screen_args = ''
params.multiqc_args = ''
params.help = false
// Show help message and exit
if (params.help){
helpMessage()
exit 0
}
if (params.verbose){
println ("[WORKFLOW] FASTQC ARGS: " + params.fastqc_args)
println ("[WORKFLOW] FASTQ SCREEN ARGS ARE: " + params.fastq_screen_args)
println ("[WORKFLOW] MULTIQC ARGS: " + params.multiqc_args)
}
include { makeFilesChannel; getFileBaseNames } from './nf_modules/files.mod.nf'
include { FASTQC } from './nf_modules/fastqc.mod.nf'
include { FASTQ_SCREEN } from './nf_modules/fastq_screen.mod.nf'
include { MULTIQC } from './nf_modules/multiqc.mod.nf'
file_ch = makeFilesChannel(args)
workflow {
main:
FASTQC (file_ch, params.outdir, params.fastqc_args, params.verbose)
FASTQ_SCREEN (file_ch, params.outdir, params.fastq_screen_args, params.verbose)
// merging channels for MultiQC
multiqc_ch = FASTQC.out.report.mix(
FASTQ_SCREEN.out.report.ifEmpty([]),
).collect()
// multiqc_ch.subscribe { println "Got: $it" }
MULTIQC (multiqc_ch, params.outdir, params.multiqc_args, params.verbose)
}
// Since workflows with very long command lines tend to fail to get rendered at all, I was experimenting with a
// minimal execution summary report so we at least know what the working directory was...
workflow.onComplete {
def msg = """\
Pipeline execution summary
---------------------------
Jobname : ${workflow.runName}
Completed at: ${workflow.complete}
Duration : ${workflow.duration}
Success : ${workflow.success}
workDir : ${workflow.workDir}
exit status : ${workflow.exitStatus}
"""
.stripIndent()
sendMail(to: "${workflow.userName}@babraham.ac.uk", subject: 'Minimal pipeline execution report', body: msg)
}
def helpMessage() {
log.info"""
>>
SYNOPSIS:
This workflow takes in a list of filenames (in FastQ format), and runs them through the three QC tools FastQC, FastQ Screen
as well as MultiQC (on the Babraham stone compute cluster).
This QC workflow is ideally used for data of which you don't know too much about, such as data downloaded from a public repository.
Both FastQC and FastQ Screen results are then summarised in one convenient QC report called 'multiqc_report.html'. To add additional
parameters, please consider specifying tool-specific arguments (see --tool_name_args="[str]" below).
==============================================================================================================
USAGE:
nf_qc [options] <input files>
Mandatory arguments:
====================
<input_files> List of input files, e.g. '*fastq.gz' or '*fq.gz'. In theory, files will automatically be
processed as single-end or paired end files (if file pairs share the same base-name, and
differ only by a different read number, e.g. 'base_name_R1.fastq.gz' and 'base_name_R2.fastq.gz'
(or R3, R4). For paired-end files, only Read 1 is run through FastQ Screen (as typically R1
and R2 produce nearly identical contamination profiles). To run all specifed files (i.e. even
Read 2 of paired-end files) through FastQ Screen, please see the option '--single_end' below.
Tool-specific options:
======================
--fastqc_args="[str]" This option can take any number of options that are compatible with FastQC to modify its default
behaviour. For more detailed information on available options please refer to the FastQC documentation,
or run 'fastqc --help' on the command line. As an example, to run FastQC without grouping of bases if
reads are >50bp and use a specific file with non-default adapter sequences, use
' --fastqc_args="--nogroup --adapters ./non_default_adapter_file.txt" '. Please note that the format:
="your options" needs to be strictly adhered to in order to work correctly. [Default: None]
--fastq_screen_args="[str]" This option can take any number of options that are compatible with FastQ Screen to modify its
default mapping behaviour. For more detailed information on available options please refer
to the FastQ Screen documentation, or run 'fastq_screen --help' on the command line. For instance,
to process a bisulfite converted library with fairly relaxed parameters, you could use:
' --fastq_screen_args="--bisulfite --score_min L,0,-0.6" ' (remember that bisfulfite files should
be adapter- and quality trimmed prior to running any alignments). Please note that the format
="your options" needs to be strictly adhered to in order to work correctly.
[Default: None]
Other options:
==============
--outdir [str] Path to the output directory. [Default: current working directory]
--single_end Force files of a read pair to be treated as single-end files. [Default: auto-detect]
--bisulfite Can be used directly to indicate data is Bisulfite-seq.
--verbose More verbose status messages. [Default: OFF]
--help Displays this help message and exits.
Workflow options:
=================
Please note the single '-' hyphen for the following options!
-resume If a pipeline workflow has been interrupted or stopped (e.g. by accidentally closing a laptop),
this option will attempt to resume the workflow at the point it got interrupted by using
Nextflow's caching mechanism. This may save a lot of time.
-bg Sends the entire workflow into the background, thus disconnecting it from the terminal session.
This option launches a daemon process (which will keep running on the headnode) that watches over
your workflow, and submits new jobs to the SLURM queue as required. Use this option for big pipeline
jobs, or whenever you do not want to watch the status progress yourself. Upon completion, the
pipeline will send you an email with the job details. This option is HIGHLY RECOMMENDED!
-process.executor=local Temporarily changes where the workflow is executed to the 'local' machine. See also the nextflow.config
file for more details. [Default: slurm]
<<
""".stripIndent()
}