nf-core/viralrecon: Changelog

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

[2.6.0] - 2023-03-23

Credits

Special thanks to the following for their code contributions to the release:

Friederike Hanssen
Hugo Tavares
James Fellows Yates
Jessica Wu
Matthew Wells
Maxime Garcia
Phil Ewels
Sara Monzón

Thank you to everyone else who has contributed by reporting bugs, suggesting enhancements or in any other way, shape or form.

Enhancements & fixes

[#297] - Add tube map for pipeline
[#316] - Variant calling isn't run when using --skip_asciigenome with metagenomic data
[#317] - ivar_variants_to_vcf: Ignore lines without annotation in ivar tsv file
[#320] - Pipeline fails at email step: Failed to invoke workflow.onComplete event handler
[#321] - ivar_variants_to_vcf script: Duplicated positions in tsv file due to overlapping annotations
[#334] - Longshot thread 'main' panicked at 'assertion failed: p <= 0.0' error
[#341] - artic/minion and artic/guppyplex: Update module version 1.2.2 -> 1.2.3
[#348] - Document full parameters of iVar consensus
[#349] - ERROR in Script plasmidID
[#356] - Add NEB SARS-CoV-2 primers
[#368] - Incorrect depth from ivar variants reported in variants long table
Updated pipeline template to nf-core/tools 2.7.2
Add tower.yml for Report rendering in Nextflow Tower
Use --skip_plasmidid by default

Parameters

Old parameter	New parameter
`--tracedir`

Software dependencies

Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.

Dependency	Old version	New version
`artic`	1.2.2	1.2.3
`bcftools`	1.15.1	1.16
`blast`	2.12.0	2.13.0
`cutadapt`	3.5	4.2
`ivar`	1.3.1	1.4
`multiqc`	1.13a	1.14
`nanoplot`	1.40.0	1.41.0
`nextclade`	2.2.0	2.12.0
`pangolin`	4.1.1	4.2
`picard`	2.27.4	3.0.0
`samtools`	1.15.1	1.16.1
`spades`	3.15.4	3.15.5
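
Because each process runs in its own container, one quick way to spot a tool that is used at more than one version is to group the reported versions by tool. The sketch below assumes a hypothetical mapping of process name to tool versions; the process names and the shape of the mapping are illustrative and are not the pipeline's actual output format.

```python
from collections import defaultdict


def tools_with_multiple_versions(versions_by_process):
    """Return {tool: sorted versions} for tools reported at more than one version.

    `versions_by_process` is a hypothetical mapping of
    process name -> {tool name: version string}.
    """
    seen = defaultdict(set)
    for tool_versions in versions_by_process.values():
        for tool, version in tool_versions.items():
            seen[tool].add(version)
    return {tool: sorted(found) for tool, found in seen.items() if len(found) > 1}


# Illustrative input only: the process names are made up for this example.
example = {
    "PROCESS_A": {"samtools": "1.16.1"},
    "PROCESS_B": {"samtools": "1.15.1", "bcftools": "1.16"},
    "PROCESS_C": {"bcftools": "1.16"},
}
print(tools_with_multiple_versions(example))
```

Only `samtools` is reported here, since `bcftools` appears at a single version in every process that uses it.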

[2.5] - 2022-07-13

Enhancements & fixes

Default Nextclade dataset shipped with the pipeline has been bumped from 2022-01-18T12:00:00Z -> 2022-06-14T12:00:00Z
[#234] - Remove replacement of dashes in sample name with underscores
[#292] - Filter empty FastQ files after adapter trimming
[#303] - New pangolin dbs (4.0.x) not assigning lineages to Sars-CoV-2 samples in MultiQC report correctly
[#304] - Re-factor code of ivar_variants_to_vcf script
[#306] - Add contig field information in vcf header in ivar_variants_to_vcf and use bcftools sort
[#311] - Invalid declaration val medaka_model_string
[#316] - Variant calling isn't run when using --skip_asciigenome with metagenomic data
[nf-core/rnaseq#764] - Test fails when using GCP due to missing tools in the basic biocontainer
Updated pipeline template to nf-core/tools 2.4.1
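
As an illustration of the idea behind #292, a FastQ file can be treated as empty when its first line cannot be read. This is a minimal sketch with a hypothetical helper name, and it is not necessarily how the pipeline performs the filtering.

```python
import gzip
import os
import tempfile


def fastq_is_empty(path):
    """Return True if a plain or gzipped FastQ file contains no records."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        # An empty first line means there is not a single record to read.
        return handle.readline() == ""


# Demonstration with temporary files (file names are made up).
with tempfile.TemporaryDirectory() as tmpdir:
    empty = os.path.join(tmpdir, "empty.fastq.gz")
    with gzip.open(empty, "wt"):
        pass  # write a valid gzip stream with no content

    non_empty = os.path.join(tmpdir, "reads.fastq.gz")
    with gzip.open(non_empty, "wt") as handle:
        handle.write("@read1\nACGT\n+\nIIII\n")

    print(fastq_is_empty(empty), fastq_is_empty(non_empty))
```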

Software dependencies

Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.

Dependency	Old version	New version
`artic`	1.2.1	1.2.2
`bcftools`	1.14	1.15.1
`multiqc`	1.11	1.13a
`nanoplot`	1.39.0	1.40.0
`nextclade`	1.10.2	2.2.0
`pangolin`	3.1.20	4.1.1
`picard`	2.26.10	2.27.4
`quast`	5.0.2	5.2.0
`samtools`	1.14	1.15.1
`spades`	3.15.3	3.15.4
`vcflib`	1.0.2	1.0.3

Parameters

[2.4.1] - 2022-03-01

Enhancements & fixes

[#288] - --primer_set_version only accepts Integers (incompatible with "4.1" Artic primers set)
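
The failure mode in #288 is easy to reproduce in isolation: a version such as "4.1" is not an integer, so integer-only validation rejects it. The sketch below contrasts that with a string-based check; both function names and the pattern are assumptions for illustration and are not the pipeline's schema code.

```python
import re

# Hypothetical pattern: digits, optionally followed by dot-separated digit groups.
VERSION_PATTERN = re.compile(r"\d+(\.\d+)*")


def is_integer_version(value):
    """Integer-only check: accepts "3" but rejects "4.1"."""
    try:
        int(value)
        return True
    except ValueError:
        return False


def is_valid_primer_set_version(value):
    """Looser check that treats the version as a string such as "4" or "4.1"."""
    return VERSION_PATTERN.fullmatch(str(value)) is not None


for candidate in ["3", "4.1"]:
    print(candidate, is_integer_version(candidate), is_valid_primer_set_version(candidate))
```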

[2.4] - 2022-02-22

Enhancements & fixes

nf-core/tools#1415 - Make --outdir a mandatory parameter
[#281] - Nanopore medaka processing fails with error if model name, not model file, provided
[#286] - IVAR_VARIANTS silently failing when FAI index is missing

Parameters

Old parameter	New parameter
	`--publish_dir_mode`

[2.3.1] - 2022-02-15

Enhancements & fixes

[#277] - Misuse of rstrip in make_variants_long_table.py script
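
For context, `str.rstrip` removes any trailing characters that appear in its argument rather than a literal suffix, which is a common source of truncated names. The sketch below illustrates that general pitfall with a made-up file name and a hypothetical helper; it is not the code from make_variants_long_table.py.

```python
def strip_suffix(name, suffix):
    """Remove `suffix` from the end of `name` only if it is actually present."""
    if suffix and name.endswith(suffix):
        return name[: -len(suffix)]
    return name


filename = "sample_v.vcf.gz"  # made-up file name for illustration

# rstrip treats ".vcf.gz" as a set of characters, so the trailing "v" of the
# sample name is stripped along with the extension.
print(filename.rstrip(".vcf.gz"))

# Suffix-aware removal keeps the sample name intact.
print(strip_suffix(filename, ".vcf.gz"))
```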

Software dependencies

Dependency	Old version	New version
`mosdepth`	0.3.2	0.3.3
`pangolin`	3.1.19	3.1.20

[2.3] - 2022-02-04

⚠️ Major enhancements

Please see Major updates in v2.3 for a more detailed list of changes added in this version.
In the previous release, iVar was used for both variant calling and consensus sequence generation when using --protocol amplicon. The pipeline now performs variant calling with iVar and consensus sequence generation with BCFTools/BEDTools.
Bump minimum Nextflow version from 21.04.0 -> 21.10.3

Enhancements & fixes

Port pipeline to the updated Nextflow DSL2 syntax adopted on nf-core/modules
Updated pipeline template to nf-core/tools 2.2
[#209] - Check that contig in primer BED and genome fasta match
[#218] - Support for compressed FastQ files for Nanopore data
[#232] - Remove duplicate variants called by ARTIC ONT pipeline
[#235] - Nextclade version bump
[#244] - Fix BCFtools consensus generation and masking
[#245] - Mpileup file as output
[#246] - Option to generate consensus with BCFTools / BEDTools using iVar variants
[#247] - Add strand-bias filtering option and codon fix in consecutive positions in ivar tsv conversion to vcf
[#248] - New variants reporting table
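
The check described in #209 can be pictured as comparing the contig names in the first column of the primer BED file with the sequence names in the genome FASTA headers. This is a minimal sketch with hypothetical function names and invented example records, and it is not the pipeline's own implementation.

```python
def fasta_contigs(fasta_lines):
    """Collect sequence names: the text after '>' up to the first whitespace."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def bed_contigs(bed_lines):
    """Collect contig names from the first column of BED records."""
    names = set()
    for line in bed_lines:
        fields = line.split()
        # Skip blank lines, comments and BED track/browser header lines.
        if not fields or fields[0].startswith("#") or fields[0] in ("track", "browser"):
            continue
        names.add(fields[0])
    return names


def unmatched_bed_contigs(fasta_lines, bed_lines):
    """Return the BED contigs that have no matching FASTA sequence name."""
    return sorted(bed_contigs(bed_lines) - fasta_contigs(fasta_lines))


# Invented records for illustration.
fasta = [">MN908947.3 example description", "ATTAAAGGTT"]
bed = [
    "MN908947.3\t30\t54\tprimer_1_LEFT\t1\t+",
    "NC_045512.2\t385\t410\tprimer_1_RIGHT\t1\t-",
]
print(unmatched_bed_contigs(fasta, bed))
```

An empty result means every contig named in the BED file is present in the FASTA file.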

Parameters

Old parameter	New parameter
	`--nextclade_dataset`
	`--nextclade_dataset_name`
	`--nextclade_dataset_reference`
	`--nextclade_dataset_tag`
	`--skip_consensus_plots`
	`--skip_variants_long_table`
	`--consensus_caller`
`--callers`	`--variant_caller`

Software dependencies

Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.

Dependency	Old version	New version
`bcftools`	1.11	1.14
`blast`	2.10.1	2.12.0
`bowtie2`	2.4.2	2.4.4
`cutadapt`	3.2	3.5
`fastp`	0.20.1	0.23.2
`kraken2`	2.1.1	2.1.2
`minia`	3.2.4	3.2.6
`mosdepth`	0.3.1	0.3.2
`nanoplot`	1.36.1	1.39.0
`nextclade`		1.10.2
`pangolin`	3.1.7	3.1.19
`picard`	2.23.9	2.26.10
`python`	3.8.3	3.9.5
`samtools`	1.10	1.14
`spades`	3.15.2	3.15.3
`tabix`	0.2.6	1.11
`vcflib`		1.0.2

[2.2] - 2021-07-29

Enhancements & fixes

Updated pipeline template to nf-core/tools 2.1
Remove custom content to render Pangolin report in MultiQC as it was officially added as a MultiQC module in v1.11
[#212] - Access to PYCOQC.out is undefined
[#229] - ARTIC Guppyplex settings for 1200bp ARTIC primers with Nanopore data

Software dependencies

Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.

Dependency	Old version	New version
`multiqc`	1.10.1	1.11
`pangolin`	3.0.5	3.1.7
`samtools`	1.10	1.12

[2.1] - 2021-06-15

Enhancements & fixes

Removed workflow to download data from public databases in favour of using nf-core/fetchngs
Added Pangolin results to MultiQC report
Added warning to MultiQC report for samples that have no reads after adapter trimming
Added docs about structure of data required for running Nanopore data
Added docs about using other primer sets for Illumina data
Added docs about overwriting default container definitions to use latest versions e.g. Pangolin
Dashes and spaces in sample names will be converted to underscores to avoid issues when creating the summary metrics
[#196] - Add mosdepth heatmap to MultiQC report
[#197] - Output a .tsv comprising the Nextclade and Pangolin results for all samples processed
[#198] - ASCIIGenome failing during analysis
[#201] - Conditional include are not expected to work
[#204] - Memory errors for SNP_EFF step

Parameters

Old parameter	New parameter
`--public_data_ids`
`--skip_sra_fastq_download`

Software dependencies

Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.

Dependency	Old version	New version
`nextclade_js`	0.14.2	0.14.4
`pangolin`	2.4.2	3.0.5

[2.0] - 2021-05-13

⚠️ Major enhancements

Pipeline has been re-implemented in Nextflow DSL2
All software containers are now exclusively obtained from Biocontainers
Updated minimum Nextflow version to v21.04.0 (see nextflow#572)
BCFtools and iVar will be run by default for Illumina metagenomics and amplicon data, respectively. However, this behaviour can be customised with the --callers parameter.
Variant graph processes to call variants relative to the reference genome directly from de novo assemblies have been deprecated and removed
Variant calling with Varscan 2 has been deprecated and removed due to licensing restrictions
New tools:
- Pangolin for lineage analysis
- Nextclade for clade assignment, mutation calling and consensus sequence quality checks
- ASCIIGenome for individual variant screenshots with annotation tracks

Other enhancements & fixes

Illumina and Nanopore runs containing the same 48 samples sequenced on both platforms have been uploaded to the nf-core AWS account for full-sized tests on release
Initial implementation of a standardised samplesheet JSON schema to use with user interfaces and for validation
Default human --kraken2_db link has been changed from Zenodo to an AWS S3 bucket for more reliable downloads
Updated pipeline template to nf-core/tools 1.14
Optimise MultiQC configuration and input files for faster run-time on huge sample numbers
[#122] - Single SPAdes command to rule them all
[#138] - Problem masking the consensus sequence
[#142] - Unknown method invocation toBytes on String type
[#169] - ggplot2 error when generating mosdepth amplicon plot with Swift v2 primers
[#170] - ivar trimming of Swift libraries new offset feature
[#175] - MultiQC report does not include all the metrics
[#188] - Add and fix EditorConfig linting in entire pipeline

Parameters

Old parameter	New parameter
`--amplicon_bed`	`--primer_bed`
`--amplicon_fasta`	`--primer_fasta`
`--amplicon_left_suffix`	`--primer_left_suffix`
`--amplicon_right_suffix`	`--primer_right_suffix`
`--filter_dups`	`--filter_duplicates`
`--skip_adapter_trimming`	`--skip_fastp`
`--skip_amplicon_trimming`	`--skip_cutadapt`
	`--artic_minion_aligner`
	`--artic_minion_caller`
	`--artic_minion_medaka_model`
	`--asciigenome_read_depth`
	`--asciigenome_window_size`
	`--blast_db`
	`--enable_conda`
	`--fast5_dir`
	`--fastq_dir`
	`--ivar_trim_offset`
	`--kraken2_assembly_host_filter`
	`--kraken2_variants_host_filter`
	`--min_barcode_reads`
	`--min_guppyplex_reads`
	`--multiqc_title`
	`--platform`
	`--primer_set`
	`--primer_set_version`
	`--public_data_ids`
	`--save_trimmed_fail`
	`--save_unaligned`
	`--sequencing_summary`
	`--singularity_pull_docker_container`
	`--skip_asciigenome`
	`--skip_bandage`
	`--skip_consensus`
	`--skip_ivar_trim`
	`--skip_nanoplot`
	`--skip_pangolin`
	`--skip_pycoqc`
	`--skip_nextclade`
	`--skip_sra_fastq_download`
	`--spades_hmm`
	`--spades_mode`
`--cut_mean_quality`
`--filter_unmapped`
`--ivar_trim_min_len`
`--ivar_trim_min_qual`
`--ivar_trim_window_width`
`--kraken2_use_ftp`
`--max_allele_freq`
`--min_allele_freq`
`--min_base_qual`
`--min_coverage`
`--min_trim_length`
`--minia_kmer`
`--mpileup_depth`
`--name`
`--qualified_quality_phred`
`--save_align_intermeds`
`--save_kraken2_fastq`
`--save_sra_fastq`
`--skip_sra`
`--skip_vg`
`--unqualified_percent_limit`
`--varscan2_strand_filter`
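
When updating an existing command line, the one-to-one renames in the table above can be applied mechanically. The sketch below uses a hypothetical helper that only handles those renames; added and removed parameters are not covered and still need manual review.

```python
# Renames taken from the parameter table above (old name -> new name).
RENAMED = {
    "--amplicon_bed": "--primer_bed",
    "--amplicon_fasta": "--primer_fasta",
    "--amplicon_left_suffix": "--primer_left_suffix",
    "--amplicon_right_suffix": "--primer_right_suffix",
    "--filter_dups": "--filter_duplicates",
    "--skip_adapter_trimming": "--skip_fastp",
    "--skip_amplicon_trimming": "--skip_cutadapt",
}


def migrate_args(args):
    """Return a copy of `args` with renamed flags replaced.

    Handles both "--flag value" and "--flag=value" forms; everything else
    is passed through unchanged.
    """
    migrated = []
    for arg in args:
        flag, sep, value = arg.partition("=")
        migrated.append(RENAMED.get(flag, flag) + sep + value)
    return migrated


# Example arguments (file name is made up).
old_args = ["--amplicon_bed", "primers.bed", "--filter_dups", "--skip_adapter_trimming"]
print(migrate_args(old_args))
```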

Software dependencies

Note, since the pipeline is now using Nextflow DSL2, each process will be run with its own Biocontainer. This means that on occasion it is entirely possible for the pipeline to be using different versions of the same tool. However, the overall software dependency changes compared to the last release have been listed below for reference.

Dependency	Old version	New version
`artic`		1.2.1
`asciigenome`		1.16.0
`bc`	1.07.1
`bcftools`	1.9	1.11
`bedtools`	2.29.2	2.30.0
`bioconductor-biostrings`	2.54.0	2.58.0
`bioconductor-complexheatmap`	2.2.0	2.6.2
`blast`	2.9.0	2.10.1
`bowtie2`	2.4.1	2.4.2
`cutadapt`	2.10	3.2
`ivar`	1.2.2	1.3.1
`kraken2`	2.0.9beta	2.1.1
`markdown`	3.2.2
`minimap2`	2.17
`mosdepth`	0.2.6	0.3.1
`multiqc`	1.9	1.10.1
`nanoplot`		1.36.1
`nextclade_js`		0.14.2
`pangolin`		2.4.2
`parallel-fastq-dump`	0.6.6
`picard`	2.23.0	2.23.9
`pigz`	2.3.4
`plasmidid`	1.6.3	1.6.4
`pycoqc`		2.5.2
`pygments`	2.6.1
`pymdown-extensions`	7.1
`python`	3.6.10	3.8.3
`r-base`	3.6.2	4.0.3
`r-ggplot2`	3.3.1	3.3.3
`r-tidyr`	1.1.0
`requests`		2.24.0
`samtools`	1.9	1.10
`seqwish`	0.4.1
`snpeff`	4.5covid19	5.0
`spades`	3.14.0	3.15.2
`sra-tools`	2.10.7
`tabix`		0.2.6
`unicycler`	0.4.7	0.4.8
`varscan`	2.4.4
`vg`	1.24.0

[1.1.0] - 2020-06-23

Added

#112 - Per-amplicon coverage plot
#124 - Intersect variants across callers
nf-core/tools#616 - Updated GitHub Actions to build Docker image and push to Docker Hub
Parameters:
- --min_mapped_reads to circumvent failures for samples with low number of mapped reads
- --varscan2_strand_filter to toggle the default Varscan 2 strand filter
- --skip_mosdepth - skip genome-wide and amplicon coverage plot generation from mosdepth output
- --amplicon_left_suffix - to provide left primer suffix used in name field of --amplicon_bed
- --amplicon_right_suffix - to provide right primer suffix used in name field of --amplicon_bed
- Unify parameter specification with COG-UK pipeline:
  - --min_allele_freq - minimum allele frequency threshold for calling variants
  - --mpileup_depth - SAMTools mpileup max per-file depth
  - --ivar_exclude_reads renamed to --ivar_trim_noprimer
  - --ivar_trim_min_len - minimum length of read to retain after primer trimming
  - --ivar_trim_min_qual - minimum quality threshold for sliding window to pass
  - --ivar_trim_window_width - width of sliding window
#118 - Updated GitHub Actions AWS workflow for small and full size tests.
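
One simple reading of intersecting variants across callers (#124) is to keep only the variants that every caller reports. The sketch below assumes variants are keyed by (contig, position, reference allele, alternate allele); the function name and the calls are invented for illustration, and this is not necessarily how the pipeline performs the intersection.

```python
def intersect_variants(calls_by_caller):
    """Return the sorted variants reported by every caller.

    `calls_by_caller` maps a caller name to an iterable of
    (contig, position, ref, alt) tuples.
    """
    variant_sets = [set(calls) for calls in calls_by_caller.values()]
    if not variant_sets:
        return []
    return sorted(set.intersection(*variant_sets))


# Invented calls for illustration.
calls = {
    "varscan2": [("MN908947.3", 241, "C", "T"), ("MN908947.3", 3037, "C", "T")],
    "ivar": [("MN908947.3", 241, "C", "T"), ("MN908947.3", 23403, "A", "G")],
    "bcftools": [("MN908947.3", 241, "C", "T"), ("MN908947.3", 3037, "C", "T")],
}
print(intersect_variants(calls))
```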

Removed

--skip_qc parameter

Dependencies

Add mosdepth 0.2.6
Add bioconductor-complexheatmap 2.2.0
Add bioconductor-biostrings 2.54.0
Add r-optparse 1.6.6
Add r-tidyr 1.1.0
Add r-tidyverse 1.3.0
Add r-ggplot2 3.3.1
Add r-reshape2 1.4.4
Add r-viridis 0.5.1
Update sra-tools 2.10.3 -> 2.10.7
Update bowtie2 2.3.5.1 -> 2.4.1
Update picard 2.22.8 -> 2.23.0
Update minia 3.2.3 -> 3.2.4
Update plasmidid 1.5.2 -> 1.6.3

[1.0.0] - 2020-06-01

Initial release of nf-core/viralrecon, created with the nf-core template.

This pipeline is a re-implementation of the SARS_Cov2_consensus-nf and SARS_Cov2_assembly-nf pipelines initially developed by Sarai Varona and Sara Monzón from BU-ISCIII. Porting both of these pipelines to nf-core was an international collaboration between numerous contributors and developers, led by Harshil Patel from The Bioinformatics & Biostatistics Group at The Francis Crick Institute, London. We appreciated the need to have a portable, reproducible and scalable pipeline for the analysis of COVID-19 sequencing samples and so the Avengers Assembled!

Pipeline summary

Download samples via SRA, ENA or GEO ids (ENA FTP, parallel-fastq-dump; if required)
Merge re-sequenced FastQ files (cat; if required)
Read QC (FastQC)
Adapter trimming (fastp)
Variant calling
1. Read alignment (Bowtie 2)
2. Sort and index alignments (SAMtools)
3. Primer sequence removal (iVar; amplicon data only)
4. Duplicate read marking (picard; removal optional)
5. Alignment-level QC (picard, SAMtools)
6. Choice of multiple variant calling and consensus sequence generation routes (VarScan 2, BCFTools, BEDTools || iVar variants and consensus || BCFTools, BEDTools)
  - Variant annotation (SnpEff, SnpSift)
  - Consensus assessment report (QUAST)
De novo assembly
1. Primer trimming (Cutadapt; amplicon data only)
2. Removal of host reads (Kraken 2)
3. Choice of multiple assembly tools (SPAdes || metaSPAdes || Unicycler || minia)
  - Blast to reference genome (blastn)
  - Contiguate assembly (ABACAS)
  - Assembly report (PlasmidID)
  - Assembly assessment report (QUAST)
  - Call variants relative to reference (Minimap2, seqwish, vg, Bandage)
  - Variant annotation (SnpEff, SnpSift)
Present QC and visualisation for raw read, alignment, assembly and variant calling results (MultiQC)
