Output
Cecret produces a lot of files. Most of these files are in the format of {params.outdir}/{process}/{resultant files from that process}.
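As a rough illustration of this layout, the sketch below groups the files under an output directory by their first path component, which corresponds to the process name. The function name files_by_process is hypothetical and not part of Cecret.

```python
from collections import defaultdict
from pathlib import Path


def files_by_process(outdir):
    """Map each process directory under outdir to the files it contains."""
    outdir = Path(outdir)
    grouped = defaultdict(list)
    for path in sorted(outdir.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(outdir)
        # Files directly in outdir (e.g. cecret_results.csv) have no process
        # directory, so they are collected under ".".
        process = relative.parts[0] if len(relative.parts) > 1 else "."
        grouped[process].append(relative.as_posix())
    return dict(grouped)
```

Calling files_by_process("cecret") on a finished run should return one key per process directory, plus "." for the top-level summary files.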
Final File Tree after running cecret.nf
cecret
├── aci
│ ├── amplicon_depth.csv
│ ├── amplicon_depth_mqc.png
│ └── amplicon_depth.png
├── aligned
│ ├── SRR13957125.sorted.bam
│ ├── SRR13957125.sorted.bam.bai
│ ├── SRR13957170.sorted.bam
│ ├── SRR13957170.sorted.bam.bai
│ ├── SRR13957177.sorted.bam
│ └── SRR13957177.sorted.bam.bai
├── ampliconclip
│ ├── SRR13957125.primertrim.sorted.bam
│ ├── SRR13957125.primertrim.sorted.bam.bai
│ ├── SRR13957170.primertrim.sorted.bam
│ ├── SRR13957170.primertrim.sorted.bam.bai
│ ├── SRR13957177.primertrim.sorted.bam
│ └── SRR13957177.primertrim.sorted.bam.bai
├── bcftools_variants
│ ├── SRR13957125.bcftools_variants.vcf
│ ├── SRR13957170.bcftools_variants.vcf
│ └── SRR13957177.bcftools_variants.vcf
├── cecret_results.csv
├── cecret_results.txt
├── consensus
│ ├── SRR13957125.consensus.fa
│ ├── SRR13957170.consensus.fa
│ └── SRR13957177.consensus.fa
├── dataset
│ ├── genemap.gff
│ ├── primers.csv
│ ├── qc.json
│ ├── reference.fasta
│ ├── sequences.fasta
│ ├── tag.json
│ ├── tree.json
│ └── virus_properties.json
├── fastp
│ ├── SRR13957125_clean_PE1.fastq.gz
│ ├── SRR13957125_clean_PE2.fastq.gz
│ ├── SRR13957125_fastp.html
│ ├── SRR13957125_fastp.json
│ ├── SRR13957170_clean_PE1.fastq.gz
│ ├── SRR13957170_clean_PE2.fastq.gz
│ ├── SRR13957170_fastp.html
│ ├── SRR13957170_fastp.json
│ ├── SRR13957177_clean_PE1.fastq.gz
│ ├── SRR13957177_clean_PE2.fastq.gz
│ ├── SRR13957177_fastp.html
│ └── SRR13957177_fastp.json
├── fastqc
│ ├── SRR13957125_1_fastqc.html
│ ├── SRR13957125_1_fastqc.zip
│ ├── SRR13957125_2_fastqc.html
│ ├── SRR13957125_2_fastqc.zip
│ ├── SRR13957125_fastq_name.csv
│ ├── SRR13957170_1_fastqc.html
│ ├── SRR13957170_1_fastqc.zip
│ ├── SRR13957170_2_fastqc.html
│ ├── SRR13957170_2_fastqc.zip
│ ├── SRR13957170_fastq_name.csv
│ ├── SRR13957177_1_fastqc.html
│ ├── SRR13957177_1_fastqc.zip
│ ├── SRR13957177_2_fastqc.html
│ ├── SRR13957177_2_fastqc.zip
│ └── SRR13957177_fastq_name.csv
├── freyja
│ ├── aggregated-freyja.png
│ ├── aggregated-freyja.tsv
│ ├── SRR13957125_demix_collapsed_lineages.yml
│ ├── SRR13957125_demix.tsv
│ ├── SRR13957125_depths.tsv
│ ├── SRR13957125_freyja_lineages_mqc.png
│ ├── SRR13957125_freyja_lineages.png
│ ├── SRR13957125_variants.tsv
│ ├── SRR13957170_depths.tsv
│ ├── SRR13957170_variants.tsv
│ ├── SRR13957177_demix_collapsed_lineages.yml
│ ├── SRR13957177_demix.tsv
│ ├── SRR13957177_depths.tsv
│ ├── SRR13957177_freyja_lineages_mqc.png
│ ├── SRR13957177_freyja_lineages.png
│ └── SRR13957177_variants.tsv
├── heatcluster
│ ├── sorted_matrix.csv
│ ├── heatcluster_mqc.png
│ └── heatcluster.png
├── igv_reports
│ ├── SRR13957125_igvjs_viewer.html
│ ├── SRR13957170_igvjs_viewer.html
│ └── SRR13957177_igvjs_viewer.html
├── iqtree2
│ ├── iqtree2.iqtree
│ ├── iqtree2.log
│ ├── iqtree2.mldist
│ ├── iqtree2.treefile
│ └── iqtree2.treefile.nwk
├── ivar_consensus
│ ├── SRR13957125.consensus.fa
│ ├── SRR13957125.consensus.qual.txt
│ ├── SRR13957170.consensus.fa
│ ├── SRR13957170.consensus.qual.txt
│ ├── SRR13957177.consensus.fa
│ └── SRR13957177.consensus.qual.txt
├── ivar_trim
│ ├── SRR13957125_ivar.log
│ ├── SRR13957125.primertrim.sorted.bam
│ ├── SRR13957125.primertrim.sorted.bam.bai
│ ├── SRR13957170_ivar.log
│ ├── SRR13957170.primertrim.sorted.bam
│ ├── SRR13957170.primertrim.sorted.bam.bai
│ ├── SRR13957177_ivar.log
│ ├── SRR13957177.primertrim.sorted.bam
│ └── SRR13957177.primertrim.sorted.bam.bai
├── ivar_variants
│ ├── SRR13957125.ivar_variants.vcf
│ ├── SRR13957125.variants.tsv
│ ├── SRR13957170.ivar_variants.vcf
│ ├── SRR13957170.variants.tsv
│ ├── SRR13957177.ivar_variants.vcf
│ └── SRR13957177.variants.tsv
├── logs
│ └── <Log files for processes not included in tree for brevity>
├── mafft
│ └── mafft_aligned.fasta
├── multiqc
│ ├── multiqc_data
│ │ ├── multiqc_citations.txt
│ │ ├── multiqc_data.json
│ │ ├── multiqc_fastqc.txt
│ │ ├── multiqc_general_stats.txt
│ │ ├── multiqc.log
│ │ ├── multiqc_nextclade.txt
│ │ ├── multiqc_pangolin.txt
│ │ ├── multiqc_samtools_flagstat.txt
│ │ ├── multiqc_samtools_stats.txt
│ │ ├── multiqc_seqyclean.txt
│ │ ├── multiqc_software_versions.txt
│ │ └── multiqc_sources.txt
│ └── multiqc_report.html
├── nextalign
│ ├── nextalign.aligned.fasta
│ ├── nextalign.errors.csv
│ ├── nextalign_gene_E.translation.fasta
│ ├── nextalign_gene_M.translation.fasta
│ ├── nextalign_gene_N.translation.fasta
│ ├── nextalign_gene_ORF1a.translation.fasta
│ ├── nextalign_gene_ORF1b.translation.fasta
│ ├── nextalign_gene_ORF3a.translation.fasta
│ ├── nextalign_gene_ORF6.translation.fasta
│ ├── nextalign_gene_ORF7a.translation.fasta
│ ├── nextalign_gene_ORF7b.translation.fasta
│ ├── nextalign_gene_ORF8.translation.fasta
│ ├── nextalign_gene_ORF9b.translation.fasta
│ ├── nextalign_gene_S.translation.fasta
│ ├── nextalign.insertions.csv
│ └── ultimate.fasta
├── nextclade
│ ├── combined.fasta
│ ├── nextclade.aligned.fasta
│ ├── nextclade.auspice.json
│ ├── nextclade.csv
│ ├── nextclade.errors.csv
│ ├── nextclade_gene_E.translation.fasta
│ ├── nextclade_gene_M.translation.fasta
│ ├── nextclade_gene_N.translation.fasta
│ ├── nextclade_gene_ORF1a.translation.fasta
│ ├── nextclade_gene_ORF1b.translation.fasta
│ ├── nextclade_gene_ORF3a.translation.fasta
│ ├── nextclade_gene_ORF6.translation.fasta
│ ├── nextclade_gene_ORF7a.translation.fasta
│ ├── nextclade_gene_ORF7b.translation.fasta
│ ├── nextclade_gene_ORF8.translation.fasta
│ ├── nextclade_gene_ORF9b.translation.fasta
│ ├── nextclade_gene_S.translation.fasta
│ ├── nextclade.insertions.csv
│ ├── nextclade.json
│ ├── nextclade.ndjson
│ └── nextclade.tsv
├── pango_collapse
│ └── pango_collapse.csv
├── pangolin
│ ├── combined.fasta
│ └── lineage_report.csv
├── phytreeviz
│ ├── tree_mqc.png
│ └── tree.png
├── samtools_ampliconstats
│ ├── SRR13957125_ampliconstats.txt
│ ├── SRR13957170_ampliconstats.txt
│ └── SRR13957177_ampliconstats.txt
├── samtools_coverage
│ ├── samtools_coverage_summary.tsv
│ ├── SRR13957125.cov.hist
│ ├── SRR13957125.cov.txt
│ ├── SRR13957170.cov.hist
│ ├── SRR13957170.cov.txt
│ ├── SRR13957177.cov.hist
│ └── SRR13957177.cov.txt
├── samtools_depth
│ ├── SRR13957125.depth.txt
│ ├── SRR13957170.depth.txt
│ └── SRR13957177.depth.txt
├── samtools_flagstat
│ ├── SRR13957125.flagstat.txt
│ ├── SRR13957170.flagstat.txt
│ └── SRR13957177.flagstat.txt
├── samtools_plot_ampliconstats
│ ├── SRR13957125
│ ├── SRR13957125-combined-amp.gp
│ ├── SRR13957125-combined-amp.png
│ ├── SRR13957125-combined-coverage-1.gp
│ ├── SRR13957125-combined-coverage-1.png
│ ├── SRR13957125-combined-depth.gp
│ ├── SRR13957125-combined-depth.png
│ ├── SRR13957125-combined-read-perc.gp
│ ├── SRR13957125-combined-read-perc.png
│ ├── SRR13957125-combined-reads.gp
│ ├── SRR13957125-combined-reads.png
│ ├── SRR13957125-combined-tcoord.gp
│ ├── SRR13957125-combined-tcoord.png
│ ├── SRR13957125-combined-tdepth.gp
│ ├── SRR13957125-combined-tdepth.png
│ ├── SRR13957125-heat-amp-1.gp
│ ├── SRR13957125-heat-amp-1.png
│ ├── SRR13957125-heat-coverage-1-1.gp
│ ├── SRR13957125-heat-coverage-1-1.png
│ ├── SRR13957125-heat-read-perc-1.gp
│ ├── SRR13957125-heat-read-perc-1.png
│ ├── SRR13957125-heat-read-perc-log-1.gp
│ ├── SRR13957125-heat-read-perc-log-1.png
│ ├── SRR13957125-heat-reads-1.gp
│ ├── SRR13957125-heat-reads-1.png
│ ├── SRR13957125-SRR13957125.primertrim.sorted-amp.gp
│ ├── SRR13957125-SRR13957125.primertrim.sorted-amp.png
│ ├── SRR13957125-SRR13957125.primertrim.sorted-cov.gp
│ ├── SRR13957125-SRR13957125.primertrim.sorted-cov.png
│ ├── SRR13957125-SRR13957125.primertrim.sorted-reads.gp
│ ├── SRR13957125-SRR13957125.primertrim.sorted-reads.png
│ ├── SRR13957125-SRR13957125.primertrim.sorted-tcoord.gp
│ ├── SRR13957125-SRR13957125.primertrim.sorted-tcoord.png
│ ├── SRR13957125-SRR13957125.primertrim.sorted-tdepth.gp
│ ├── SRR13957125-SRR13957125.primertrim.sorted-tdepth.png
│ ├── SRR13957125-SRR13957125.primertrim.sorted-tsize.gp
│ ├── SRR13957125-SRR13957125.primertrim.sorted-tsize.png
│ ├── SRR13957170
│ ├── SRR13957170-combined-amp.gp
│ ├── SRR13957170-combined-amp.png
│ ├── SRR13957170-combined-coverage-1.gp
│ ├── SRR13957170-combined-coverage-1.png
│ ├── SRR13957170-combined-depth.gp
│ ├── SRR13957170-combined-depth.png
│ ├── SRR13957170-combined-read-perc.gp
│ ├── SRR13957170-combined-read-perc.png
│ ├── SRR13957170-combined-reads.gp
│ ├── SRR13957170-combined-reads.png
│ ├── SRR13957170-combined-tdepth.gp
│ ├── SRR13957170-combined-tdepth.png
│ ├── SRR13957170-heat-amp-1.gp
│ ├── SRR13957170-heat-amp-1.png
│ ├── SRR13957170-heat-coverage-1-1.gp
│ ├── SRR13957170-heat-coverage-1-1.png
│ ├── SRR13957170-heat-read-perc-1.gp
│ ├── SRR13957170-heat-read-perc-1.png
│ ├── SRR13957170-heat-read-perc-log-1.gp
│ ├── SRR13957170-heat-read-perc-log-1.png
│ ├── SRR13957170-heat-reads-1.gp
│ ├── SRR13957170-heat-reads-1.png
│ ├── SRR13957170-SRR13957170.primertrim.sorted-amp.gp
│ ├── SRR13957170-SRR13957170.primertrim.sorted-amp.png
│ ├── SRR13957170-SRR13957170.primertrim.sorted-cov.gp
│ ├── SRR13957170-SRR13957170.primertrim.sorted-cov.png
│ ├── SRR13957170-SRR13957170.primertrim.sorted-reads.gp
│ ├── SRR13957170-SRR13957170.primertrim.sorted-reads.png
│ ├── SRR13957170-SRR13957170.primertrim.sorted-tdepth.gp
│ ├── SRR13957170-SRR13957170.primertrim.sorted-tdepth.png
│ ├── SRR13957177
│ ├── SRR13957177-combined-amp.gp
│ ├── SRR13957177-combined-amp.png
│ ├── SRR13957177-combined-coverage-1.gp
│ ├── SRR13957177-combined-coverage-1.png
│ ├── SRR13957177-combined-depth.gp
│ ├── SRR13957177-combined-depth.png
│ ├── SRR13957177-combined-read-perc.gp
│ ├── SRR13957177-combined-read-perc.png
│ ├── SRR13957177-combined-reads.gp
│ ├── SRR13957177-combined-reads.png
│ ├── SRR13957177-combined-tcoord.gp
│ ├── SRR13957177-combined-tcoord.png
│ ├── SRR13957177-combined-tdepth.gp
│ ├── SRR13957177-combined-tdepth.png
│ ├── SRR13957177-heat-amp-1.gp
│ ├── SRR13957177-heat-amp-1.png
│ ├── SRR13957177-heat-coverage-1-1.gp
│ ├── SRR13957177-heat-coverage-1-1.png
│ ├── SRR13957177-heat-read-perc-1.gp
│ ├── SRR13957177-heat-read-perc-1.png
│ ├── SRR13957177-heat-read-perc-log-1.gp
│ ├── SRR13957177-heat-read-perc-log-1.png
│ ├── SRR13957177-heat-reads-1.gp
│ ├── SRR13957177-heat-reads-1.png
│ ├── SRR13957177-SRR13957177.primertrim.sorted-amp.gp
│ ├── SRR13957177-SRR13957177.primertrim.sorted-amp.png
│ ├── SRR13957177-SRR13957177.primertrim.sorted-cov.gp
│ ├── SRR13957177-SRR13957177.primertrim.sorted-cov.png
│ ├── SRR13957177-SRR13957177.primertrim.sorted-reads.gp
│ ├── SRR13957177-SRR13957177.primertrim.sorted-reads.png
│ ├── SRR13957177-SRR13957177.primertrim.sorted-tcoord.gp
│ ├── SRR13957177-SRR13957177.primertrim.sorted-tcoord.png
│ ├── SRR13957177-SRR13957177.primertrim.sorted-tdepth.gp
│ ├── SRR13957177-SRR13957177.primertrim.sorted-tdepth.png
│ ├── SRR13957177-SRR13957177.primertrim.sorted-tsize.gp
│ └── SRR13957177-SRR13957177.primertrim.sorted-tsize.png
├── samtools_stats
│ ├── 2249693-IA-M05216-230323.stats.txt
│ ├── SRR13957125.stats.txt
│ ├── SRR13957170.stats.txt
│ └── SRR13957177.stats.txt
├── seqyclean
│ ├── 2249693-IA-M05216-230323_clean_PE1.fastq.gz
│ ├── 2249693-IA-M05216-230323_clean_PE2.fastq.gz
│ ├── 2249693-IA-M05216-230323_clean_SummaryStatistics.tsv
│ ├── Combined_SummaryStatistics.tsv
│ ├── SRR13957125_clean_PE1.fastq.gz
│ ├── SRR13957125_clean_PE2.fastq.gz
│ ├── SRR13957125_clean_SummaryStatistics.tsv
│ ├── SRR13957170_clean_PE1.fastq.gz
│ ├── SRR13957170_clean_PE2.fastq.gz
│ ├── SRR13957170_clean_SummaryStatistics.tsv
│ ├── SRR13957177_clean_PE1.fastq.gz
│ ├── SRR13957177_clean_PE2.fastq.gz
│ └── SRR13957177_clean_SummaryStatistics.tsv
├── snp-dists
│ └── snp-dists.txt
└── vadr
  ├── combined.fasta
  ├── trimmed.fasta
  ├── vadr.vadr.alc
  ├── vadr.vadr.alt
  ├── vadr.vadr.alt.list
  ├── vadr.vadr.cmd
  ├── vadr.vadr.dcr
  ├── vadr.vadr.fail.fa
  ├── vadr.vadr.fail.list
  ├── vadr.vadr.fail.tbl
  ├── vadr.vadr.filelist
  ├── vadr.vadr.ftr
  ├── vadr.vadr.log
  ├── vadr.vadr.mdl
  ├── vadr.vadr.pass.fa
  ├── vadr.vadr.pass.list
  ├── vadr.vadr.pass.tbl
  ├── vadr.vadr.rpn
  ├── vadr.vadr.sda
  ├── vadr.vadr.seqstat
  ├── vadr.vadr.sgm
  ├── vadr.vadr.sqa
  └── vadr.vadr.sqc
There are two main files that summarize the information from this workflow. One is a csv file with the key result from each process, and one is the multiqc report. The default locations for these files are cecret/cecret_results.csv and cecret/multiqc/multiqc_report.html. If using an alternative destination, set params.outdir to your preferred destination (see: instructions on how to adjust parameters).
There are summary files for each run found at cecret/cecret_results.csv and cecret/cecret_results.txt. These two files are exactly the same except for the delimiter used to separate the columns.
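A minimal sketch of reading either summary file with Python's csv module follows. It assumes the .txt file is tab-delimited, which this page does not state, so the delimiter is left as a parameter. The function names are hypothetical.

```python
import csv


def load_results(path, delimiter):
    """Read a Cecret summary file into a list of row dictionaries."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter=delimiter))


def same_content(csv_path, txt_path, txt_delimiter="\t"):
    """Check that both summary files hold the same rows.

    Assumption: the .txt file is tab-delimited; pass txt_delimiter otherwise.
    """
    return load_results(csv_path, ",") == load_results(txt_path, txt_delimiter)
```

For example, same_content("cecret/cecret_results.csv", "cecret/cecret_results.txt") should return True if the two files differ only in their delimiter.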
An example file from a run with the default values on the SARS-CoV-2 fastq files SRR13957125, SRR13957170, and SRR13957177 is below.
sample_id | sample | pangolin_lineage | nextclade_clade | vadr_p/f | fasta_line | fastqc_raw_reads_1 | fastqc_raw_reads_2 | num_N | num_total | seqyclean_PairsKept | seqyclean_Perc_Kept | num_pos_100X | aci_num_failed_amplicons | insert_size_after_trimming | ivar_num_variants_identified | bcftools_variants_identified | samtools_meandepth_after_trimming | samtools_per_1X_coverage_after_trimming | vadr_model | vadr_alerts | nextclade_clade_who | nextclade_qc_overallscore | nextclade_qc_overallstatus | pangolin_conflict | pangolin_ambiguity_score | pangolin_scorpio_call | pangolin_scorpio_support | pangolin_scorpio_conflict | pangolin_scorpio_notes | pangolin_version | pangolin_pangolin_version | pangolin_scorpio_version | pangolin_constellation_version | pangolin_is_designated | pangolin_qc_status | pangolin_qc_notes | pangolin_note | pangocollapse_lineage | pangocollapse_Lineage_full | pangocollapse_Lineage_expanded | pangocollapse_Lineage_family | freyja_summarized | Cecret version | seqyclean | bwa | ivar | ivar consensus |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SRR13957125 | SRR13957125 | B.1.429 | 21C | PASS | SRR13957125 | 670879.0 | 670879.0 | 667.0 | 29875.0 | 576244.0 | 85.8939 | 29200.0 | 2.0 | 199.0 | 27.0 | 27.0 | 5401.08 | 99.6522 | NC_045512 | - | Epsilon | 12.958697 | good | 0.0 | Epsilon (B.1.429-like) | 1.0 | 0.0 | scorpio call: Alt alleles 14; Ref alleles 0; Amb alleles 0; Oth alleles 0 | PUSHER-v1.23.1 | 4.3.1 | 0.3.19 | v0.1.12 | False | pass | Ambiguous content: 4% | Usher placements: B.1.429(1/1) | B.1.429 | B.1.429 | B.1.429 | B | [('Epsilon' 0.9990499999965543)] | v3.10.20231226 | seqyclean : Version: 1.10.09 (2018-10-16) | bwa : Version: 0.7.17-r1188 | ivar : iVar version 1.4.2 | iVar version 1.4.2 | |
SRR13957170 | SRR13957170 | Unassigned | SRR13957170 | 2287.0 | 2287.0 | 25545.0 | 25545.0 | 176.0 | 7.69567 | 0.0 | 99.0 | 160.0 | 0.0 | 6.0 | 0.182791 | 6.91235 | PUSHER-v1.23.1 | 4.3.1 | 0.3.19 | v0.1.12 | False | fail | Failed to map | Unassigned | Unassigned | Unassigned | Unassigned | v3.10.20231226 | seqyclean : Version: 1.10.09 (2018-10-16) | bwa : Version: 0.7.17-r1188 | ivar : iVar version 1.4.2 | iVar version 1.4.2 | |||||||||||||||
SRR13957177 | SRR13957177 | B.1.1.7 | 20I | PASS | SRR13957177 | 902426.0 | 902426.0 | 776.0 | 29787.0 | 837318.0 | 92.7852 | 29019.0 | 2.0 | 207.3 | 39.0 | 41.0 | 7621.74 | 99.8495 | NC_045512 | - | Alpha | 5.885816 | good | 0.0 | Alpha (B.1.1.7-like) | 0.96 | 0.04 | scorpio call: Alt alleles 22; Ref alleles 1; Amb alleles 0; Oth alleles 0 | PUSHER-v1.23.1 | 4.3.1 | 0.3.19 | v0.1.12 | False | pass | Ambiguous content: 4% | Usher placements: B.1.1.7(1/1) | B.1.1.7 | B.1.1.7 | B.1.1.7 | B.1.1.7 | [('Alpha' 0.999009112161133)] | v3.10.20231226 | seqyclean : Version: 1.10.09 (2018-10-16) | bwa : Version: 0.7.17-r1188 | ivar : iVar version 1.4.2 | iVar version 1.4.2 |
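The summary file lends itself to simple scripted checks. The sketch below lists samples whose samtools_per_1X_coverage_after_trimming value falls below a cutoff. The column names come from the header above. The function name and the 90.0 default cutoff are hypothetical and not a Cecret recommendation.

```python
import csv


def low_coverage_samples(results_csv, min_coverage=90.0):
    """Return sample_ids whose 1X coverage after trimming is below min_coverage."""
    flagged = []
    with open(results_csv, newline="") as handle:
        for row in csv.DictReader(handle):
            value = row.get("samtools_per_1X_coverage_after_trimming", "")
            # Treat a missing or non-numeric value as failing the check.
            try:
                coverage = float(value)
            except (TypeError, ValueError):
                coverage = 0.0
            if coverage < min_coverage:
                flagged.append(row["sample_id"])
    return flagged
```

Samples with an empty coverage cell are reported as well, since a blank usually means the upstream step produced nothing usable for that sample.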
The multiqc report aggregates data across your samples into one file. Open the 'cecret/multiqc/multiqc_report.html' file with your preferred browser. There, tables and graphs are generated for 'General Statistics', 'Samtools stats', 'Samtools flagstats', 'FastQC', 'iVar', 'SeqyClean', 'Fastp', 'Pangolin', and 'Kraken2'. Custom sections are also added for many additional analyses, including Freyja and PhyTreeViz.
Sometimes these summary files are not sufficient. This is an expected use-case, which is why there are so many more files in the workflow.
More information can be found in other pages in this wiki.
- Core workflow (aka, what makes Cecret)
  - read cleaning
  - alignment
    - optional: mark duplicates
    - primer trimming
- Nanopore read workflow
- Quality Metrics
- Species specific workflows
Although they appear in the tree above, not every directory or file is produced in every run. In fact, several processes are mutually exclusive:
- seqyclean and fastp
- ivar_trim and ampliconclip (neither will appear if params.trimmer = 'none')
- mafft and nextalign
Please note that any files or directories related to phylogenetic analysis will not be run or appear in results unless the relatedness parameter is set to true (params.relatedness = true). A directory may still appear if a process is "turned off," but this directory will be empty.
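One way to check an output directory against these rules is sketched below. It reports any mutually exclusive pair where both directories contain files, and treats empty directories as absent, as described above. The names EXCLUSIVE_PAIRS, has_files, and conflicting_pairs are hypothetical.

```python
from pathlib import Path

# Pairs described above as mutually exclusive.
EXCLUSIVE_PAIRS = [
    ("seqyclean", "fastp"),
    ("ivar_trim", "ampliconclip"),
    ("mafft", "nextalign"),
]


def has_files(directory):
    """True if directory exists and contains at least one file at any depth."""
    directory = Path(directory)
    return directory.is_dir() and any(p.is_file() for p in directory.rglob("*"))


def conflicting_pairs(outdir):
    """Return the pairs for which both directories contain files."""
    outdir = Path(outdir)
    return [
        pair
        for pair in EXCLUSIVE_PAIRS
        if all(has_files(outdir / name) for name in pair)
    ]
```

An empty list is the expected result for a single run. A non-empty list may simply mean that several runs with different settings wrote to the same output directory.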
Nextflow will also produce files. These files are more about the resources that your system used than about analysis of the data.
If using Nextflow Tower or Seqera Platform, these are the files that appear in the reports section of the UI.
Currently, the summary file for the workflow (cecret_results.*) and the multiqc report (multiqc_report.html) are included. More may be added once UPHL gains access to this resource.
This is a report of the workflow that may be useful in determining computational resource use. This includes the command that was used, the directories and other storage locations that were used, and information about the processes being run.
Visually shows the timeline for each process given the computational environment.
A directory for all the temporary files used in the analysis. Users can generally delete this directory when the workflow is complete because 1) it is generally not needed if everything ran successfully and 2) it is VERY large. Users will not be able to use nextflow -resume if this directory is deleted.
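Before deleting the directory, it can help to see how much space it occupies. The sketch below sums file sizes under a directory. It assumes the directory in question is Nextflow's default work directory in the launch directory. The function name is hypothetical.

```python
from pathlib import Path


def directory_size_bytes(directory):
    """Sum the sizes of regular files under directory, ignoring symlinks."""
    total = 0
    for path in Path(directory).rglob("*"):
        # Skip symlinks: Nextflow stages inputs as links by default, and
        # following them would count the same data more than once.
        if path.is_file() and not path.is_symlink():
            total += path.stat().st_size
    return total
```

For example, directory_size_bytes("work") / 1e9 gives an approximate size in gigabytes.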