I was running some of the test datasets in preparation for running a high-complexity dataset, and encountered an error in the PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_COMPARE process.
The data input was metadata_PRJNA523365_small.csv
ERROR ~ Error executing process > 'PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_COMPARE (1)'
Caused by:
Process PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_COMPARE input file name collision -- There are multiple input files for each of the following file names: GCF_017189435_1.sig
The metadata file has some new columns compared to the other test datasets, and I wonder if this contributed.
Everything else leading up to this ran as expected.
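For anyone trying to reproduce this: the message means that two or more files staged into the same task share a file name, even though they come from different directories. The sketch below is only a diagnostic aid, not part of the pipeline. It groups a list of paths by file name and reports the names that occur more than once. The work-directory paths are invented for illustration; only the `.sig` file names come from the report above.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_basename_collisions(paths):
    """Group paths by file name and return only the names that occur more than once."""
    by_name = defaultdict(list)
    for p in paths:
        by_name[PurePosixPath(p).name].append(p)
    return {name: ps for name, ps in by_name.items() if len(ps) > 1}


# Hypothetical staged inputs: these work-directory paths are made up for illustration.
staged = [
    "work/aa/111111/GCF_017189435_1.sig",
    "work/bb/222222/GCF_017189435_1.sig",
    "work/cc/333333/SRR12574846.sig",
]

for name, sources in find_basename_collisions(staged).items():
    print(f"{name}: {len(sources)} inputs -> {sources}")
```

Feeding it the real list of signature paths passed to SOURMASH_COMPARE (for example, collected from `.nextflow.log`) would show which upstream tasks produced the duplicated name.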
Command used and terminal output
(nf-core) marthasudermann@pop-os:~/pathogensurveillance$ nextflow run main.nf --input 'https://raw.githubusercontent.com/grunwaldlab/pathogensurveillance/master/test/data/metadata_PRJNA523365_small.csv' --outdir test_out4 --bakta_db /home/marthasudermann/Software/bakta_db_02_2024/db/ -profile docker -resume

N E X T F L O W ~ version 23.10.1
Launching `main.nf` [confident_engelbart] DSL2 - revision: cc83aa0c27

------------------------------------------------------
nf-core/plantpathsurveil v1.0dev
------------------------------------------------------
Core Nextflow options
  runName        : confident_engelbart
  containerEngine: docker
  launchDir      : /home/marthasudermann/pathogensurveillance
  workDir        : /home/marthasudermann/pathogensurveillance/work
  projectDir     : /home/marthasudermann/pathogensurveillance
  userName       : marthasudermann
  profile        : docker
  configFiles    : /home/marthasudermann/pathogensurveillance/nextflow.config

Input/output options
  input          : https://raw.githubusercontent.com/grunwaldlab/pathogensurveillance/master/test/data/metadata_PRJNA523365_small.csv
  outdir         : test_out4
  bakta_db       : /home/marthasudermann/Software/bakta_db_02_2024/db/

!! Only displaying parameters that differ from the pipeline defaults !!
------------------------------------------------------
If you use nf-core/plantpathsurveil for your analysis please cite:

* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x

* Software dependencies
  https://github.com/nf-core/plantpathsurveil/blob/master/CITATIONS.md
------------------------------------------------------
[f5/6125a3] process > PATHOGENSURVEILLANCE:INPUT_CHECK:SAMPLESHEET_CHECK (metadata_PRJNA523365_small.csv) [100%] 1 of 1, cached: 1 ✔
[21/b6b96f] process > PATHOGENSURVEILLANCE:SRATOOLS_FASTERQDUMP (SRR12574846) [100%] 3 of 3, cached: 3 ✔
[42/f6d4e0] process > PATHOGENSURVEILLANCE:DOWNLOAD_ASSEMBLIES (GCF_017189435_1) [100%] 1 of 1, cached: 1 ✔
[63/a974dd] process > PATHOGENSURVEILLANCE:FASTQC (SRR12574847) [100%] 3 of 3, cached: 3 ✔
[c2/8679ec] process > PATHOGENSURVEILLANCE:COARSE_SAMPLE_TAXONOMY:BBMAP_SENDSKETCH (SRR12574848) [100%] 3 of 3, cached: 3 ✔
[a4/18385c] process > PATHOGENSURVEILLANCE:COARSE_SAMPLE_TAXONOMY:INITIAL_CLASSIFICATION (SRR12574846) [100%] 3 of 3, cached: 3 ✔
[24/1a3d6e] process > PATHOGENSURVEILLANCE:DOWNLOAD_REFERENCES:FIND_ASSEMBLIES (Mycobacteriaceae) [100%] 1 of 1, cached: 1 ✔
[6a/673614] process > PATHOGENSURVEILLANCE:DOWNLOAD_REFERENCES:PICK_ASSEMBLIES (SRR12574846) [100%] 3 of 3, cached: 3 ✔
[1d/f003e6] process > PATHOGENSURVEILLANCE:DOWNLOAD_REFERENCES:DOWNLOAD_ASSEMBLIES (GCF_001677215_1) [100%] 9 of 9, cached: 9 ✔
[65/b881e2] process > PATHOGENSURVEILLANCE:DOWNLOAD_REFERENCES:MAKE_GFF_WITH_FASTA (GCF_001456355_1) [100%] 9 of 9, cached: 9 ✔
[db/0526c0] process > PATHOGENSURVEILLANCE:DOWNLOAD_REFERENCES:SOURMASH_SKETCH_GENOME (GCF_001456355_1) [100%] 9 of 9, cached: 9 ✔
[6f/fce62b] process > PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SUBSET_READS (SRR12574846) [100%] 3 of 3, cached: 3 ✔
[cc/a2f30c] process > PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:KHMER_TRIMLOWABUND (SRR12574847) [100%] 3 of 3, cached: 3 ✔
[9e/09a4cb] process > PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_SKETCH_READS (SRR12574847) [100%] 3 of 3, cached: 3 ✔
[d3/a6b247] process > PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_SKETCH_GENOME (GCF_017189435_1) [100%] 3 of 3, cached: 3 ✔
[- ] process > PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_COMPARE -
[- ] process > PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:ASSIGN_GROUP_REFERENCES -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:REFERENCE_INDEX:PICARD_CREATESEQUENCEDICTIONARY -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:REFERENCE_INDEX:SAMTOOLS_FAIDX -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:REFERENCE_INDEX:BWA_INDEX -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:ALIGN_READS:CALCULATE_DEPTH -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:ALIGN_READS:SUBSET_READS -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:ALIGN_READS:BWA_MEM -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:ALIGN_READS:PICARD_ADDORREPLACEREADGROUPS -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:ALIGN_READS:PICARD_SORTSAM_1 -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:ALIGN_READS:PICARD_MARKDUPLICATES -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:ALIGN_READS:PICARD_SORTSAM_2 -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:ALIGN_READS:SAMTOOLS_INDEX -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:CALL_VARIANTS:MAKE_REGION_FILE -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:CALL_VARIANTS:GRAPHTYPER_GENOTYPE -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:CALL_VARIANTS:GRAPHTYPER_VCFCONCATENATE -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:CALL_VARIANTS:TABIX_TABIX -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:CALL_VARIANTS:BGZIP_MAKE_GZIP -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:CALL_VARIANTS:GATK4_VARIANTFILTRATION -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:CALL_VARIANTS:VCFLIB_VCFFILTER -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:VCF_TO_TAB -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:VCF_TO_SNPALN -
[- ] process > PATHOGENSURVEILLANCE:VARIANT_ANALYSIS:IQTREE2_SNP -
[- ] process > PATHOGENSURVEILLANCE:GENOME_ASSEMBLY:SUBSET_READS -
[- ] process > PATHOGENSURVEILLANCE:GENOME_ASSEMBLY:FASTP -
[- ] process > PATHOGENSURVEILLANCE:GENOME_ASSEMBLY:SPADES -
[- ] process > PATHOGENSURVEILLANCE:GENOME_ASSEMBLY:FILTER_ASSEMBLY -
[- ] process > PATHOGENSURVEILLANCE:GENOME_ASSEMBLY:QUAST -
[- ] process > PATHOGENSURVEILLANCE:GENOME_ASSEMBLY:BAKTA_BAKTA -
[- ] process > PATHOGENSURVEILLANCE:CORE_GENOME_PHYLOGENY:PIRATE -
[- ] process > PATHOGENSURVEILLANCE:CORE_GENOME_PHYLOGENY:REFORMAT_PIRATE_RESULTS -
[- ] process > PATHOGENSURVEILLANCE:CORE_GENOME_PHYLOGENY:ALIGN_FEATURE_SEQUENCES -
[- ] process > PATHOGENSURVEILLANCE:CORE_GENOME_PHYLOGENY:RENAME_CORE_GENE_HEADERS -
[- ] process > PATHOGENSURVEILLANCE:CORE_GENOME_PHYLOGENY:SUBSET_CORE_GENES -
[- ] process > PATHOGENSURVEILLANCE:CORE_GENOME_PHYLOGENY:MAFFT_SMALL -
[- ] process > PATHOGENSURVEILLANCE:CORE_GENOME_PHYLOGENY:IQTREE2_CORE -
[- ] process > PATHOGENSURVEILLANCE:CUSTOM_DUMPSOFTWAREVERSIONS -
[- ] process > PATHOGENSURVEILLANCE:MULTIQC -
[- ] process > PATHOGENSURVEILLANCE:RECORD_MESSAGES -
[- ] process > PATHOGENSURVEILLANCE:MAIN_REPORT -

ERROR ~ Error executing process > 'PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_COMPARE (1)'

Caused by:
  Process `PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_COMPARE` input file name collision -- There are multiple input files for each of the following file names: GCF_017189435_1.sig

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

 -- Check '.nextflow.log' file for details
As a quick follow-up: with a separate test dataset (high_complexity_kpneumoniae), I encountered a second error at the PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_COMPARE step. When I go into the output directory sourmash_sketch_genome, I see signatures for several assemblies and then a final null.sig. The remaining signature files look fine. I didn't specify any user-defined references; my input metadata sheet has only a 'sample_id' column and an 'sra' column.
ERROR ~ Error executing process > 'PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_COMPARE (1)'
Caused by:
Process PATHOGENSURVEILLANCE:ASSIGN_REFERENCES:SOURMASH_COMPARE input file name collision -- There are multiple input files for each of the following file names: null.sig
Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh
-- Check '.nextflow.log' file for details
My main command was as follows (with the test data config and metadata files in the appropriate location):
nextflow run main.nf --input /home/marthasudermann/pathogensurveillance/test/data/metadata_high_complexity_kpneumoniae.csv --outdir test_highcomplexity2 --bakta_db /home/marthasudermann/Software/bakta_db_02_2024/db/ -profile docker -resume
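A file named null.sig may indicate that an identifier was missing when the output name was built, though that is a guess and not something confirmed in the pipeline code. As a rough way to check an output directory for such files, the sketch below scans for `.sig` files whose stem looks like a placeholder. The function name and the list of placeholder stems are assumptions made for illustration.

```python
from pathlib import Path

# Stems that often indicate a missing identifier; this list is an assumption.
PLACEHOLDER_STEMS = {"null", "none", "nan"}


def placeholder_signatures(sig_dir):
    """Return sorted .sig files under sig_dir whose stem looks like a placeholder."""
    return sorted(
        p
        for p in Path(sig_dir).rglob("*.sig")
        if p.stem.strip().lower() in PLACEHOLDER_STEMS
    )


if __name__ == "__main__":
    # Scans the current directory by default; point it at the real output folder instead.
    for path in placeholder_signatures("."):
        print(path)
```

Pointing it at the sourmash_sketch_genome directory inside the run's output directory should list the null.sig file mentioned above; the exact directory layout is assumed here.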
Relevant files
nextflow.log
System information
Nextflow 23.10.1.5891
Desktop
local
Docker
Linux