Add cg/wgMLST query pipeline #2

apetkau · 2023-08-19T22:55:54Z

1. Purpose

The purpose of this pipeline is to find genomes that fall within a given distance threshold of each query genome, searching a collection of reference genomes.

2. Input

2.1. Query profiles

The input will consist of cg/wgMLST allele profiles for the queries, plus the reference allele profiles that define the scope of each query. This will be passed via the --input parameter and will look like the following:

querysheet.csv:

identifier	query_allele_profiles	reference_allele_profiles
query1	/path/to/query_allele_profiles	/path/to/reference_allele_profiles
query2	/path/to/query_allele_profiles	/path/to/reference_allele_profiles
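As a rough illustration of how such a sheet could be read, here is a minimal Python sketch. It is not taken from the pipeline: the function name `read_querysheet` is hypothetical, and the comma delimiter is an assumption based on the `.csv` file name, so it is left as a parameter.

```python
import csv
import io

# Column names taken from the querysheet example above.
REQUIRED_COLUMNS = ("identifier", "query_allele_profiles", "reference_allele_profiles")


def read_querysheet(handle, delimiter=","):
    """Return one dict per query row, checking that the expected columns exist.

    Hypothetical helper; the delimiter is an assumption and can be overridden.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"querysheet is missing columns: {missing}")
    return [{c: row[c] for c in REQUIRED_COLUMNS} for row in reader]


# In-memory stand-in for querysheet.csv, using the placeholder paths from the example.
example = io.StringIO(
    "identifier,query_allele_profiles,reference_allele_profiles\n"
    "query1,/path/to/query_allele_profiles,/path/to/reference_allele_profiles\n"
)
rows = read_querysheet(example)
```

Each returned row pairs one query profile file with the reference profile file it should be searched against.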

2.1.1. Allele profiles (CSV)

Allele profiles in CSV format will follow the example below (both uncompressed and gzipped files will be supported).

id	loci1	loci2	...	lociN
SampleA	be76	af5d	ce78	d877a
ID10	af5d	be76	?	d877a
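A minimal Python sketch of reading a profile file in this shape, covering both plain and gzipped input, might look like the following. The names `open_text` and `parse_profiles` are hypothetical, the delimiter is an assumed parameter, `loci3` is a placeholder locus name, and treating `?` as a missing allele call is an assumption drawn from the ID10 row above.

```python
import csv
import gzip
import io

MISSING = "?"  # assumption: "?" marks a missing allele call


def open_text(path):
    """Open a plain or gzipped file as text, chosen by the .gz suffix."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt", newline="")
    return open(path, "r", newline="")


def parse_profiles(handle, delimiter=","):
    """Map each sample id to a {locus: allele-or-None} dict.

    The first column is taken as the sample id; the rest are loci.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    loci = header[1:]
    profiles = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        sample_id, alleles = row[0], row[1:]
        if len(alleles) != len(loci):
            raise ValueError(f"{sample_id}: expected {len(loci)} alleles, got {len(alleles)}")
        profiles[sample_id] = {
            locus: (None if allele == MISSING else allele)
            for locus, allele in zip(loci, alleles)
        }
    return profiles


# In-memory stand-in for a profile file; a real file would go through open_text().
example = io.StringIO(
    "id,loci1,loci2,loci3\n"
    "SampleA,be76,af5d,ce78\n"
    "ID10,af5d,be76,?\n"
)
profiles = parse_profiles(example)
```

Keeping missing calls as `None` rather than as the literal `?` string lets later steps decide explicitly how to handle them.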

3. Steps

3.1. Perform query

For each query listed in the input, the pipeline will search the reference profiles for genomes within a particular distance threshold. This will use https://github.com/phac-nml/profile_dists.
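To make the idea of a threshold query concrete, here is a small Python sketch. It does not use or reproduce profile_dists: the function names are hypothetical, and the choices to count only loci where both profiles have a call, and to treat the threshold as inclusive, are assumptions made for illustration. The locus names and the `RefB` sample are made up.

```python
def hamming_distance(profile_a, profile_b):
    """Count shared loci where both profiles have a call and the alleles differ.

    Assumption: loci with a missing call (None) in either profile are ignored.
    """
    shared = profile_a.keys() & profile_b.keys()
    return sum(
        1
        for locus in shared
        if profile_a[locus] is not None
        and profile_b[locus] is not None
        and profile_a[locus] != profile_b[locus]
    )


def query_within_threshold(queries, references, threshold):
    """For each query id, list (reference id, distance) pairs with distance <= threshold."""
    hits = {}
    for query_id, query_profile in queries.items():
        matches = [
            (ref_id, hamming_distance(query_profile, ref_profile))
            for ref_id, ref_profile in references.items()
        ]
        # Closest references first; ties broken by id for a stable order.
        hits[query_id] = sorted(
            (m for m in matches if m[1] <= threshold),
            key=lambda m: (m[1], m[0]),
        )
    return hits


# Illustrative data only: allele values borrowed from the profile example.
queries = {
    "SampleA": {"loci1": "be76", "loci2": "af5d", "loci3": "ce78", "loci4": "d877a"},
}
references = {
    "ID10": {"loci1": "af5d", "loci2": "be76", "loci3": None, "loci4": "d877a"},
    "RefB": {"loci1": "be76", "loci2": "af5d", "loci3": "ce78", "loci4": "d877a"},
}
hits = query_within_threshold(queries, references, threshold=2)
```

With these made-up profiles, SampleA and ID10 differ at two called loci, so ID10 is reported at a threshold of 2 but not at a threshold of 1.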

4. Output

The following output will be provided. It will be described by an output.json file with the following top-level structure:

{
    "files": { ... },
    "sample_metadata": { ... },
    "execution_metadata": { ... }
}
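A consumer of output.json could check the top-level structure with something like the Python sketch below. The function name `check_output_structure` is hypothetical, and the contents of each section are left empty because they are not specified here.

```python
import json

# Top-level keys taken from the structure above.
EXPECTED_KEYS = ("files", "sample_metadata", "execution_metadata")


def check_output_structure(text):
    """Parse output.json text and confirm the three top-level objects exist."""
    data = json.loads(text)  # strict JSON: trailing or missing commas raise an error
    if not isinstance(data, dict):
        raise ValueError("output.json must contain a JSON object")
    missing = [key for key in EXPECTED_KEYS if key not in data]
    if missing:
        raise ValueError(f"output.json is missing keys: {missing}")
    not_objects = [key for key in EXPECTED_KEYS if not isinstance(data[key], dict)]
    if not_objects:
        raise ValueError(f"expected JSON objects for: {not_objects}")
    return data


# Skeleton with empty placeholder sections, for illustration only.
skeleton = json.dumps({key: {} for key in EXPECTED_KEYS}, indent=4)
checked = check_output_structure(skeleton)
```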

apetkau · 2023-08-20T00:30:18Z

Example implementation is at https://github.com/apetkau/nf-core-queryprofiles

If you have Nextflow and Docker installed, this can be run directly from GitHub with:

nextflow run https://github.com/apetkau/nf-core-queryprofiles -profile docker,test -r dev --outdir results

apetkau added the "pipeline" label (An issue describing a pipeline) on Aug 19, 2023
