erinyoung/needles is a bioinformatics pipeline that downloads the pre-built PopPUNK database for a given taxid and uses it to assign input fasta files to clusters.
The steps are as follows:
- Download the poppunk database for taxid (https://www.bacpop.org/poppunk/)
- Assign fasta files to clusters
- Visualize the results (output for Microreact is the default)
Note
If you are new to Nextflow and nf-core, please refer to this page on how to set up Nextflow. Make sure to test your setup with -profile test before running the workflow on actual data.
First, prepare a samplesheet with your input data that looks as follows:
samplesheet.csv:
sample,fasta
sample1,sample1.fasta
Each row represents a fasta file.
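For a directory with many fasta files, the samplesheet can be generated rather than typed by hand. The following is a minimal sketch, not part of the pipeline: the function name `make_samplesheet` is hypothetical, and it assumes that every input file ends in `.fasta` and that the sample name should be the file name without that extension.

```bash
# Hypothetical helper: write a samplesheet row for every *.fasta file in a directory.
# Assumption: the sample name is the file name without the .fasta extension.
make_samplesheet() {
    fasta_dir="$1"
    out_csv="${2:-samplesheet.csv}"
    # Header row expected by the pipeline
    echo "sample,fasta" > "$out_csv"
    for fasta in "$fasta_dir"/*.fasta; do
        # Skip the unexpanded glob pattern when no files match
        [ -e "$fasta" ] || continue
        echo "$(basename "$fasta" .fasta),$fasta" >> "$out_csv"
    done
}
```

Calling `make_samplesheet fastas samplesheet.csv` (where `fastas` is a placeholder directory name) writes one row per file, in the shell's alphabetical glob order.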
The next step is to select a taxid. A list of PopPUNK databases is available in JSON format at [assets/poppunk_db.json](./assets/poppunk_db.json). The default taxid is 1314 for _Streptococcus pyogenes_.
Current options for taxid:
- "470" : "Acinetobacter baumannii"
- "520" : "Bordetella pertussis"
- "197" : "Campylobacter jejuni"
- "5476" : "Candida albicans"
- "1351" : "Enterococcus faecalis"
- "1352" : "Enterococcus faecium"
- "562" : "Escherichia coli"
- "727" : "Haemophilus influenzae"
- "210" : "Helicobacter pylori"
- "197911" : "Influenza virus"
- "573" : "Klebsiella pneumoniae"
- "446" : "Legionella pneumophila"
- "1639" : "Listeria monocytogenes"
- "36809" : "Mycobacterium abscessus"
- "1773" : "Mycobacterium tuberculosis"
- "485" : "Neisseria gonorrhoeae"
- "487_2" : "Neisseria meningitidis" from "https://doi.org/10.12688/wellcomeopenres.14826.1"
- "487" : "Neisseria meningitidis" from "bacpop/PopPUNK#267"
- "287" : "Pseudomonas aeruginosa"
- "4932" : "Saccharomyces cerevisiae"
- "590" : "Salmonella sp."
- "1280" : "Staphylococcus aureus"
- "40324" : "Stenotrophomonas maltophilia"
- "1311" : "Streptococcus agalactiae"
- "1334" : "Streptococcus dysgalactiae subspecies equisimilis"
- "28037" : "Streptococcus mitis"
- "1313" : "Streptococcus pneumoniae"
- "1314" : "Streptococcus pyogenes"
- "1307" : "Streptococcus suis"
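Before launching a run, it can be useful to confirm that the chosen taxid is one of the options listed above. The sketch below is an illustration only: the function name `taxid_listed` is hypothetical, and it assumes that taxids appear in the JSON file as quoted strings (for example `"1314"`), which this README does not confirm. It only checks that the quoted value occurs somewhere in the file; it does not parse the JSON.

```bash
# Hypothetical helper: succeed if the quoted taxid appears in the database JSON.
# Assumption: taxids are stored as quoted strings such as "1314".
taxid_listed() {
    taxid="$1"
    db_json="${2:-assets/poppunk_db.json}"
    # -F treats the pattern as a fixed string; -q suppresses output.
    # Matching both quotes keeps "487" from matching "487_2".
    grep -F -q "\"${taxid}\"" "$db_json"
}
```

For example, `taxid_listed 1313 && echo "taxid found"` prints the message only when the value is present in the default file path.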
Now, you can run the pipeline using:
nextflow run erinyoung/needles \
-profile <docker/singularity/.../institute> \
--input samplesheet.csv \
--taxid <TAXID> \
--outdir <OUTDIR>
Warning
Please provide pipeline parameters via the CLI or the Nextflow -params-file option. Custom config files, including those provided by the Nextflow -c option, can be used to provide any configuration except for parameters; see docs.
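As an illustration of the -params-file route, the sketch below writes the three parameters from the command above into a YAML file. The file name `params.yaml`, the output directory `results`, and the choice of taxid 1313 are placeholders, and quoting the taxid as a string is an assumption about how the pipeline expects the value.

```bash
# Write a params file holding the same parameters as the CLI example.
# Assumptions: file name, outdir value, and the quoted taxid are placeholders.
cat > params.yaml <<'EOF'
input: samplesheet.csv
taxid: "1313"
outdir: results
EOF

# The pipeline could then be launched with, for example:
# nextflow run erinyoung/needles -profile docker -params-file params.yaml
```

Keeping parameters in a file makes a run easier to repeat, while the profile is still chosen on the command line.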
erinyoung/needles was originally written by Erin Young, and is mostly a wrapper around PopPUNK.
PopPUNK can be cited with the following paper:
Lees JA, Harris SR, Tonkin-Hill G, Gladstone RA, Lo SW, Weiser JN, Corander J, Bentley SD, Croucher NJ. Fast and flexible bacterial genomic epidemiology with PopPUNK. Genome Research 29:304-316 (2019). doi:10.1101/gr.241455.118
This pipeline uses code and infrastructure developed and maintained by the nf-core community, reused here under the MIT license.
The nf-core framework for community-curated bioinformatics pipelines.
Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso & Sven Nahnsen.
Nat Biotechnol. 2020 Feb 13. doi: 10.1038/s41587-020-0439-x.