Skip to content

Commit

Permalink
Add config and help for lofreq call
Browse files Browse the repository at this point in the history
  • Loading branch information
KaiWaldrant committed Feb 5, 2024
1 parent f54af0a commit 4f283b5
Show file tree
Hide file tree
Showing 2 changed files with 288 additions and 0 deletions.
239 changes: 239 additions & 0 deletions src/lofreq/call/config.vsh.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,239 @@
functionality:
name: lofreq_call
description: |
Call variants from a BAM file.
LoFreq* (i.e. LoFreq version 2) is a fast and sensitive variant-caller for inferring SNVs and indels from next-generation sequencing data. It makes full use of base-call qualities and other sources of errors inherent in sequencing (e.g. mapping or base/indel alignment uncertainty), which are usually ignored by other methods or only used for filtering.
LoFreq* can run on almost any type of aligned sequencing data (e.g. Illumina, IonTorrent or Pacbio) since no machine- or sequencing-technology dependent thresholds are used. It automatically adapts to changes in coverage and sequencing quality and can therefore be applied to a variety of data-sets e.g. viral/quasispecies, bacterial, metagenomics or somatic data.
LoFreq* is very sensitive; most notably, it is able to predict variants below the average base-call quality (i.e. sequencing error rate). Each variant call is assigned a p-value which allows for rigorous false positive control. Even though it uses no approximations or heuristics, it is very efficient due to several runtime optimizations and also provides a (pseudo-)parallel implementation. LoFreq* is generic and fast enough to be applied to high-coverage data and large genomes. On a single processor it takes a minute to analyze Dengue genome sequencing data with nearly 4000X coverage, roughly one hour to call SNVs on a 600X coverage E.coli genome and also roughly an hour to run on a 100X coverage human exome dataset.
info:
keywords: [ "variant calling", "low frequancy variant calling", "lofreq", "lofreq/call"]
homepage: https://csb5.github.io/lofreq/
documentation: https://csb5.github.io/lofreq/commands/
reference: doi:10.1093/nar/gks918
license: "MIT"
requirements:
commands: [ lofreq ]
argument_groups:
- name: Inputs
arguments:
- name: --input
type: file
description: |
Input BAM file.
required: true
example: "normal.bam"
- name: Outputs
arguments:
- name: --output
alternatives: -o
type: file
description: |
Vcf output file [- = stdout].
required: true
direction: output
example: "output.vcf"
- name: Arguments
arguments:
- name: --ref
alternatives: -f
type: file
description: |
Indexed reference fasta file (gzip supported) [null].
required: false
example: "reference.fasta"
- name: --region
alternatives: -r
type: string
description: |
Limit calls to this region (chrom:start-end) [null].
required: false
example: "chr1:1000-2000"
- name: --bed
alternatives: -l
type: file
description: |
List of positions (chr pos) or regions (BED) [null].
required: false
example: "regions.bed"
- name: --min_bq
alternatives: -q
type: integer
description: |
Skip any base with baseQ smaller than INT [6].
required: false
example: 6
- name: --min_alt_bq
alternatives: -Q
type: integer
description: |
Skip alternate bases with baseQ smaller than INT [6].
required: false
example: 6
- name: --def_alt_bq
alternatives: -R
type: integer
description: |
Overwrite baseQs of alternate bases (that passed bq filter) with this value (-1: use median ref-bq; 0: keep) [0].
required: false
example: 0
- name: --min_jq
alternatives: -j
type: integer
description: |
Skip any base with joinedQ smaller than INT [0].
example: 0
- name: --min_alt_jq
alternatives: -J
type: integer
description: |
Skip alternate bases with joinedQ smaller than INT [0].
required: false
example: 0
- name: --def_alt_jq
alternatives: -K
type: integer
description: |
Overwrite joinedQs of alternate bases (that passed jq filter) with this value (-1: use median ref-bq; 0: keep) [0].
required: false
example: 0
- name: --no_baq
alternatives: -B
type: boolean_true
description: |
Disable use of base-alignment quality (BAQ).
- name: --no_idaq
alternatives: -A
type: boolean_true
description: |
Don't use IDAQ values (NOT recommended under ANY circumstances other than debugging).
- name: --del_baq
alternatives: -D
type: boolean_true
description: |
Delete pre-existing BAQ values, i.e. compute even if already present in BAM.
- name: --no_ext_baq
alternatives: -e
type: boolean_true
description: |
Use 'normal' BAQ (samtools default) instead of extended BAQ (both computed on the fly if not already present in lb tag).
- name: --min_mq
alternatives: -m
type: integer
description: |
Skip reads with mapping quality smaller than INT [0].
required: false
example: 0
- name: --max_mq
alternatives: -M
type: integer
description: |
Cap mapping quality at INT [255].
required: false
example: 255
- name: --no_mq
alternatives: -N
type: boolean_true
description: |
Don't merge mapping quality in LoFreq's model.
- name: --call_indels
type: boolean_true
description: |
Enable indel calls (note: preprocess your file to include indel alignment qualities!).
- name: --only_indels
type: boolean_true
description: |
Only call indels; no SNVs.
- name: --src_qual
alternatives: -s
type: boolean_true
description: |
Enable computation of source quality.
- name: --ign_vcf
alternatives: -S
type: file
description: |
Ignore variants in this vcf file for source quality computation. Multiple files can be given separated by commas.
required: false
multiple: true
separator: ","
example: "variants.vcf"
- name: --def_nm_q
alternatives: -T
type: integer
description: |
If >= 0, then replace non-match base qualities with this default value [-1].
required: false
example: -1
- name: --sig
alternatives: -a
type: double
description: |
P-Value cutoff / significance level [0.010000]
required: false
example: 0.01
- name: --bonf
alternatives: -b
type: string
description: |
Bonferroni factor. 'dynamic' (increase per actually performed test) or INT ['dynamic']
required: false
example: "dynamic"
- name: --min_cov
alternatives: -C
type: integer
description: |
Test only positions having at least this coverage [1].
(note: without --no-default-filter default filters (incl. coverage) kick in after predictions are done).
required: false
example: 1
- name: --max_depth
alternatives: -d
type: integer
description: |
Cap coverage at this depth [1000000].
required: false
example: 1000000
- name: --illumina_13
type: boolean_true
description: |
Assume the quality is Illumina-1.3-1.7/ASCII+64 encoded.
- name: --use_orphan
type: boolean_true
description: |
Count anomalous read pairs (i.e. where mate is not aligned properly).
- name: --plp_summary_only
type: boolean_true
description: |
No variant calling. Just output pileup summary per column.
- name: --no_default_filter
type: boolean_true
description: |
Don't run default 'lofreq filter' automatically after calling variants.
- name: --force_overwrite
type: boolean_true
description: |
Overwrite any existing output.
- name: --verbose
type: boolean_true
description: |
Be verbose.
- name: --debug
type: boolean_true
description: |
Enable debugging.
resources:
- type: bash_script
path: script.sh
platforms:
- type: docker
image: quay.io/biocontainers/lofreq:2.1.5--py38h794fc9e_10
setup:
- type: docker
run: |
version=$(lofreq version | grep 'version' | sed 's/version: //') && \
echo "lofreq: $version" > /var/software_versions.txt
- type: nextflow

49 changes: 49 additions & 0 deletions src/lofreq/call/help.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
lofreq call: call variants from BAM file

Usage: lofreq call [options] in.bam

Options:
- Reference:
-f | --ref FILE Indexed reference fasta file (gzip supported) [null]
- Output:
-o | --out FILE Vcf output file [- = stdout]
- Regions:
-r | --region STR Limit calls to this region (chrom:start-end) [null]
-l | --bed FILE List of positions (chr pos) or regions (BED) [null]
- Base-call quality:
-q | --min-bq INT Skip any base with baseQ smaller than INT [6]
-Q | --min-alt-bq INT Skip alternate bases with baseQ smaller than INT [6]
-R | --def-alt-bq INT Overwrite baseQs of alternate bases (that passed bq filter) with this value (-1: use median ref-bq; 0: keep) [0]
-j | --min-jq INT Skip any base with joinedQ smaller than INT [0]
-J | --min-alt-jq INT Skip alternate bases with joinedQ smaller than INT [0]
-K | --def-alt-jq INT Overwrite joinedQs of alternate bases (that passed jq filter) with this value (-1: use median ref-bq; 0: keep) [0]
- Base-alignment (BAQ) and indel-aligment (IDAQ) qualities:
-B | --no-baq Disable use of base-alignment quality (BAQ)
-A | --no-idaq Don't use IDAQ values (NOT recommended under ANY circumstances other than debugging)
-D | --del-baq Delete pre-existing BAQ values, i.e. compute even if already present in BAM
-e | --no-ext-baq Use 'normal' BAQ (samtools default) instead of extended BAQ (both computed on the fly if not already present in lb tag)
- Mapping quality:
-m | --min-mq INT Skip reads with mapping quality smaller than INT [0]
-M | --max-mq INT Cap mapping quality at INT [255]
-N | --no-mq Don't merge mapping quality in LoFreq's model
- Indels:
--call-indels Enable indel calls (note: preprocess your file to include indel alignment qualities!)
--only-indels Only call indels; no SNVs
- Source quality:
-s | --src-qual Enable computation of source quality
-S | --ign-vcf FILE Ignore variants in this vcf file for source quality computation. Multiple files can be given separated by commas
-T | --def-nm-q INT If >= 0, then replace non-match base qualities with this default value [-1]
- P-values:
-a | --sig P-Value cutoff / significance level [0.010000]
-b | --bonf Bonferroni factor. 'dynamic' (increase per actually performed test) or INT ['dynamic']
- Misc.:
-C | --min-cov INT Test only positions having at least this coverage [1]
(note: without --no-default-filter default filters (incl. coverage) kick in after predictions are done)
-d | --max-depth INT Cap coverage at this depth [1000000]
--illumina-1.3 Assume the quality is Illumina-1.3-1.7/ASCII+64 encoded
--use-orphan Count anomalous read pairs (i.e. where mate is not aligned properly)
--plp-summary-only No variant calling. Just output pileup summary per column
--no-default-filter Don't run default 'lofreq filter' automatically after calling variants
--force-overwrite Overwrite any existing output
--verbose Be verbose
--debug Enable debugging

0 comments on commit 4f283b5

Please sign in to comment.