Merge pull request #585 from mvdbeek/haploid_variant_calling
Add haploid variant calling workflow
mvdbeek authored Oct 29, 2024
2 parents ece941f + 2b9da24 commit 9b4a941
Showing 6 changed files with 877 additions and 0 deletions.
@@ -0,0 +1,11 @@
version: 1.2
workflows:
  - name: main
    subclass: Galaxy
    publish: true
    primaryDescriptorPath: /WGS-PE-variant-calling-in-haploid-system.ga
    testParameterFiles:
      - /WGS-PE-variant-calling-in-haploid-system-tests.yml
    authors:
      - name: Anton Nekrutenko
        orcid: 0000-0002-5987-8032
@@ -0,0 +1,6 @@
# Changelog


## [0.1]

- Initial version of Paired end variant calling in haploid system workflow
22 changes: 22 additions & 0 deletions workflows/variant-calling/haploid-variant-calling-wgs-pe/README.md
@@ -0,0 +1,22 @@
# Haploid variant calling for whole genome sequencing paired end data

This workflow uses Illumina or Element read data to discover variants (single-nucleotide polymorphisms, SNPs, and small indels) in haploid genomes with multiple genomic sequences (contigs, scaffolds, or chromosomes).

## Input datasets

- A list of paired-end FASTQ files
- A GTF file containing the gene annotation for the selected haploid genome
- A FASTA file of the haploid genome to call variants against

## Outputs

- Tab-delimited summary of annotated variants
- Report summarizing the quality of input data and mapping results

## Processing

- Adapters are removed with fastp.
- The filtered reads are aligned with bwa-mem.
- Only properly aligned mate pairs are retained, and PCR duplicates are removed.
- Alignments are realigned using lofreq viterbi, and variants are called with lofreq call.
- Variants are annotated with snpeff eff.
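
The pair-filtering step above can be illustrated with the flag bits from the SAM specification. This is a minimal sketch, not the workflow's actual tool configuration: the commit does not show the filter settings, so the mapping to bits `0x2` (proper pair) and `0x400` (duplicate) is an assumption, and `keep_alignment` is a hypothetical helper name.

```python
# SAM flag bits as defined in the SAM specification.
PROPER_PAIR = 0x2    # each segment properly aligned according to the aligner
DUPLICATE = 0x400    # read marked as a PCR or optical duplicate


def keep_alignment(flag: int) -> bool:
    """Hypothetical filter: keep proper pairs that are not marked as duplicates."""
    return bool(flag & PROPER_PAIR) and not (flag & DUPLICATE)


# Example flags: 99 and 147 are a typical proper pair, 1123 is 99 plus the
# duplicate bit, and 73 is a paired read whose mate is unmapped.
flags = [99, 147, 1123, 73]
kept = [f for f in flags if keep_alignment(f)]
print(kept)
```

Only the first two flags pass, since 1123 carries the duplicate bit and 73 lacks the proper-pair bit.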
@@ -0,0 +1,49 @@
- doc: Test outline for WGS-PE-variant-calling-in-haploid-system
  job:
    Annotation GTF:
      class: File
      location: https://zenodo.org/records/14009320/files/Annotation%20GTF.gtf?download=1
      filetype: gtf
    Genome fasta:
      class: File
      location: https://zenodo.org/records/14009320/files/Genome%20fasta.fasta.gz?download=1
      filetype: fasta.gz
    Paired Collection:
      class: Collection
      collection_type: list:paired
      elements:
      - class: Collection
        type: paired
        identifier: ERR018930
        elements:
        - class: File
          identifier: forward
          location: https://zenodo.org/records/14009320/files/ERR018930_forward.fastqsanger.gz?download=1
        - class: File
          identifier: reverse
          location: https://zenodo.org/records/14009320/files/ERR018930_reverse.fastqsanger.gz?download=1
      - class: Collection
        type: paired
        identifier: ERR1035492
        elements:
        - class: File
          identifier: forward
          location: https://zenodo.org/records/14009320/files/ERR1035492_forward.fastqsanger.gz?download=1
        - class: File
          identifier: reverse
          location: https://zenodo.org/records/14009320/files/ERR1035492_reverse.fastqsanger.gz?download=1
  outputs:
    Annotated Variants:
      path: test-data/Annotated Variants.tabular
    SnpEff variants:
      element_tests:
        ERR018930:
          asserts:
          - has_line:
              line: 'NC_009906.1 3204 . A G 120.0 PASS DP=22;AF=0.727273;SB=2;DP4=2,3,3,14;EFF=INTRAGENIC(MODIFIER|||||PVX_087665||NON_CODING|||G)'
          - has_line:
              line: 'NC_009906.1 3261 . C A 52.0 PASS DP=15;AF=0.333333;SB=0;DP4=3,7,2,3;EFF=INTRAGENIC(MODIFIER|||||PVX_087665||NON_CODING|||A)'
        ERR1035492:
          asserts:
            has_line:
              line: 'NC_009906.1 2975 . A G 75.0 PASS DP=26;AF=0.692308;SB=0;DP4=5,3,12,6;EFF=INTRAGENIC(MODIFIER|||||PVX_087665||NON_CODING|||G)'
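
To make the asserted VCF records easier to read, here is a small sketch that splits one of the lines from the test above into its columns and INFO keys. It is only an illustration: `parse_info` is a hypothetical helper, not part of the workflow or of planemo, and the interpretation of `DP4` as ref-forward, ref-reverse, alt-forward, alt-reverse read counts follows the usual convention for that tag.

```python
def parse_info(info: str) -> dict:
    """Split a VCF INFO column into a key -> value mapping (hypothetical helper)."""
    fields = {}
    for item in info.split(";"):
        # Split on the first "=" only, so values such as EFF=... stay intact.
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


# One of the lines asserted for ERR018930 in the test file above.
line = (
    "NC_009906.1 3261 . C A 52.0 PASS "
    "DP=15;AF=0.333333;SB=0;DP4=3,7,2,3;"
    "EFF=INTRAGENIC(MODIFIER|||||PVX_087665||NON_CODING|||A)"
)

# str.split() handles both tabs (real VCF) and the spaces shown here.
chrom, pos, _id, ref, alt, qual, flt, info = line.split()
fields = parse_info(info)

depth = int(fields["DP"])
allele_freq = float(fields["AF"])
ref_fwd, ref_rev, alt_fwd, alt_rev = (int(x) for x in fields["DP4"].split(","))

print(chrom, pos, ref, alt, depth, allele_freq, alt_fwd + alt_rev)
```

For this record the alternate-supporting reads (2 forward, 3 reverse) account for 5 of the 15 reads, in line with the reported `AF`; the two values are computed by the caller and need not agree exactly for every record.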