All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- NMD information (e.g., escape rule,...) is now also calculated for all variants
- Added sequence similarity filter for MHC-I
  - self-similarity (using kernel similarity)
  - pathogen similarity (BLAST against pathogen-derived epitopes from IEDB)
  - proteome similarity (BLAST against human proteome)
- Prioritization of neoantigens is now done separately for each variant type (speeds up the process)
- Updated to a recent version of ScanExitron
  - this version uses a recent release of regtools (v0.5.0), which is available on Conda
  - Singularity/Docker is no longer necessary
- Added option to use strand information in exitron calling
- ScanNeo2 now uses conda environments for all tools (ditched Singularity/Docker)
- renamed similarity fields for pathogen and protein to more descriptive names
- Fixed missing alleles in HLA alleles reference list - #34
- Updated transindel environment to a recent samtools version (the --o option, introduced in samtools >= 1.13, is required by transindel)
- Allow combining multiple VCF files in indel detection using mutect2 (e.g., when multiple samples are provided)
- Split rules in HLA typing to ensure better distribution of the workload
- Changed order in HLA typing rules (BAM files are now part of single-end)
  - samtools fastq is only called for BAM files
  - input of filtering is taken directly from preprocessed/raw reads
- Added threads option to samtools sort calls to speed up the process
- Fixed wrong call to optitype within the wrapper script
- Separated samtools, bcftools and realign environments to avoid conflicts
- Changed order of genotyping rules to catch errors when no alleles can be found
- Alleles are merged according to nartype (e.g., DNA, RNA) and then combined
- Force concat of VCF files in genotyping to avoid errors when no variants are found
- Added optitype wrapper to avoid errors when empty BAM files are provided / no HLA reads
- Added routines to catch errors when rnaseq data is not provided but exitron/alternative splicing calling is activated
- Added reference genome index as input to germline indel calling (necessary when only indel calling is activated)
- removed -C from BWA mem call (on DNAseq data) to avoid error on Illumina identifiers
- Wrong indentation in HLAtyping caused error when providing no normal sample (NoneType was being iterated)
- Fixed missing input in get_reads_hlatyping_PE rule (tmp folder) that caused error when using paired-end reads
- Added else case in get_input_hlatyping function (rule get_reads_hlatyping_PE) for input reads when preproc is deactivated
- Added concurrency to splAdder call
- Added routines that let ScanNeo2 finish (even when splAdder results are empty or faulty)
- Added parameter nonchr in reference attribute to exclude non-chromosomal contigs from the reference genome
- Fixed wrong path in quality control for single-end reads
- Conda install of wheel caused an error in the spladder environment
- pysam requires exactly python=3.6
- Removed hlahd path from config and hlatyping - needs to be installed in $PATH
- ScanNeo2 supports Snakemake>=8
  - --use-conda replaced by --software-deployment-method conda
  - --use-singularity replaced by --software-deployment-method apptainer
- Gather/scatter of the indel calling speeds up ScanNeo2 on multiple cores
  - added script to split BAM files by chromosome (scripts/split_bam_by_chr.py)
  - haplotypecaller first/final round is done per chromosome and later merged
  - mutect2 is done per chromosome and later merged
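
The gather/scatter pattern described above can be sketched in plain Python. This is a minimal illustration, not the ScanNeo2 implementation: the record layout `(chrom, pos, payload)` and the functions `scatter_by_chromosome`, `call_variants`, `gather` and `scatter_gather` are hypothetical names chosen for this example, and `call_variants` merely sorts records as a stand-in for a per-chromosome caller such as mutect2 or haplotypecaller.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def scatter_by_chromosome(records):
    """Scatter step: group (chrom, pos, payload) records by chromosome."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[0]].append(rec)
    return dict(groups)


def call_variants(chrom, records):
    """Hypothetical stand-in for a per-chromosome caller.

    Here it only sorts the records of one chromosome by position.
    """
    return sorted(records, key=lambda rec: rec[1])


def gather(results, chrom_order):
    """Gather step: concatenate per-chromosome results in a fixed order.

    Chromosomes missing from chrom_order are not included in the output.
    """
    merged = []
    for chrom in chrom_order:
        merged.extend(results.get(chrom, []))
    return merged


def scatter_gather(records, chrom_order, workers=4):
    """Process each chromosome independently, then merge the results."""
    groups = scatter_by_chromosome(records)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            chrom: pool.submit(call_variants, chrom, recs)
            for chrom, recs in groups.items()
        }
        results = {chrom: fut.result() for chrom, fut in futures.items()}
    return gather(results, chrom_order)
```

Because each chromosome is handled independently, the per-chromosome jobs can run on separate cores, and the merge step only has to concatenate the results in a stable chromosome order.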
- Genotyping MHC-II now works on both single-end and paired-end reads
- User-defined HLA alleles are matched against the hla refset
- Added multiple routines to catch errors when only custom variants are provided
- Added additional parameters in config file
- When using BAM files, the HLA typing wrongly expected single-end reads and performed preprocessing
- Each environment is now thoroughly versioned to ensure interoperability
- Fixed missing immunogenicity calculation on certain values of MHC-I
- Fixed prediction of binding affinity in MHC-II (as the columns are different from MHC-I)
- linked rules for prediction of binding affinities and immunogenicity to input of prioritization
- fixed wrong reference genome in exitron2vcf call (which forced ScanNeo2 to use alternative rules)
- removed redundant rules for alternative genomes (ScanNeo2 now uses ensembl globally)
- added input directive in rule prepare_cds (exitron rules). Makes sure that annotations are present if exitron calling is executed first
- added routines/fixed issues when no normal sample is provided
- Added alternative link for VEP cache to improve download speed
- Added missing scripts to modify the ensembl header
- Modularized rule for long indel detection
- Fixed errors when providing custom input for MHC alleles
- Refactoring of genotyping scripts
- Added more detailed instructions in README
- Comprehensive workflow with different modules to detect variants from sequencing data
- Different modules for each step
- Support data in single-end, paired-end .fastq or BAM files
- preprocessing, alignment
- genotyping
- alternative splicing
- gene fusion, exitron, SNVs and indels