This method can be used to calculate damage profiles of mapped ancient DNA reads.
Main author: Judith Neukamm
Contributors: Alexander Peltzer, James A. Fellows Yates, and Alexander Hübner.
If you use the tool, please cite the publication:
Neukamm, J., Peltzer, A., & Nieselt, K. (2020). DamageProfiler: Fast damage pattern calculation for ancient DNA. Bioinformatics (btab190). https://doi.org/10.1093/bioinformatics/btab190
DamageProfiler calculates damage profiles of mapped reads and provides both graphical and text-based representations.
It creates:
- damage plots
- fragment length distribution
- read identity distribution
- base frequency table of the reference
- table of different base misincorporations and their occurrences
java -jar DamageProfiler-VERSION.jar -i input_file -o output_folder [options]
Running the jar file without any parameters starts the GUI to configure the run.
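As a minimal sketch of the usage line above, the following script assembles a command-line call and prints it for review. The file names (`sample.bam`, `reference.fasta`, `damageprofiler_out`) and the `DP_JAR` variable are hypothetical placeholders, not names defined by DamageProfiler; `-t 25` and `-l 100` simply restate the documented defaults.

```shell
#!/bin/sh
# Hypothetical placeholders: replace them with your own files.
DP_JAR="${DP_JAR:-DamageProfiler-VERSION.jar}"  # VERSION is left as a placeholder
INPUT="sample.bam"
REFERENCE="reference.fasta"
OUTPUT="damageprofiler_out"

# Keep the arguments in the positional parameters so that they can be
# passed on unchanged with "$@", even if a path contains spaces.
set -- -i "$INPUT" -o "$OUTPUT" -r "$REFERENCE" -t 25 -l 100

# Print the assembled command for review instead of running it.
preview="java -jar $DP_JAR $*"
echo "$preview"

# Uncomment once the placeholders point at real files:
# java -jar "$DP_JAR" "$@"
```

Printing the command first makes it easy to check the options before starting a run on a large alignment file.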
-h, --help
Shows this help page.
-v, --version
Shows the version of DamageProfiler.
-i INPUT
The input sam/bam/cram file (Required).
-o OUTPUT
The output folder (Required).
-r REFERENCE
The reference file (fasta format).
-t THRESHOLD
DamagePlot: Number of bases which are considered for plotting nucleotide misincorporations. Default: 25.
-s SPECIES
Reference sequence name (SN tag of the SAM record). Species must be put in
quotation marks (e.g. -s 'NC_032001.1|tax|1917232|'); multiple species must be comma-separated
(e.g. -s 'NC_032001.1|tax|1917232|,NC_031076.1|tax|1838137|,NC_034267.1|tax|1849328|'). Commas within a reference
sequence name are not allowed. Please specify either -s or -sf.
-sf FILE SPECIES
Text file containing a list of species (reference sequence names, i.e. the SN tag of the SAM record, one per line) for which
the damage profile is to be calculated. Please specify either -s or -sf.
-l LENGTH
Number of bases which are considered for frequency computations. Default: 100.
-title TITLE
Title used for all plots. Default: input filename.
-yaxis_dp_max MAX_VALUE
DamagePlot: Maximal y-axis value.
-color_c_t COLOR_C_T
DamagePlot: Color (HEX code) for C to T misincorporation frequency.
-color_g_a COLOR_G_A
DamagePlot: Color (HEX code) for G to A misincorporation frequency.
-color_insertions COLOR_INSERTIONS
DamagePlot: Color (HEX code) for base insertions.
-color_deletions COLOR_DELETIONS
DamagePlot: Color (HEX code) for base deletions.
-color_other COLOR_OTHER
DamagePlot: Color (HEX code) for other bases that differ from the reference.
-only_merged
Use only mapped and merged reads (in case of paired-end sequencing) to calculate the damage plot instead of all mapped reads. The read name of the SAM/BAM entry must start with 'M_'; otherwise the entry is skipped. Default: false.
-sslib
Single-stranded library protocol was used. Default: false.
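For the -sf option described above, the species list can be derived from the alignment header, where each `@SQ` line carries the reference name in its `SN` tag. The sketch below is one possible way to do this; the helper `list_reference_names`, the file name `species_list.txt`, and the `LN` values in the demo header are assumptions made for illustration and are not part of DamageProfiler.

```shell
#!/bin/sh
# Hypothetical helper: print the SN value of every @SQ header line
# read from standard input, one name per line.
list_reference_names() {
    awk -F '\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# Demo on an inline header. For real data, pipe in the header instead,
# e.g.: samtools view -H sample.bam | list_reference_names
printf '@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:NC_032001.1|tax|1917232|\tLN:1000\n@SQ\tSN:NC_031076.1|tax|1838137|\tLN:2000\n' \
    | list_reference_names > species_list.txt

cat species_list.txt

# The resulting file could then be passed on, for example:
# java -jar DamageProfiler-VERSION.jar -i sample.bam -o output_folder -sf species_list.txt
```

Edit the generated file to keep only the references of interest before passing it to -sf, since the header lists every sequence the reads were mapped against.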
More detailed documentation of DamageProfiler is available at https://damageprofiler.readthedocs.io.