neodisambiguate

Disambiguate reads that were mapped to multiple references.

Install with the Conda or Mamba package manager after setting your Bioconda channels:

❯ conda install neodisambiguate

Introduction

Alignment disambiguation is commonly performed on sequencing data from transduction, transfection, transgenic, or xenograft (including patient-derived xenograft) experiments. This tool compares alignment metrics for a template that has been aligned to several different references in order to determine which reference is the most likely source. Disambiguation is performed per template, and information across primary, secondary, and supplementary alignments is used as evidence.

All templates that are positively assigned to a single source reference are written to a reference-specific output BAM file. Any templates with an ambiguous reference assignment are written to an input-specific ambiguous output BAM file.
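
To make the per-template idea concrete, here is a minimal Python sketch. It is not the tool's implementation: the `Alignment` record, the `disambiguate` function, and the use of a single alignment score as the only metric are all assumptions made for illustration. This README does not specify which metrics the tool compares or how they are combined.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Alignment:
    """Hypothetical, simplified alignment record (not the tool's data model)."""
    template: str  # query name shared by every read of the template
    score: int     # assumed metric: an aligner-reported alignment score


def disambiguate(by_reference: Dict[str, List[Alignment]]) -> Optional[str]:
    """Return the best-scoring reference for ONE template, or None if ambiguous.

    `by_reference` maps a reference name to every alignment (primary,
    secondary, and supplementary) of the template against that reference.
    """
    if not by_reference:
        return None
    # Assumed evidence: the single best score seen among all of the
    # template's alignments to each reference.
    best_per_ref = {
        ref: max((a.score for a in alns), default=float("-inf"))
        for ref, alns in by_reference.items()
    }
    best = max(best_per_ref.values())
    winners = [ref for ref, score in best_per_ref.items() if score == best]
    # Exactly one winner is a positive assignment; a tie is ambiguous.
    return winners[0] if len(winners) == 1 else None
```

With this sketch, a template whose best score is 140 against hg38 and 95 against mm10 is assigned to hg38, while equal best scores leave the template ambiguous.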

Only BAMs produced by the Burrows-Wheeler Aligner (bwa) or STAR are currently supported. Input BAMs of any sort order are accepted; however, each is sorted internally by queryname unless it is already in queryname sort order. All output BAM files are written in the same sort order as the input BAM files. Although paired-end reads give the most discriminatory power when disambiguating short-read sequencing data, this tool accepts paired-end, single-end (fragment), and mixed input data.
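
As a rough illustration of the sort-order check described above, the Python sketch below reads the `SO` field from the `@HD` line of a SAM header, as defined by the SAM specification. The function names `sort_order` and `needs_queryname_sort` are hypothetical, and how the tool itself detects sort order is an assumption here.

```python
def sort_order(header_text: str) -> str:
    """Return the SO value from the @HD line of a SAM header, else 'unknown'."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"


def needs_queryname_sort(header_text: str) -> bool:
    """Assumed rule: anything not declared queryname-sorted needs a sort."""
    return sort_order(header_text) != "queryname"
```

The header text for a BAM file can be obtained with `samtools view -H`, so this check can be run before deciding whether a file would need the internal sort.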

Features

Accepts SAM/BAM sources of any sort order
Disambiguates an arbitrary number of BAMs, each aligned to a different reference
Writes the ambiguous alignments to a separate directory
Extensible implementation which supports alternative disambiguation strategies
Benchmarks show high accuracy (see the benchmarks directory of the repository)

Command Line Usage

❯ neodisambiguate -i infile1.bam infile2.bam -o out/disambiguated

Example Usage

To disambiguate templates for sample dna00001 that are aligned to human (A) and mouse (B):

❯ neodisambiguate -i dna00001.A.bam dna00001.B.bam -o out/dna00001 -n hg38 mm10

❯ tree out/
  out/
  ├── ambiguous-alignments/
  │  ├── dna00001.A.ambiguous.bai
  │  ├── dna00001.A.ambiguous.bam
  │  ├── dna00001.B.ambiguous.bai
  │  └── dna00001.B.ambiguous.bam
  ├── dna00001.hg38.bai
  ├── dna00001.hg38.bam
  ├── dna00001.mm10.bai
  └── dna00001.mm10.bam
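
When scripting around the tool, it can help to know which files a run should produce. The Python sketch below derives the expected paths from the output prefix, the input files, and the reference names. The naming rules are inferred only from the single example tree above, and `expected_outputs` is a hypothetical helper, not part of neodisambiguate.

```python
from pathlib import Path
from typing import List


def expected_outputs(prefix: str, inputs: List[str], names: List[str]) -> List[Path]:
    """List the BAM and index files expected for one run (inferred layout)."""
    out = Path(prefix)
    paths: List[Path] = []
    # One BAM and index per reference name, next to the output prefix.
    for name in names:
        for ext in ("bam", "bai"):
            paths.append(out.parent / f"{out.name}.{name}.{ext}")
    # One ambiguous BAM and index per input, named after the input file.
    ambiguous_dir = out.parent / "ambiguous-alignments"
    for infile in inputs:
        stem = Path(infile).stem  # "dna00001.A.bam" -> "dna00001.A"
        for ext in ("bam", "bai"):
            paths.append(ambiguous_dir / f"{stem}.ambiguous.{ext}")
    return paths
```

Calling `expected_outputs("out/dna00001", ["dna00001.A.bam", "dna00001.B.bam"], ["hg38", "mm10"])` yields the eight files shown in the tree, which a pipeline could compare against `Path.exists()` after a run.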

Local Installation

Bootstrap compilation and build the executable with:

❯ ./mill neodisambiguate.executable
❯ ./bin/neodisambiguate --help

Prior Art

This project was inspired by AstraZeneca's disambiguate:

https://github.com/AstraZeneca-NGS/disambiguate
