This is a repository of all the algorithms covered in the Bioinformatics Course part of the Cambridge Computer Science Tripos
For some explanations, check out The Moon
- Needleman-Wunsch
- Calculate LCS and Edit Distance using this approach
- Waterman-Smith
- Affine Gap Model
- Nussinov RNA Folding
- Space Efficient Global Alignment (#todo cleanup)
- Method of Four Russians for LCS
- Extend to Edit Distance, Block Alignment and Global Alignment (for very simple score matrices)
- The CLUSTAL W Mutliple Alignment
- Limb Length - O(n^2) and O(n) solutions
- Additive Phylogeny
- UPGMA - O(N^2 log(n)) solution using priority queue
- Neighbour Joining
- Small Parsimony
- Greedy Heuristic for Large Parsimony
- K-mer Composition
- Overlap Graph Problem
- De Bruijn Graph - create from string, kmers or paired kmers
- Eulerian Problems - finding eulerian cycles and paths.
- Reconstruct Genome / Sequence - reconstruct genome from genome path (path through a De Bruijn Graph), kmers and paired kmers. Construct a k-universal circular string.
- Farthest-Clustering - gets the set of centres found iteratively via max dist of (min dist to a centre)
- Distortion Metric
- Hard K-Means / Lloyd Algorithm
- Soft K-Means - using the parition function with stiffness
- [Hierarchical Clustering]
- [Markov Clustering Algorithm]
- Tries and Suffix Tries - the suffix trie implemented is not the O(n) version which requires pointers etc. It is O(n^2).
- Burrows-Wheeler Transform and Matching
Hidden Markov Models
- HMM Evaluate - find probs of hidden path or visible path given hidden path
- HMM Decode/Viterbi - Viterbi
- HMM Forward
- HMM Backward
- Viterbi Learning
- Baum-Welch
- Demos - A list of functions that can be used to demo the algorithms
- Scoring Functions - Stores different scoring matrices encapsulated into a function to use in alignment problems
- Alignment-Graph- A class for representing alignment/edit graphs
- Rosalind - Answers to the Rosalind Questions