From 625db43439145fea4a4d0ecbad82c7657a22073f Mon Sep 17 00:00:00 2001
From: sulagnadg
Date: Wed, 28 Aug 2024 20:02:17 +0200
Subject: [PATCH] Update needleman.md

---
 bibook/pairwise/needleman.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/bibook/pairwise/needleman.md b/bibook/pairwise/needleman.md
index abc435d..fe44dd9 100644
--- a/bibook/pairwise/needleman.md
+++ b/bibook/pairwise/needleman.md
@@ -167,13 +167,13 @@ align: left
 
 Big-O notation is used in computational science for describing how the running time or memory usage of an algorithm scales with a given factor. E.g. if we expect the running time to scale as $g(x)$ we write that the algorithm has complexity $\mathcal{O}(g(x))$. A more formal definition can be found at [wikipedia](https://en.wikipedia.org/wiki/Big_O_notation).
 
-In the case of Needelman-Wunch we see that the number of calculations needed are proportional to the size of the dynamic programming matrix, which equals the product of the lengths of the sequences, M x N. This results in a time complexity of $ \mathcal{O}(MN) $, indicating that the time to complete the task scales proportionally with the product of the lengths of the two sequences.
+In the case of Needleman-Wunsch we see that the number of calculations needed are proportional to the size of the dynamic programming matrix, which equals the product of the lengths of the sequences, M x N. This results in a time complexity of $ \mathcal{O}(MN) $, indicating that the time to complete the task scales proportionally with the product of the lengths of the two sequences.
 
 In the same way memory usage also scales with $ \mathcal{O}(MN)$, as the scoring matrix used to store intermediate results requires memory proportional to its size.
 
 Big-O notation serves as a quick and effective tool for comparing different algorithms. For example, it allows us to see at a glance how the Needleman-Wunsch algorithm compares to other sequence alignment algorithms in terms of efficiency.
 
-A useful comparison is the complexity of our initial proposition, to enumerate and calculate the scores for all possible alignments of two sequences. This can be done by calculating the number of alignments with $k$ matches/mis-matches between the two sequences which is ${M \choose k}{N \choose k}$. If we asume that $N>M$ and sum this for all possible values of $k$, we get $\sum_{k=0}^M{M \choose k}{N \choose k}=\sum_{k=0}^M{M \choose M-k}{N \choose k}={N+M \choose M}=\frac{(M+N)!}{M!*N!}$ number of different aligments. This can be [shown](https://math.stackexchange.com/a/4134185) to follow $\mathcal{O}((\frac{e(N+M)}{M})^M)$ {cite}`lange2002mathematical, eddy2004dynamic`.
+A useful comparison is the complexity of our initial proposition, to enumerate and calculate the scores for all possible alignments of two sequences. This can be done by calculating the number of alignments with $k$ matches/mis-matches between the two sequences which is ${M \choose k}{N \choose k}$. If we assume that $N>M$ and sum this for all possible values of $k$, we get $\sum_{k=0}^M{M \choose k}{N \choose k}=\sum_{k=0}^M{M \choose M-k}{N \choose k}={N+M \choose M}=\frac{(M+N)!}{M!*N!}$ number of different alignments. This can be [shown](https://math.stackexchange.com/a/4134185) to follow $\mathcal{O}((\frac{e(N+M)}{M})^M)$ {cite}`lange2002mathematical, eddy2004dynamic`.
 
 ```{bibliography}
 :filter: docname in docnames
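
Note (not part of the patch above): a minimal Python sketch of the scoring-matrix fill that the changed paragraphs describe, assuming a simple +1/-1/-1 match/mismatch/gap scheme; the function name `needleman_wunsch_score` and the example sequences are illustrative only. Every one of the $(M+1)\times(N+1)$ cells is computed exactly once, which is why both time and memory scale as $\mathcal{O}(MN)$.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    m, n = len(a), len(b)
    # (m+1) x (n+1) scoring matrix -> O(MN) memory
    s = [[0] * (n + 1) for _ in range(m + 1)]
    # first column and first row: accumulated gap penalties
    for i in range(1, m + 1):
        s[i][0] = s[i - 1][0] + gap
    for j in range(1, n + 1):
        s[0][j] = s[0][j - 1] + gap
    # one pass over all M x N interior cells -> O(MN) time
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diag = s[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            s[i][j] = max(diag, s[i - 1][j] + gap, s[i][j - 1] + gap)
    return s[m][n]

# Example sequences (illustrative only): optimal global alignment score is 0
# under the assumed scoring scheme.
print(needleman_wunsch_score("GCATGCU", "GATTACA"))
```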