Add the aggregate module and mokapot postprocessing. #16

Draft
wants to merge 47 commits into
base: main
Changes from 1 commit
Commits
62d40e2
(wip) changed ims collapsing
Mar 6, 2023
d6946f7
(wip) changed parameters on IMS collapsing and deisotoping
Mar 9, 2023
7581b30
(wip) fixed bug where empty spectra would crash the program
Mar 9, 2023
fd8d2e1
(wip) further improvememet on timstof data
Mar 13, 2023
e6518da
(wip) changed location for deisotoping
Mar 16, 2023
ec45f98
(wip) adding precursor information to the searches
Mar 22, 2023
7fd636a
new deisotoping abstractions
Apr 5, 2023
d37cdd4
initial rt model
cia23 Apr 11, 2023
9502f4d
(wip,bugfix) changed neighborhood search and fixed splitting of TIMS …
Apr 18, 2023
978e618
(bugfix) fixed bug on deisotoping indexing
Apr 20, 2023
bb7a4ce
(wip) progress todards merge
Apr 25, 2023
b703734
(wip) ruff compliance
Apr 25, 2023
2018ea0
(wip) ruff compliance
Apr 26, 2023
12c57b8
(wip) prettyfied deisotoping testing
Apr 26, 2023
ffc1ccb
added ims and rt info to the output file
Apr 26, 2023
f5c1b8d
removed some dead code
Apr 26, 2023
fb20fec
Added run-level mokapot and updated rt
wfondrie Apr 26, 2023
54684f9
better top rank handling
jspaezp Apr 26, 2023
6591a3c
fixed unit testing with the merge
jspaezp Apr 26, 2023
53ebe2d
remove search_utils.py
wfondrie Apr 26, 2023
07537e1
removed legacy make pin function
jspaezp Apr 26, 2023
20de910
Add sklearn
wfondrie Apr 26, 2023
f13ff27
Merge branch 'aggregate' of github.com:TalusBio/diadem into aggregate
jspaezp Apr 26, 2023
8807b12
Aggregate into bruker branch (#17)
jspaezp Apr 26, 2023
5b9cf05
Merge branch 'feature/bruker_support_alt_deisotope' of github.com:Tal…
jspaezp Apr 26, 2023
552aefd
Updated imputer models and added tests
wfondrie Apr 27, 2023
5b25cf5
Merge branch 'aggregate' of github.com:TalusBio/diadem into aggregate
wfondrie Apr 27, 2023
e2d40b4
partial migration to polars and partial implementation of caching
jspaezp Apr 28, 2023
d4ec9e6
further additions to spectrum caching
jspaezp Apr 28, 2023
dd25333
Add decoys
wfondrie May 1, 2023
2d03ef8
(wip,bugfix,feature) initial addition of dvc workflows for benchmarki…
jspaezp May 1, 2023
797b14e
Drafted global FDR and alignment
wfondrie May 3, 2023
2be6891
Update to work with dataframe inputs as well
wfondrie May 3, 2023
8132ff3
(wip) addition of fragment position information
jspaezp May 8, 2023
43bcc05
Added base interface
wfondrie May 8, 2023
ca7a5eb
Fixed interface tests
wfondrie May 8, 2023
78a4f68
moved test
wfondrie May 8, 2023
977c9a3
added correlation filtering, parquet caching and improvement of data…
jspaezp May 11, 2023
6d77450
updated dvc
jspaezp May 11, 2023
e9e6781
updated dvc run info
jspaezp May 11, 2023
0b3d434
updated dvc run info
jspaezp May 11, 2023
510b755
minor docsting change
jspaezp May 11, 2023
1c16fc6
updated run info
jspaezp May 11, 2023
bd4aab4
updated metrics and fixed mzml
jspaezp May 13, 2023
77a5dc0
Updated base interface
wfondrie May 13, 2023
f932eab
Merge branch 'aggregate' into feature/bruker_support_alt_deisotope
wfondrie May 13, 2023
dac6245
Merge pull request #22 from TalusBio/feature/bruker_support_alt_deiso…
wfondrie May 13, 2023
14 changes: 12 additions & 2 deletions diadem/search/mokapot.py
@@ -7,13 +7,15 @@
 import mokapot
 import pandas as pd

+from diadem.config import DiademConfig
 from diadem.index.protein_index import ProteinNGram


 def brew_run(
     results: pd.DataFrame,
     fasta_path: PathLike,
     ms_data_path: PathLike,
+    config: DiademConfig,
 ) -> pd.DataFrame:
     """Prepare the result DataFrame for mokapot.

@@ -25,6 +27,8 @@ def brew_run(
         The FASTA file that was used for the search.
     ms_data_path : PathLike
         The mass spectrometry data file that was searched.
+    config : DiademConfig
+        The configuration setting.

     Returns
     -------
@@ -50,8 +54,14 @@ def brew_run(
         filename_column="filename",
         copy_data=False,
     )
-    results = mokapot.brew(peptides)
-    return results.peptides
+
+    model = mokapot.PercolatorModel(train_fdr=config.train_fdr)
+    results = mokapot.brew(peptides, model=model, test_fdr=config.eval_fdr)
+    targets = results.confidence_estimates["peptides"]
+    decoys = results.decoy_confidence_estimates["peptides"]
+    targets["is_target"] = True
+    decoys["is_target"] = False
+    return pd.concat([targets, decoys], axis=0)


def _prepare_df(
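
Note on the changed return value: `brew_run` now combines the target and decoy peptide tables and flags each row with `is_target`. Below is a minimal pandas-only sketch of that pattern, assuming small made-up tables in place of the mokapot confidence estimates. The helper name `label_and_stack` and the column names `peptide` and `score` are hypothetical and are not part of diadem or mokapot. Row-wise concatenation (`axis=0`) is assumed to be the intent, since each row carries its own flag.

```python
import pandas as pd


def label_and_stack(targets: pd.DataFrame, decoys: pd.DataFrame) -> pd.DataFrame:
    """Flag each table with `is_target` and stack the two row-wise."""
    # assign() returns copies, so the caller's frames are left unmodified.
    targets = targets.assign(is_target=True)
    decoys = decoys.assign(is_target=False)
    # axis=0 appends rows; axis=1 would place the tables side by side
    # and duplicate every column name.
    return pd.concat([targets, decoys], axis=0, ignore_index=True)


# Hypothetical stand-ins for the target and decoy peptide tables.
targets = pd.DataFrame({"peptide": ["PEPTIDEK", "LESLIEK"], "score": [3.2, 1.1]})
decoys = pd.DataFrame({"peptide": ["KEDITPEP"], "score": [0.4]})

combined = label_and_stack(targets, decoys)
print(combined)
```

With `axis=0` the result has one row per peptide and a single `is_target` column, so downstream code can split targets from decoys with a boolean mask.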