Example code to read .mgf file #44

GuptaVishu2002 · 2023-12-10T04:06:15Z

Hi, I would like to know how to read and preprocess a .mgf file using the package. Could you please provide example code for that, so that the result can then be passed on to other package functions such as Encoder and Transformer? Thank you.

bittremieux · 2023-12-11T08:12:32Z

You can read and parse spectra from MGF files using the Dataset functionality. It's as straightforward as this:

from depthcharge.data import SpectrumDataset

dataset = SpectrumDataset("my_file.mgf", "my_file.lance")

This can then be used as any general PyTorch dataset to provide to your model for training, validation, or testing. How you do this specifically depends on how you use PyTorch, Lightning, etc.

Note that the API is currently in heavy development, so there are some breaking changes between various DepthCharge versions. The Lance integration is included in the development version if you install from GitHub, but not in the latest release on PyPI yet.
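To make clear what kind of information an MGF file carries before it reaches the dataset, here is a minimal standard-library sketch of reading one by hand. It is not DepthCharge's parser and is no substitute for `SpectrumDataset`. The function names `read_mgf` and `precursor_info` are hypothetical, and the sketch assumes the common layout of `BEGIN IONS` / `END IONS` blocks containing `KEY=value` lines (such as `PEPMASS` and `CHARGE`) followed by whitespace-separated m/z and intensity pairs.

```python
import io


def read_mgf(handle):
    """Yield one dict per spectrum from an open MGF text stream."""
    spectrum = None
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line == "BEGIN IONS":
            spectrum = {"params": {}, "mz": [], "intensity": []}
        elif line == "END IONS":
            if spectrum is not None:
                yield spectrum
            spectrum = None
        elif spectrum is None:
            continue  # anything outside a spectrum block is ignored here
        elif "=" in line and not line[0].isdigit():
            # Parameter line such as TITLE=..., PEPMASS=..., CHARGE=...
            key, value = line.split("=", 1)
            spectrum["params"][key.upper()] = value
        else:
            # Peak line: m/z followed by an optional intensity
            fields = line.split()
            spectrum["mz"].append(float(fields[0]))
            spectrum["intensity"].append(
                float(fields[1]) if len(fields) > 1 else 0.0
            )


def precursor_info(spectrum):
    """Return (precursor_mz, precursor_charge) parsed from the parameters."""
    params = spectrum["params"]
    precursor_mz = float(params["PEPMASS"].split()[0])
    charge_text = params.get("CHARGE", "").strip()
    sign = -1 if charge_text.endswith("-") else 1
    digits = charge_text.strip("+-")
    precursor_charge = sign * int(digits) if digits else None
    return precursor_mz, precursor_charge


# A tiny in-memory file standing in for "my_file.mgf"
example = io.StringIO(
    "BEGIN IONS\n"
    "TITLE=demo spectrum\n"
    "PEPMASS=445.12 1000.0\n"
    "CHARGE=2+\n"
    "100.1 10.0\n"
    "200.2 20.0\n"
    "END IONS\n"
)
spectra = list(read_mgf(example))
precursor = precursor_info(spectra[0])
```

For a real file, the same generator could be fed with `open("my_file.mgf")` in place of the `io.StringIO` object. For actual model training, the `SpectrumDataset` route is the one to use, since it handles storage and batching.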

wfondrie · 2023-12-11T17:19:31Z

Hi @GuptaVishu2002 - I'm planning the next release for after #43 is reviewed and merged, and I'm working on documentation this week. Stay tuned!

GuptaVishu2002 · 2023-12-11T18:38:06Z

Hi @bittremieux, @wfondrie - thank you very much for the reply. Looking forward to the updates.

GuptaVishu2002 · 2024-01-23T03:43:09Z

Hi @bittremieux @wfondrie, I hope you are doing well. Would it be possible for you to share sample code showing the recommended way to incorporate arbitrary information (such as precursor_mz, precursor_charge) into the spectrum representation for the transformer (by subclassing the SpectrumTransformerEncoder class and overriding the precursor_hook() method)? Thank you.

wfondrie · 2024-01-23T17:00:31Z

Hi @GuptaVishu2002 - sorry for the delay! We're still trying to merge a major PR, then I'll get cracking on refreshed and more detailed documentation. Thanks.

For now, the best place to learn how to use the precursor is to look at the unit tests:

depthcharge/tests/unit_tests/test_transformers/test_spectrum_transformers.py

Lines 46 to 68 in bd2861f

    def test_precursor_hook(batch):
        """Test that the hook works."""

        class MyEncoder(SpectrumTransformerEncoder):
            """A silly class."""

            def precursor_hook(self, mz_array, intensity_array, **kwargs):
                """A silly hook."""
                return kwargs["charge"].expand(self.d_model, -1).T

        model1 = MyEncoder(8, 1, 12)
        emb1, mask1 = model1(**batch)
        assert emb1.shape == (2, 4, 8)
        assert mask1.sum() == 1

        model2 = SpectrumTransformerEncoder(8, 1, 12)
        emb2, mask2 = model2(**batch)
        assert emb2.shape == (2, 4, 8)
        assert mask2.sum() == 1

        for elem in zip(emb1.flatten(), emb2.flatten()):
            if elem:
                assert elem[0] != elem[1]

wfondrie · 2024-04-19T22:33:13Z

I haven't specifically added how to read an MGF file, but I just added documentation in #47 about working with mass spec data in general. Have a look and let me know if you have other questions!

wfondrie closed this as completed Apr 19, 2024
