Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse settings #101

Merged
merged 23 commits into from
Nov 22, 2023
Merged

Parse settings #101

merged 23 commits into from
Nov 22, 2023

Conversation

enryH
Copy link
Member

@enryH enryH commented Sep 25, 2023

Aim: Parse parameter files of tools and map these to selected parameters for a given task for tool comparison

  • Started with mqpar.xml files.
  • Mappings of keys between tool parameter dictionaries and selected parameters needs to be crafted per task

Issue: #58

Henry added 7 commits September 25, 2023 11:35
- allow longer lines
- code was not black8 formatted (which is mentioned in dev
  dependencies)
- later: map selected entries per task (needs to be done)
- changed to pytest (as it is run in a github action)
py37 support will be drop soon
- potential: allow to combine MaxQuant parameter files
- easy to inspect csv parameter file

ToDo:
- could update the 4th index level to reflect some of the groups
 -> see comments in maxquant.py
Henry Webel added 3 commits October 21, 2023 14:19
- new dependency for excel file reading
- three excel sheets contain information needed
@enryH enryH linked an issue Oct 27, 2023 that may be closed by this pull request
missed_cleavages: Optional[str] = None # allowed_miscleavages
min_pep_length: Optional[str] = None # min_peptide_length
max_pep_length: Optional[str] = None # max_peptide_length
fixed_modifications: Optional[str] = None # fixex_mods
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be 'fixed_mods'

fixed_modifications: Optional[str] = None # fixex_mods
variable_modifications: Optional[str] = None # variable_mods
max_num_modifications: Optional[str] = None # max_mods
precursor_charge: Optional[str] = None # min_precursor_charge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we don't enter all the possible precursor charges, then we should have min_ and max_precursor_charge

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the maximum can be unlimited in some tools?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure. I guess that if this happens, we'll have a missing value there.
What I meant was more that in cases like in MQ, where they give all the charges that they consider, we need to get the min. and max. No?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MQ specifies a range of min. and max. charge. We only have a problem if a tools would e.g. allow peptides with charges 2,3 and 5, but excludes charges with 4. But I guess this is quite uncommon?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, you are absolutely right. There could be an issue in this case. I don't like this.
Do you think that it is/was/should be discussed for the SDRF-extended format? Because this problem can also arise for them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case yes, as we should not diverge from SDRF-extended.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two parameters missing, and I did not find the correspondance in the SDRF doc. But maybe I missed it:
'software_name' for the software tool (that combines a search engine and some algorithms for quantification)
'search_engine_version'

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, me neither. I guess this is something SDRF does intend to have.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you check with Veit? Or anybody else? I think that we don't take any risk any way if we name them this way, No?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, we can decide the name for parameters specific to our needs. SDRF is independent of software tools, so it would not make sense to have this kind of information there.

Henry added 2 commits November 11, 2023 17:09
- Update ProteBenchParameters to reflect SDRF extended
  naming
- Update Proline parameter extraction

Could update DataPoint where it matches 1:1 old and new,
but external changes needed first:
https://github.com/Proteobench/Results_Module2_quant_DDA
- test uses external git repo for validation...
- most selected parameters are easy to get
- differences between version 1.6 and higher to previous (1.5)
- fragment_mass_tolerance
   -> which fragenation method was used?
   -> missing information in extracted data for v1.5
@enryH
Copy link
Member Author

enryH commented Nov 13, 2023

One more question regarding fragement_mass_tolerance. Which fractionation technique was used? MQ seems to save several settings for different methods.

And it also has two version in our 3 xml files

1.6.0.0 and higher.
https://github.com/Proteobench/ProteoBench/blob/682825d27f5fb85692e6f2884b40d79334234920/test/params/mqpar_MQ1.6.3.3_MBR.xml#L384C1-L465C31

And before:
https://github.com/Proteobench/ProteoBench/blob/682825d27f5fb85692e6f2884b40d79334234920/test/params/mqpar1.5.3.30_MBR.xml#L220C1-L249C22

@mlocardpaulet
Copy link
Contributor

One more question regarding fragement_mass_tolerance. Which fractionation technique was used? MQ seems to save several settings for different methods.

And it also has two version in our 3 xml files

1.6.0.0 and higher. https://github.com/Proteobench/ProteoBench/blob/682825d27f5fb85692e6f2884b40d79334234920/test/params/mqpar_MQ1.6.3.3_MBR.xml#L384C1-L465C31

And before: https://github.com/Proteobench/ProteoBench/blob/682825d27f5fb85692e6f2884b40d79334234920/test/params/mqpar1.5.3.30_MBR.xml#L220C1-L249C22

Maybe to check with Bart, but I'd say that you take the "FTMS" params. Is it what you asked?

@enryH
Copy link
Member Author

enryH commented Nov 14, 2023

Yes, I would like to have the information double check w.r.t to all tolerances. If it is FTMS, is this also be noted somewhere in the parameter file?
@brvpuyve Could you check which tolerances you would select for reporting in the two above mentioned MQ parameter files?

@mlocardpaulet
Copy link
Contributor

Yes, I would like to have the information double check w.r.t to all tolerances. If it is FTMS, is this also be noted somewhere in the parameter file? @brvpuyve Could you check which tolerances you would select for reporting in the two above mentioned MQ parameter files?

MQ will look for the in formation in the raw file (for the FTMS). And here we used an orbitrap so this should be it.

Henry added 3 commits November 19, 2023 17:39
- from 1.6 onwords, information is given explicitly
- MQ stores several settings in parameter file, which are then applied
  based on information stored in the rawfile metadata
- see open PR #95
@enryH enryH marked this pull request as ready for review November 22, 2023 09:22
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this not specifically removed by @RalfG?

@RobbinBouwmeester RobbinBouwmeester merged commit 902dd1b into main Nov 22, 2023
4 checks passed
@RobbinBouwmeester RobbinBouwmeester deleted the parse_settings branch November 22, 2023 14:33
@enryH
Copy link
Member Author

enryH commented Nov 22, 2023

No you were fast:) I'll add one test later, but this is only to keep track of changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement support for uploading (and storing) workflow configuration files
3 participants