
PTM Dephosphorylation Prediction Tool #1525

Open · wants to merge 21 commits into master

Conversation


@haibkhn commented Oct 16, 2024

This pull request introduces a new tool for predicting dephosphorylation. Key features include:

  • Two modes of operation:

    1. Manual hyperparameter selection
    2. Automated hyperparameter search using Optuna (a sketch of this mode follows the list below)
  • Support for three protein language model variants:

    • ProtT5-XL-UniRef50
    • ESM
    • ProtT5-XL-BFD
  • Additional hyperparameter search option:

    • SMAC functionality is included, but note that the latest SMAC version (2.2.0) is not yet available on Anaconda
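
For orientation, here is a minimal sketch of what the automated Optuna mode does conceptually. The parameter names and the train_and_evaluate placeholder are illustrative, not the tool's actual code:

```python
# Illustrative sketch of an Optuna-driven hyperparameter search;
# train_and_evaluate() is a hypothetical placeholder, not the PR's code.
import optuna

def train_and_evaluate(lr: float, batch_size: int, dropout: float) -> float:
    # Placeholder: fine-tune the model with these hyperparameters and
    # return a validation metric. A dummy score keeps the sketch runnable.
    return 1.0 - abs(lr - 1e-4) - dropout * 0.1

def objective(trial: optuna.Trial) -> float:
    # Sample each hyperparameter from a plausible search space.
    lr = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    batch_size = trial.suggest_categorical("batch_size", [16, 32, 64])
    dropout = trial.suggest_float("dropout", 0.1, 0.5)
    return train_and_evaluate(lr, batch_size, dropout)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=20)
print(study.best_params)
```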

@anuprulez Please review these changes and let me know if any modifications are needed.

@anuprulez (Contributor)

@haibkhn thanks for the PR. I will have a look and provide my feedback.

@anuprulez (Contributor)

Here are a few obvious comments to start with:

ping @haibkhn
Maybe @bgruening also has more comments. I will look into the details, probably at the end of this week or next week.

@bgruening (Owner) left a comment:

please add a .shed.yml file
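
For reference, a minimal .shed.yml in the style used by other tools in this repository might look like the sketch below; the name, owner, descriptions, and categories are placeholder guesses, not final metadata:

```yaml
# Placeholder sketch of a minimal .shed.yml; all values are guesses.
name: dephosphorylation_prediction
owner: bgruening
description: Predict dephosphorylation sites using protein language models
long_description: |
  Fine-tunes protein language models (ProtT5-XL-UniRef50, ESM,
  ProtT5-XL-BFD) to predict dephosphorylation, with manual or
  Optuna-based hyperparameter search.
categories:
  - Machine Learning
  - Proteomics
```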

@@ -0,0 +1,339 @@
<tool id="hyperparameter_finetune" name="Hyperparameter Search for Finetuning model" version="1.0.0">
@bgruening (Owner)

Suggested change:
- <tool id="hyperparameter_finetune" name="Hyperparameter Search for Finetuning model" version="1.0.0">
+ <tool id="hyperparameter_finetune" name="Hyperparameter Search for Finetuning model" version="1.0.0" profile="23.0">

(The profile attribute declares the minimum Galaxy version the tool targets.)

@anuprulez (Contributor)

I think that while testing, the script tries to download the LLM (ProtT5-XL-UniRef50), which is larger than 10 GB. That could be the reason CI/CD throws a memory-related error:

[screenshot: error_protrans]

It might be possible to provide a remote link to the model, but I think HuggingFace does not allow loading a model from an arbitrary remote location. The models should be hosted on HuggingFace, correct? @haibkhn

Moving to a container-based tool might help, but I am not sure. Could the HuggingFace data table used with the Flux tool help here?
I am looking at #1496, where we store the names of HF models but not the models themselves.

ping @bgruening @arash77

Thanks!

@arash77 (Contributor) commented Oct 22, 2024

I don't think it is possible to fully test the tool with the models on GitHub if the model is big, unless a smaller version of the model is available.
But for production, we can use one of two methods: the one I have used is setting the HF_HOME environment variable, which controls where all Hugging Face models are stored; alternatively, you can pass cache_dir to the from_pretrained function. A sketch of both approaches follows.
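
For illustration, both approaches might look like the following. The cache paths are examples, and Rostlab/prot_t5_xl_uniref50 is assumed here as the HuggingFace id for ProtT5-XL-UniRef50; neither reflects the tool's actual configuration:

```python
# Illustrative sketch of the two caching approaches; paths are examples.
import os

# Option 1: point the whole Hugging Face cache at a persistent location.
# This must be set before transformers/huggingface_hub are imported.
os.environ["HF_HOME"] = "/data/hf_cache"

from transformers import T5EncoderModel, T5Tokenizer

# Option 2: pass cache_dir explicitly to from_pretrained so the model
# weights are downloaded to (and reused from) a known directory.
model = T5EncoderModel.from_pretrained(
    "Rostlab/prot_t5_xl_uniref50", cache_dir="/data/hf_cache/models"
)
tokenizer = T5Tokenizer.from_pretrained(
    "Rostlab/prot_t5_xl_uniref50", cache_dir="/data/hf_cache/models"
)
```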
