Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Adapt Makers to use pydantic models instead of yamls #307

Draft
wants to merge 49 commits into
base: main
Choose a base branch
from

Conversation

naik-aakash
Copy link
Collaborator

@naik-aakash naik-aakash commented Dec 23, 2024

Closes #305 , #329, #324 and #330

Changes

  • providing default config file arg seems redundant (Maybe remove)
  • check if MLIPFitMaker also needs this change
  • Add type checks before updating kwargs of config (use pydantic to achieve this automatically)
  • add test to ensure RssConfig custom model work as intended
  • add MLIP hypers pydantic models
  • use pydantic models instead of load yamls for defaults (adapt fitting jobs / flows)
  • Move database_dir arg back to Make see RssMaker does not work as expected #329 (comment)
  • Extend MLIP hyper parameter models where possible (Maybe MACE for now > also no typechecking for fine-tuning hypers for MACE at this point in models)
  • remove use defaults arg of MLIPFitMaker/ machine learning fit method
  • Check if nequip_fitting function can be cleaned up
  • Add hypeparameters arg to machine_learning_fit function
  • Think if value Error should be raised or a warning when unexpected kwarg gets passed for fits.
  • Figure out missing args in RssMaker doc-strings and placement of args (need some input from @YuanbinLiu)
  • Update fine-tuning docs for MACE
  • Update RssMaker documentation to reflect changes
  • Update AIRSS install instructions

@naik-aakash naik-aakash marked this pull request as draft December 23, 2024 06:26
@naik-aakash naik-aakash changed the title [WIP] load yamls using post_init > no need to copy to remote server [WIP] load user provided yamls using __post_init__ > avoids need to also have that file on remote server Dec 23, 2024
@naik-aakash naik-aakash added documentation Improvements or additions to documentation enhancement New feature or request labels Dec 23, 2024
@naik-aakash naik-aakash self-assigned this Dec 23, 2024
@JaGeo
Copy link
Collaborator

JaGeo commented Dec 23, 2024

Shouldn't we rather do something similar to atomate2 settings?

@JaGeo
Copy link
Collaborator

JaGeo commented Dec 23, 2024

@naik-aakash
Copy link
Collaborator Author

naik-aakash commented Dec 23, 2024

Hi @JaGeo , i think both above approach we can use to load the default yamls, which I plan to do next : similar to atomate2 settings

But we would still need postinit in the maker if we expect user to provide modified yaml path (so it gets read when the flow is created) and we don't want to copy that yamls to remote cluster. Due to nature of jobflow delayed execution

@JaGeo
Copy link
Collaborator

JaGeo commented Dec 23, 2024

@naik-aakash okay! Likely pymatgen approach is th closest to what we want for defaults

@JaGeo
Copy link
Collaborator

JaGeo commented Dec 23, 2024

Another thought: post init will not work if you initialize the flow during run time

@naik-aakash naik-aakash changed the title [WIP] load user provided yamls using __post_init__ > avoids need to also have that file on remote server Load user provided yamls using __post_init__ > avoids need to also have that file on remote server Jan 8, 2025
@naik-aakash naik-aakash marked this pull request as ready for review January 8, 2025 16:38
@naik-aakash naik-aakash changed the title Load user provided yamls using __post_init__ > avoids need to also have that file on remote server [WIP] Load user provided yamls using __post_init__ > avoids need to also have that file on remote server Jan 9, 2025
@naik-aakash naik-aakash marked this pull request as draft January 9, 2025 10:37
@QuantumChemist
Copy link
Collaborator

Hi @JaGeo , @QuantumChemist , I have started a bit to replace use of default jsons with pydantic models. (Just done for GAP and J-ACE) If you think the way it is done is fine, I will do the same for all other methods. If you have any comments or suggestions let me know, happy to address it before I make lot of changes.

We now have type checks already as part of Models with update_fields method (handy when updating defaults with user supplied kwargs)

It looks really good to me!
The only thing I wonder is if we want still keep the possibility that users can also pass the hyperparas via a file? Let's say someone has a certain default setup and wants to submit a lot of wfs with theses same settings? Putting the hyperparas in the submission script every time seems rather inconvenient in this case. What do you think?

@naik-aakash
Copy link
Collaborator Author

naik-aakash commented Jan 10, 2025

Hi @JaGeo , @QuantumChemist , I have started a bit to replace use of default jsons with pydantic models. (Just done for GAP and J-ACE) If you think the way it is done is fine, I will do the same for all other methods. If you have any comments or suggestions let me know, happy to address it before I make lot of changes.
We now have type checks already as part of Models with update_fields method (handy when updating defaults with user supplied kwargs)

It looks really good to me! The only thing I wonder is if we want still keep the possibility that users can also pass the hyperparas via a file? Let's say someone has a certain default setup and wants to submit a lot of wfs with theses same settings? Putting the hyperparas in the submission script every time seems rather inconvenient in this case. What do you think?

I think files are be problematic when using remote submissions, user then needs to copy the file, I feel it is much more convenient to rather create MlLIP_HYPERS object from a custom hyper parameter args and simply pass this instance to the workflow. I still have to test this though with a remote submission if it really works as I think it would.

Btw, Option to create this object from file is on my todo

@JaGeo
Copy link
Collaborator

JaGeo commented Jan 15, 2025

Thanks for fixing the bug well @naik-aakash 😉.

@@ -17,7 +17,7 @@ def test_gap_fit_maker(test_dir, memory_jobstore, clean_dir):
database_dir=database_dir
)

responses = run_locally(
_ = run_locally(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why don't we just do run_locally(...)?

Copy link
Collaborator Author

@naik-aakash naik-aakash Jan 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests can be made better in future by accessing the responses and assertions.

I can remove it yes. That's true

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Up to you, just wondered about it 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Better handling of user provided yaml files
3 participants