Lighten logger impact on installation and dynamic import #285

Merged
matbun merged 6 commits into main from lighten-loggers
Jan 10, 2025

Conversation

matbun
Collaborator

@matbun matbun commented Jan 10, 2025

Summary

(Hopefully) fixes this failing job, which fails because PyPI does not support repositories as direct dependencies (e.g., prov4ml@git+https://github.com/matbun/ProvML@new-main).

It also reduces the overhead of dynamically importing itwinai.loggers by moving the imports of logging frameworks (e.g., mlflow) inside the dedicated itwinai wrappers. This benefits all the code that depends on itwinai.loggers (e.g., itwinai.torch.trainer) by reducing its import time.
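
For readers unfamiliar with the pattern, here is a minimal sketch of a deferred import inside a wrapper class. It is not the actual itwinai code: the class `LazyBackendLogger` and its methods `setup` and `log` are hypothetical, and the standard-library `json` module stands in for a heavy framework such as mlflow.

```python
import importlib
from typing import Any


class LazyBackendLogger:
    """Hypothetical wrapper: the backend is imported on first use,
    not when the module defining this class is imported."""

    def __init__(self, backend_module: str) -> None:
        self.backend_module = backend_module
        self._backend: Any = None  # nothing has been imported yet

    def setup(self) -> None:
        # Equivalent to writing `import <backend>` inside this method.
        try:
            self._backend = importlib.import_module(self.backend_module)
        except ImportError as exc:
            raise ImportError(
                f"'{self.backend_module}' is needed by this logger "
                "but is not installed."
            ) from exc

    def log(self, item: dict) -> str:
        if self._backend is None:
            self.setup()
        # `dumps` exists on the `json` stand-in; a real backend differs.
        return self._backend.dumps(item)


logger = LazyBackendLogger("json")  # cheap: no backend import here
print(logger.log({"loss": 0.5}))    # backend is imported on this call
```

With this layout, importing the module that defines the wrapper costs almost nothing, and a missing optional backend only raises an error when the corresponding logger is actually used.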

Also, moved the EmptyLogger to the bottom of the itwinai.loggers module.

I also took the chance to simplify the imports in the itwinai.torch.trainer module.

From a simple experiment I ran on Vega's login node, simplifying imports reduced import time of TorchTrainer from ~26.6 to ~0.6 seconds:

import time

start = time.time()
from itwinai.torch.trainer import TorchTrainer
print(f"time spent importing itwinai: {time.time() - start:.6f}")

I don't know how relevant/general this result is, but it seems that import time has been reduced.


Related issues:

Related to #283
Closes #278

@matbun matbun added the enhancement (New feature or request) and dependencies (Pull requests that update a dependency file) labels Jan 10, 2025
@matbun matbun self-assigned this Jan 10, 2025
@matbun matbun requested review from jarlsondre and annaelisalappe and removed request for jarlsondre January 10, 2025 09:38
annaelisalappe
annaelisalappe previously approved these changes Jan 10, 2025
@annaelisalappe annaelisalappe self-requested a review January 10, 2025 12:48
@jarlsondre
Collaborator

Looks god to me :)

Some might say divine, even

@annaelisalappe annaelisalappe dismissed their stale review January 10, 2025 12:51

I thought of one more thing, sorry haha

Collaborator

Actually, I have one more general question: do you know the implications of moving imports into the classes? I'm thinking of cases such as Ray Train, where there would be multiple copies of the program running across the cores.

Collaborator

Although it probably doesn't make a difference, because the classes are also instantiated only once per worker...

Collaborator

@jarlsondre jarlsondre Jan 10, 2025

My general understanding is that the import is only valid in the scope that it's declared, meaning that if you import it in a class or function, then it is only valid in that class or function. That being said, each copy should run "the entire program" anyway, except of course the parts that are hidden behind if-statements (rank == 0 etc.), so I wouldn't think that it's a problem. Not entirely sure of this, though.
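
A small sketch of the scoping point, using the standard-library `colorsys` module as a stand-in (the function names are made up for illustration): the name bound by a function-level import is local to that function, while the loaded module itself is still registered process-wide in `sys.modules`.

```python
import sys


def hue_of_red() -> float:
    # The name `colorsys` is bound only in this function's local scope.
    import colorsys

    hue, _saturation, _value = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
    return hue


def name_is_global() -> bool:
    # The local import above never adds the name to module globals.
    return "colorsys" in globals()


print(hue_of_red())               # the function can use the module
print(name_is_global())           # the name is not visible at module level
print("colorsys" in sys.modules)  # but the module is loaded in this process
```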

Collaborator

Although it probably doesn't matter, since each class is also instantiated only once.

Collaborator Author

@matbun matbun Jan 10, 2025

Since each copy is a separate process, moving the imports should not interfere with inter-process operations. AFAIU, Python dynamically imports a package only once per session (i.e., per process): even if the same import is hidden inside multiple functions, the package is actually imported only the first time and is simply reused whenever another function imports it.
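
The caching behaviour can be checked with a short sketch. Here the standard-library `fractions` module stands in for a heavy dependency, and `load_backend` and `timed_load` are hypothetical helpers written for this example.

```python
import sys
import time


def load_backend():
    """Import the stand-in backend lazily and return the module object."""
    import fractions  # stand-in for a heavy framework

    return fractions


def timed_load():
    """Return the module and how long the import statement took."""
    start = time.perf_counter()
    module = load_backend()
    return module, time.perf_counter() - start


first, first_seconds = timed_load()    # may trigger the real import
second, second_seconds = timed_load()  # served from the sys.modules cache

print(first is second)                    # same module object both times
print(sys.modules["fractions"] is first)  # the cache holds that object
print(f"{first_seconds:.6f} s vs {second_seconds:.6f} s")
```

Each worker process has its own `sys.modules`, so the first import is paid once per process, not once per function that contains the import statement.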

jarlsondre
jarlsondre previously approved these changes Jan 10, 2025
Collaborator

@jarlsondre jarlsondre left a comment

Looks divine to me

@matbun matbun merged commit 3d029b3 into main Jan 10, 2025
10 checks passed
@matbun matbun deleted the lighten-loggers branch January 10, 2025 14:01
Development

Successfully merging this pull request may close these issues.

Make Prov4ML an optional dependency
3 participants