-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lighten logger impact on installation and dynamic import #285
Conversation
Some might say divine, even |
I thought of one more thing, sorry haha
src/itwinai/torch/distributed.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I have one more general question, do you know generally the implications of moving imports into the classes? I wonder in cases such as ray train, where there would be multiple copies of the program running on each of the cores
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although it probably doesn't make a difference, because the classes are instantiated also only once per worker...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My general understanding is that the import is only valid in the scope that it's declared, meaning that if you import it in a class or function, then it is only valid in that class or function. That being said, each copy should run "the entire program" anyway, except of course the parts that are hidden behind if-statements (rank == 0
etc.), so I wouldn't think that it's a problem. Not entirely sure of this, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although it probably doesn't matter as the classes are each only instantiated once as well actually
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering that each copy is a separate process, it should not interfere with inter-process operations. AFAIU, Python will try to dynamically import packages only once per session, so even if you are hiding your import inside multiple functions, it will actually get imported by Python only the first time and just reused the next times other functions import it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks divine to me
Summary
(Hopefully) solves this failing job, since PyPI does not support repositories as direct dependencies (e.g.,
prov4ml@git+https://github.com/matbun/ProvML@new-main
).Moreover, reduced the overhead of dynamic import of
itwinai.loggers
by moving the import of logging frameworks (e.g., mlflow) inside the dedicated itwinai wrapper. This also benefits all the code that depends onitwinai.loggers
(e.g.,itwinai.torch.trainer
), reducing import time.Also, moved the
EmptyLogger
at the bottom ofitwinai.loggers
module.I also took the chance to simplify the imports in the
itwinai.torch.trainer
module.From a simple experiment I ran on Vega's login node, simplifying imports reduced import time of
TorchTrainer
from ~26.6 to ~0.6 seconds:I don't know how relevant/general this result is, but it seems that import time has been reduced.
Related issue :
Related to #283
Closes #278