
Loggers refactoring #283

Open
2 of 6 tasks
matbun opened this issue Jan 9, 2025 · 0 comments
matbun (Collaborator) commented Jan 9, 2025

Improvements to the loggers

  • Extract Logger and LoggerMixin from the loggers file, so that importing the abstract base class or the mixin does not load the dependencies of every logging framework. Alternatively, move the imports of logging frameworks (e.g., mlflow) inside itwinai's wrapper classes.
  • Make Prov4ML an optional dependency #278 is related to the above. Prov4ML (now yProvML) has many dependencies, which slow down installation and dynamic import at runtime.
  • Investigate integrating Ray's reporting mechanism into the trainer's self.log(...) call. Example: when running inside a Ray trial, also report the values logged with kind="metric" to Ray. There may be better ways to do this...
  • Simplify log_freq by splitting it into log_freq_epoch (int) and log_freq_batch (int). In both cases, freq=0 means no logging.
  • Checkpoint and resume the full training state (epoch number, model weights, optimizer, LR scheduler). Some of this is already in place, but it is unclear whether we are currently able to resume training from a checkpoint.
  • Refine common structure (e.g., save_dir)
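The lazy-import idea in the first bullet could look roughly like the sketch below. All names here are hypothetical (the real itwinai classes may differ), and the stdlib json module stands in for a heavy framework such as mlflow so the sketch stays runnable without extra dependencies:

```python
import importlib
from abc import ABC, abstractmethod


class Logger(ABC):
    """Abstract base class: importing it pulls in no framework deps."""

    @abstractmethod
    def log(self, item, identifier: str, kind: str = "metric") -> None:
        ...


class MLFlowLogger(Logger):
    """Wrapper that defers the framework import until first use.

    NOTE: 'json' is a stand-in for the real framework (e.g. mlflow),
    so that this sketch runs without any optional dependency installed.
    """

    def __init__(self):
        self._backend = None

    def _get_backend(self):
        if self._backend is None:
            # The import happens here, not at module import time, so
            # `from loggers import Logger` stays cheap.
            self._backend = importlib.import_module("json")
        return self._backend

    def log(self, item, identifier: str, kind: str = "metric") -> None:
        backend = self._get_backend()
        print(backend.dumps({identifier: item, "kind": kind}))
```

With this layout, importing Logger alone never triggers the framework import; only instantiating and using the concrete wrapper does.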
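For the Ray bullet, one possible shape is a guarded forwarder that the trainer's log hook calls for kind="metric". This is only a sketch: it assumes Ray's `ray.train.report` API, `trainer_log` and `maybe_report_to_ray` are hypothetical names, and the broad exception guard makes the call a no-op when Ray is absent or no trial session is active:

```python
import importlib.util


def maybe_report_to_ray(metrics: dict) -> bool:
    """Forward metrics to Ray when running inside a Ray trial.

    Returns True only if the report succeeded; stays a silent no-op
    when Ray is not installed or we are outside a Ray session.
    """
    if importlib.util.find_spec("ray") is None:
        return False
    try:
        from ray import train  # deferred, optional import
        train.report(metrics)  # assumed Ray Train/Tune reporting API
        return True
    except Exception:
        # Not inside a Ray session (or API mismatch): ignore.
        return False


def trainer_log(item, identifier: str, kind: str = "metric") -> None:
    """Hypothetical trainer hook mirroring self.log(...): metrics are
    additionally reported to Ray when a trial is active."""
    if kind == "metric":
        maybe_report_to_ray({identifier: item})
```

A callback- or mixin-based design might be cleaner than baking this into self.log, as the bullet itself hints ("There may be better ways").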
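The log_freq split could reduce to one small predicate shared by the epoch and batch loops; `should_log` and `epochs_to_log` below are hypothetical helper names, illustrating the proposed freq=0-disables-logging convention:

```python
def should_log(step: int, freq: int) -> bool:
    """Return True when logging is due at this step (epoch or batch).

    freq == 0 disables logging entirely, per the proposal; otherwise
    log every `freq` steps.
    """
    if freq <= 0:
        return False
    return step % freq == 0


def epochs_to_log(n_epochs: int, log_freq_epoch: int) -> list[int]:
    """Illustration: which epochs would emit logs for a given freq."""
    return [e for e in range(n_epochs) if should_log(e, log_freq_epoch)]
```

The same predicate serves both log_freq_epoch and log_freq_batch, which keeps the two new parameters consistent with each other.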
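For the checkpoint/resume bullet, the full training state could be bundled into a single file. The sketch below uses plain dicts and pickle so it runs anywhere; a real PyTorch trainer would instead store `model.state_dict()`, `optimizer.state_dict()`, and `scheduler.state_dict()` via `torch.save`. Function names are hypothetical:

```python
import pickle
from pathlib import Path


def save_checkpoint(path, epoch, model_state, optim_state, sched_state):
    """Persist the full training state in one checkpoint file."""
    state = {
        "epoch": epoch,
        "model": model_state,
        "optimizer": optim_state,
        "lr_scheduler": sched_state,
    }
    Path(path).write_bytes(pickle.dumps(state))


def load_checkpoint(path):
    """Restore the state dict. The caller loads each entry back into
    the model/optimizer/scheduler and resumes at state['epoch'] + 1."""
    return pickle.loads(Path(path).read_bytes())
```

Storing the epoch number alongside the weights is what makes true resumption (rather than mere weight reloading) possible.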

@annaelisalappe please add other things you had in mind
