Add compile_fn parameter for Trainer #20269
base: master
Both benchmark checks failed due to timeouts |
bump |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #20269 +/- ##
=========================================
- Coverage 89% 81% -8%
=========================================
Files 267 264 -3
Lines 23084 23032 -52
=========================================
- Hits 20585 18620 -1965
- Misses 2499 4412 +1913 |
4648ea2 to 925c376
Thank you @mieshkiwrk. The way we recommend users use torch.compile with Lightning is to call torch.compile on the model and then pass the compiled model to the Trainer:

import torch
import lightning as L

model = MyLightningModule()
model = torch.compile(model)
trainer = L.Trainer()
trainer.fit(model)

This PR would add an additional entry point, and there's probably a simpler way to go about it (for users). We should replicate the approach where we capture the arguments passed to torch.compile so we can re-apply them when using strategies, just like we do in Fabric, but in the Trainer. Would you like to take a stab at it? |
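The capture-and-re-apply idea suggested above can be sketched in plain Python. This is only an illustration under stated assumptions, not Lightning's or Fabric's actual implementation: `capture_compile_kwargs`, `fake_compile`, `FakeCompiled`, `FakeStrategyWrapper`, and `apply_strategy_then_recompile` are all hypothetical names, and `fake_compile` merely stands in for torch.compile so the example runs without PyTorch.

```python
import functools


def capture_compile_kwargs(compile_fn):
    """Wrap a compile function so the kwargs it received are remembered.

    Hypothetical helper: it records the original model and the keyword
    arguments on the object returned by ``compile_fn``.
    """

    @functools.wraps(compile_fn)
    def wrapper(model, **kwargs):
        compiled = compile_fn(model, **kwargs)
        compiled._orig_model = model             # uncompiled model, kept for later re-wrapping
        compiled._compile_kwargs = dict(kwargs)  # copy, so later mutation by the caller has no effect
        return compiled

    return wrapper


class FakeCompiled:
    """Stand-in for the object a real compiler would return (assumption)."""

    def __init__(self, model, **kwargs):
        self.model = model
        self.options = kwargs


class FakeStrategyWrapper:
    """Stand-in for a strategy wrapper such as DDP (assumption)."""

    def __init__(self, module):
        self.module = module


def fake_compile(model, **kwargs):
    # Stand-in for torch.compile; it only records what it was given.
    return FakeCompiled(model, **kwargs)


def apply_strategy_then_recompile(compiled, wrap_fn, compile_fn):
    """Unwrap the compiled model, apply the strategy, then compile again
    with the captured kwargs so the wrapper is compiled as well."""
    original = compiled._orig_model
    kwargs = compiled._compile_kwargs
    return compile_fn(wrap_fn(original), **kwargs)


compile_with_capture = capture_compile_kwargs(fake_compile)

model = object()  # placeholder for a LightningModule
compiled = compile_with_capture(model, mode="reduce-overhead")
recompiled = apply_strategy_then_recompile(
    compiled, FakeStrategyWrapper, compile_with_capture
)
```

With this shape the user keeps calling the compile function on the model as before, while the framework can decide later to compile the strategy-wrapped module with the same options.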
Let me try; I think I see what needs to be done |
@lantiga, would something like this be fine? I wanted to make sure about re-using |
hey, thanks for updating the PR |
Thanks for the update! Can you add tests, ideally testing that DDP and FSDP behave correctly when applied? Later on (not for this PR), we should also look into ModelParallelStrategy. |
Sure, it's WIP; it will just take a while due to limited free time |
@lantiga Added 2 tests for DDP/FSDP. CI is failing due to: Locally, the added tests passed for me on CPU; I don't have an NVIDIA GPU for validation |
Add support for a compile_fn parameter in the Trainer, for example to compile the model after the strategy has been applied.
Example usage: compilation needs to happen after the DDP strategy is applied so that the pre/post-forward logic is compiled as well.
Fixes #20242
📚 Documentation preview 📚: https://pytorch-lightning--20269.org.readthedocs.build/en/20269/