Does this project support DeepSpeed pipeline? #1248
Replies: 3 comments 4 replies
-
Please try to complete your thoughts before submitting issue reports or discussions. This doesn't contain the information others would need to help you. I mentioned before that you need to, at the least, submit a config file and the steps that led to the problem. Read your own discussion post back; how do you expect anyone to help?
-
When I train with the FSDP config and press Ctrl+C to force quit:

^CW1228 21:03:38.744000 2006575 torch/distributed/elastic/agent/server/api.py:704] Received Signals.SIGINT death signal, shutting down workers
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
-
When I create a JSON config file for DeepSpeed, it also runs out of memory, and the model does not seem to be partitioned.
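
For context on the partitioning symptom: in DeepSpeed, model parameters are only sharded across GPUs at ZeRO stage 3. Stages 1 and 2 shard optimizer state and gradients but keep a full copy of the weights on every GPU, so a config with a lower stage would look like "no partitioning". Note that ZeRO is sharded data parallelism, which is a different feature from pipeline parallelism. Below is a minimal sketch of a stage 3 config written out from Python. The helper names, the file name `ds_config.json`, the batch-size values, and the CPU offload choices are assumptions for illustration; whether and how this project accepts such a file is not stated in this thread.

```python
import json


def build_zero3_config(micro_batch_size=1, grad_accum_steps=1):
    """Return a DeepSpeed config dict that enables ZeRO stage 3.

    Hypothetical helper for illustration; the values are placeholders.
    """
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
        "bf16": {"enabled": True},
        "zero_optimization": {
            # Stage 3 shards parameters, gradients, and optimizer state.
            "stage": 3,
            # Optional: move optimizer state and parameters to CPU RAM
            # to reduce GPU memory further (assumed choice, slower).
            "offload_optimizer": {"device": "cpu", "pin_memory": True},
            "offload_param": {"device": "cpu", "pin_memory": True},
            "overlap_comm": True,
        },
    }


def write_config(path="ds_config.json", **kwargs):
    """Write the config to a JSON file and return the dict that was written."""
    config = build_zero3_config(**kwargs)
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
    return config
```

Calling `write_config()` produces a file that can be attached to a report together with the exact launch command, which is the information requested earlier in this thread.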