PicklingError #740
Comments
Same issue here, got the error
Actually, I never used the Docker image. I am copying blocks of code from the notebooks directly into Colab. I also have a problem loading models across devices (training on CPU and loading on GPU, or vice versa) using cloudpickle. On a CPU machine, for example, it reports that cudf cannot be found.
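The "no cudf found" symptom is consistent with how pickle (and cloudpickle, for importable modules) stores objects: by reference to the module that defines their class, so that module must be importable wherever the file is loaded. The sketch below illustrates this with the standard library only. `fake_gpu_lib` and `GpuFrame` are hypothetical stand-ins for a GPU-only dependency such as cudf, not part of Transformers4Rec, and whether this is the exact cause in the case above is an assumption.

```python
import pickle
import sys
import types

# Hypothetical module standing in for a GPU-only dependency such as cudf.
gpu_lib = types.ModuleType("fake_gpu_lib")


class GpuFrame:
    """Hypothetical GPU-backed container."""

    def __init__(self, values):
        self.values = values


# Make the class look as if it were defined inside the GPU-only module.
GpuFrame.__module__ = "fake_gpu_lib"
GpuFrame.__qualname__ = "GpuFrame"
gpu_lib.GpuFrame = GpuFrame
sys.modules["fake_gpu_lib"] = gpu_lib

# "GPU machine": pickling works because the module is importable here.
payload = pickle.dumps(GpuFrame([1, 2, 3]))

# "CPU machine": the GPU-only module is not installed.
del sys.modules["fake_gpu_lib"]

try:
    pickle.loads(payload)
    load_error = None
except ModuleNotFoundError as exc:
    # The pickle only stored a reference to fake_gpu_lib.GpuFrame.
    load_error = exc

# Saving plain built-in data instead keeps the file loadable anywhere.
portable = pickle.loads(pickle.dumps({"values": [1, 2, 3]}))
```

The practical consequence is that a pickled object can only be loaded in an environment that has the same libraries installed, which is why saving device-independent data (for example, weights only) tends to be more portable.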
setting
I got the same error as well. @suyee97 Could you please share where to load the model class, and how? And if I use the code from notebook 02 to save the model:
Same error observed.
@ghisloine We designed the examples to run on GPU, so it is normal you are getting
otherwise you cannot run the examples on GPU. Are you able to use the examples on CPU currently? If you want to run on CPU, you don't need to install https://github.com/NVIDIA-Merlin/Transformers4Rec/blob/main/tests/unit/torch/test_trainer.py#L74-L76
After your first suggestion, I am trying to use
Thank you for the question @peterkim95 !
To load a general checkpoint, PyTorch provides a built-in function `load_state_dict` that you can call as follows:
In Transformers4Rec, we additionally simplified model saving in the transformers4rec Trainer class (here) with a built-in method `_save_model_and_checkpoint`, which saves not only the checkpoints but the model class as well. By doing so, you don't have to re-define the model class.
An example of usage would be:
Let us know if those examples help you with your use-case :)
Originally posted by @sararb in #348 (comment)
When I try this code, I get this error: PicklingError: Cannot pickle a prepared model with automatic mixed precision, please unwrap the model with `Accelerator.unwrap_model(model)` before pickling it.
If I call recsys_trainer.accelerator.unwrap_model and save again, the save succeeds, but then I get: model.forward() missing 1 required positional argument: 'inputs'
My main aim is to save the model and use it, for example via recsys_trainer.predict(), on another platform without Triton server or a GPU.
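A minimal sketch of the weights-only route, using plain PyTorch rather than the Transformers4Rec Trainer: save the `state_dict` instead of pickling the whole (possibly wrapped) model object, then rebuild the model on the target machine and load the weights with `map_location="cpu"`. `TinyModel` is a hypothetical stand-in for the real model class, which would have to be re-created from its original definition; with Accelerate, the model would first be unwrapped with `unwrap_model` as the error message suggests. This is one possible approach under those assumptions, not a confirmed fix for this issue.

```python
import io

import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for the real model class."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, inputs):
        return self.linear(inputs)


def save_weights(model, target):
    # A state_dict contains only tensors, so no wrapped forward
    # method or mixed-precision wrapper ends up in the file.
    torch.save(model.state_dict(), target)


def load_weights_on_cpu(source):
    # Re-create the architecture, then load the tensors onto the CPU
    # regardless of the device they were saved from.
    model = TinyModel()
    state = torch.load(source, map_location="cpu")
    model.load_state_dict(state)
    model.eval()
    return model


# In-memory buffer used here in place of a file path.
buffer = io.BytesIO()
trained = TinyModel()
save_weights(trained, buffer)
buffer.seek(0)
restored = load_weights_on_cpu(buffer)

with torch.no_grad():
    prediction = restored(torch.zeros(1, 4))
```

The trade-off is that the model class must be defined again on the loading side, but the saved file no longer depends on the training-time wrappers or on a GPU being present.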