
Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! #1

Open
landiaokafeiyan opened this issue Feb 10, 2023 · 7 comments


@landiaokafeiyan

Hi there,

Thanks for your excellent work. I hit this error when I train and test your code. Do you have any idea what is wrong? As far as I can tell, the data and the model are both on CUDA.

Thanks in advance!

@afpapqy

afpapqy commented Feb 10, 2023

I solved that by wrapping the model in torch.nn.parallel.DistributedDataParallel.
However, I then hit a CUDA out-of-memory error:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 11.91 GiB total capacity; 10.99 GiB already allocated; 3.88 MiB free; 11.07 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
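For reference, the allocator hint mentioned in that message is set via an environment variable that must be in place before PyTorch initializes CUDA; the value below (128 MiB) is just an example, not a recommendation from this repo:

```python
import os

# Must be set before the first `import torch` so the CUDA caching
# allocator picks it up; 128 is an example value, tune for your workload.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```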

@landiaokafeiyan
Author

I think you have to reduce the batch size. Even though I have 2x 2080 Ti GPUs, I set the batch size to 2.

@afpapqy

afpapqy commented Feb 13, 2023

I think you have to reduce the batch size. Even though I have 2x 2080 Ti GPUs, I set the batch size to 2.

My GPU is a Titan Xp with 12GB of memory, and the image size is 576x576, but I still get an "out of memory" error even when I set the batch size to 1.
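If batch size 1 still does not fit, the usual options are a smaller crop size, mixed precision, or gradient checkpointing. When a single sample fits but you want a larger effective batch, gradient accumulation helps; a framework-free sketch of the accumulation arithmetic (the 1/accum_steps scaling is the part that is easy to get wrong):

```python
def accumulate(micro_grads, accum_steps):
    """Sum per-micro-batch gradients, each scaled by 1/accum_steps,
    so the result equals the gradient of one batch accum_steps larger."""
    total = 0.0
    for g in micro_grads:
        total += g / accum_steps  # scale before summing
    return total

# Four micro-batches behave like one batch of 4x the size:
print(accumulate([1.0, 2.0, 3.0, 4.0], 4))  # 2.5
```

In a real training loop this means dividing each micro-batch loss by accum_steps, calling backward on it, and only stepping the optimizer (and zeroing gradients) every accum_steps iterations.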

@anhquyetnguyen

I am facing the same issue. Can you share the solution? @afpapqy @landiaokafeiyan

@PigBroA

PigBroA commented Feb 28, 2023

I modified it a little and now it runs without the device error.

in models/swin_transformer_v2.py line 294
original: logit_scale = torch.clamp(self.logit_scale, max=torch.log(torch.tensor(1. / 0.01))).exp()
modified: logit_scale = torch.clamp(self.logit_scale, max=torch.log(torch.tensor(1. / 0.01).to('cuda:0'))).exp()

This is just an example; you can use another variable to move the tensor to the right device.

@landiaokafeiyan
Author

Hi @afpapqy @PigBroA
when I test a 3000x4000 image, I have to split it into several patches, which decreases the performance. Do you have any good ideas to solve this problem?

Thanks in advance.
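A common way to reduce the patching penalty is overlapping sliding-window inference: tile the image with some overlap and average the predictions where tiles overlap, so boundary artifacts are blended away instead of showing up as hard seams. A minimal sketch of the tiling step (the tile/stride values are examples, not from this repo):

```python
def tile_starts(size, tile, stride):
    """Start offsets for overlapping tiles that fully cover [0, size)."""
    starts = []
    pos = 0
    while pos + tile < size:
        starts.append(pos)
        pos += stride
    starts.append(max(size - tile, 0))  # last tile sits flush with the edge
    return starts

# 4000-pixel side, 576-pixel tiles, 64-pixel overlap (stride 512):
print(tile_starts(4000, 576, 512))
```

At inference time, run the model on each tile, accumulate its logits into a full-size output buffer along with a count (or weight) map, and divide at the end so overlapping regions are averaged.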

@kmbmjn

kmbmjn commented May 8, 2023

I modified it a little and now it runs without the device error.

in models/swin_transformer_v2.py line 294
original: logit_scale = torch.clamp(self.logit_scale, max=torch.log(torch.tensor(1. / 0.01))).exp()
modified: logit_scale = torch.clamp(self.logit_scale, max=torch.log(torch.tensor(1. / 0.01).to('cuda:0'))).exp()

This is just an example; you can use another variable to move the tensor to the right device.

Thank you for this solution!
In a multi-GPU environment I hit another device mismatch error between "cuda:0" and "cuda:1", so I used the following modification instead:

original: logit_scale = torch.clamp(self.logit_scale, max=torch.log(torch.tensor(1. / 0.01))).exp()
modified: logit_scale = torch.clamp(self.logit_scale, max=torch.log(torch.tensor(1. / 0.01).to(self.logit_scale.device))).exp()
