Hi, I tried to run train.py on the FFHQ dataset in multi-GPU mode, but I hit the following RuntimeError:
```
    training_process(rank, world_size, opt, device)
  File "/home/xintian/workspace/GRAM-main/training_loop.py", line 217, in training_process
    d_loss = process.train_D(real_imgs, real_poses, generator_ddp, discriminator_ddp, optimizer_D, scaler, config, device)
  File "/home/xintian/workspace/GRAM-main/processes/processes.py", line 38, in train_D
    g_imgs, g_pos = generator_ddp(subset_z, **config['camera'])
  File "/home/xintian/anaconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xintian/anaconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/xintian/anaconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xintian/workspace/GRAM-main/generators/generators.py", line 112, in forward
    img, _ = self.renderer.render(self._intersections, self._volume(z, truncation_psi), img_size, camera_origin, camera_pos, fov, ray_start, ray_end, z.device)
  File "/home/xintian/workspace/GRAM-main/generators/renderers/manifold_renderer.py", line 194, in render
    coarse_output = volume(transformed_points, transformed_ray_directions_expanded).reshape(batchsize, img_size * img_size, self.num_manifolds, 4)
  File "/home/xintian/workspace/GRAM-main/generators/generators.py", line 76, in <lambda>
    return lambda points, ray_directions: self.representation.get_radiance(z, points, ray_directions, truncation_psi)
  File "/home/xintian/workspace/GRAM-main/generators/representations/gram.py", line 317, in get_radiance
    return self.rf_network(x, z, ray_directions, truncation_psi)
  File "/home/xintian/anaconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xintian/workspace/GRAM-main/generators/representations/gram.py", line 260, in forward
    frequencies_2, phase_shifts_2 = self.mapping_network(z2)
  File "/home/xintian/anaconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xintian/workspace/GRAM-main/generators/representations/gram.py", line 93, in forward
    frequencies_offsets = self.network(z)
  File "/home/xintian/anaconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xintian/anaconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/xintian/anaconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/xintian/anaconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/modules/linear.py", line 94, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/xintian/anaconda3/envs/torch18/lib/python3.6/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: Expected tensor for 'out' to have the same device as tensor for argument #2 'mat1'; but device 1 does not equal 0 (while checking arguments for addmm)
```
How can I fix this? For what it's worth, train.py runs fine in single-GPU mode.
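From the error, it looks like some tensor ends up on cuda:0 while the DDP replica on rank 1 holds its weights on cuda:1. For reference, a minimal per-rank setup that keeps the module and its inputs on the same device would look something like the sketch below; `build_generator`, `opt.batch_size`, `opt.z_dim`, and `opt.camera` are placeholders for illustration, not the actual GRAM code:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def training_process(rank, world_size, opt):
    # Assumes MASTER_ADDR / MASTER_PORT are already set in the environment.
    dist.init_process_group("nccl", rank=rank, world_size=world_size)

    # Pin this process to its own GPU so bare .cuda() calls and
    # device='cuda' tensors land on the right card, not on cuda:0.
    torch.cuda.set_device(rank)
    device = torch.device(f"cuda:{rank}")

    generator = build_generator(opt).to(device)  # placeholder builder
    generator_ddp = DDP(generator, device_ids=[rank], output_device=rank)

    # Inputs must live on the same device as the wrapped module.
    subset_z = torch.randn(opt.batch_size, opt.z_dim, device=device)
    g_imgs, g_pos = generator_ddp(subset_z, **opt.camera)
```

If the training script already does all of this, the next place I would check is whether any buffer inside the generator (e.g. the cached `self._intersections` visible in the traceback) is allocated on a hard-coded device instead of `z.device`.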