Using the CheckerboardLatentCodec with a non-identity context_prediction module results in a runtime error during the forward pass. I believe this only occurs with torch versions older than 2.0.
To Reproduce
Steps to reproduce the behavior:
Instantiate a CheckerboardLatentCodec.
Create any tensor and pass it to the forward() method of the latent codec.
Observe the bug.
Minimal working example:
import torch

from compressai.latent_codecs import CheckerboardLatentCodec, GaussianConditionalLatentCodec
from compressai.layers.layers import CheckerboardMaskedConv2d

lc = CheckerboardLatentCodec(
    latent_codec={
        "y": GaussianConditionalLatentCodec(),
    },
    context_prediction=CheckerboardMaskedConv2d(4, 8, kernel_size=5, stride=1, padding=2),
)

t = torch.randn((1, 4, 64, 64))    # arbitrary shape; channel count must match the context_prediction layer
ctx = torch.randn((1, 8, 16, 16))  # arbitrary shape
output = lc(t, ctx)
This code results in the error:
File ~/conda/miniconda3-ubuntu22/envs/sdv2-new/lib/python3.8/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
1126 # If we don't have any hooks, we want to skip the rest of the logic in
1127 # this function, and just call forward.
1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1129 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130 return forward_call(*input, **kwargs)
1131 # Do not call functions when jit is used
1132 full_backward_hooks, non_full_backward_hooks = [], []
File ~/software/compressai/compressai/latent_codecs/checkerboard.py:149, in CheckerboardLatentCodec.forward(self, y, side_params)
147 return self._forward_onepass(y, side_params)
148 if self.forward_method == "twopass":
--> 149 return self._forward_twopass(y, side_params)
150 if self.forward_method == "twopass_faster":
151 return self._forward_twopass_faster(y, side_params)
File ~/software/compressai/compressai/latent_codecs/checkerboard.py:192, in CheckerboardLatentCodec._forward_twopass(self, y, side_params)
187 B, C, H, W = y.shape
189 params = y.new_zeros((B, C * 2, H, W))
191 y_hat_anchors = self._forward_twopass_step(
--> 192 y, side_params, params, self._y_ctx_zero(y), "anchor"
193 )
195 y_hat_non_anchors = self._forward_twopass_step(
196 y, side_params, params, self.context_prediction(y_hat_anchors), "non_anchor"
197 )
199 y_hat = y_hat_anchors + y_hat_non_anchors
File ~/conda/miniconda3-ubuntu22/envs/sdv2-new/lib/python3.8/site-packages/torch/autograd/grad_mode.py:27, in _DecoratorContextManager.__call__.<locals>.decorate_context(*args, **kwargs)
24 @functools.wraps(func)
25 def decorate_context(*args, **kwargs):
26 with self.clone():
---> 27 return func(*args, **kwargs)
File ~/software/compressai/compressai/latent_codecs/checkerboard.py:272, in CheckerboardLatentCodec._y_ctx_zero(self, y)
269 @torch.no_grad()
270 def _y_ctx_zero(self, y: Tensor) -> Tensor:
271 """Create a zero tensor with correct shape for y_ctx."""
--> 272 y_ctx_meta = self.context_prediction(y.to("meta"))
273 return y.new_zeros(y_ctx_meta.shape)
File ~/conda/miniconda3-ubuntu22/envs/sdv2-new/lib/python3.8/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
1126 # If we don't have any hooks, we want to skip the rest of the logic in
1127 # this function, and just call forward.
1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1129 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130 return forward_call(*input, **kwargs)
1131 # Do not call functions when jit is used
1132 full_backward_hooks, non_full_backward_hooks = [], []
File ~/software/compressai/compressai/layers/layers.py:144, in MaskedConv2d.forward(self, x)
141 def forward(self, x: Tensor) -> Tensor:
142 # TODO(begaintj): weight assigment is not supported by torchscript
143 self.weight.data = self.weight.data * self.mask
--> 144 return super().forward(x)
File ~/conda/miniconda3-ubuntu22/envs/sdv2-new/lib/python3.8/site-packages/torch/nn/modules/conv.py:457, in Conv2d.forward(self, input)
456 def forward(self, input: Tensor) -> Tensor:
--> 457 return self._conv_forward(input, self.weight, self.bias)
File ~/conda/miniconda3-ubuntu22/envs/sdv2-new/lib/python3.8/site-packages/torch/nn/modules/conv.py:453, in Conv2d._conv_forward(self, input, weight, bias)
449 if self.padding_mode != 'zeros':
450 return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
451 weight, bias, self.stride,
452 _pair(0), self.dilation, self.groups)
--> 453 return F.conv2d(input, weight, bias, self.stride,
454 self.padding, self.dilation, self.groups)
NotImplementedError: convolution_overrideable not implemented. You are likely triggering this with tensor backend other than CPU/CUDA/MKLDNN, if this is intended, please use TORCH_LIBRARY_IMPL to override this function
Expected behavior
The code should not throw an error.
Environment
Output from python3 -m torch.utils.collect_env:
- PyTorch / CompressAI Version: 1.12.1 / 1.2.6
- OS: Linux, Ubuntu 22.04.3
- How you installed PyTorch / CompressAI: source
- Build command you used (if compiling from source):
git clone https://github.com/InterDigitalInc/CompressAI compressai
cd compressai
pip install -U pip && pip install -e .
- Python version: 3.8.18
- CUDA/cuDNN version: 11.7
- GPU models and configuration: 1x NVIDIA GeForce RTX 3090
- Any other relevant information: N/A
Additional context
I am fairly certain this happens because older PyTorch versions do not support operations on tensors on the "meta" device. I think full support was introduced with PyTorch 2.0, but I couldn't find anything definitive from a quick search.
I traced this back to commit eddb1bc, which uses meta-device tensors to compute the expected size of the checkerboard context tensor. Replacing these lines with the previous version resolved the issue for me.
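For reference, the context tensor's spatial size can also be computed analytically, without tracing through any device at all. This is a hypothetical helper (not CompressAI's actual code) that applies the standard Conv2d output-size formula from the PyTorch docs:

```python
def conv2d_out_hw(h, w, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial output size of a square-kernel Conv2d.

    Standard formula from the torch.nn.Conv2d docs:
    out = floor((in + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1
    """
    def out(n):
        return (n + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1
    return out(h), out(w)

# The 5x5, stride-1, padding-2 masked conv from the example preserves spatial size:
print(conv2d_out_hw(64, 64, kernel_size=5, stride=1, padding=2))  # → (64, 64)
```

Since the channel count is just the layer's out_channels, this is enough to build the zero tensor without running the convolution at all.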
To be fair, I don't actually know which is the earliest torch version that supports meta-device tensors, as I couldn't find any solid information.
Although I think the simpler fix is probably good enough. On my machine (14900K CPU, 3090 GPU), with an (unreasonably large) context size of (16, 192, 512, 512), that line takes 0.06 ms to execute on GPU. It does take about 4 seconds on CPU, but with a more reasonable context size of (16, 192, 32, 32) it takes roughly 80 ms on CPU.
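The two approaches could also be combined in a version-tolerant way: try the cheap meta-device shape inference first, and fall back to running the layer on real zeros (the pre-eddb1bc behaviour) where meta tensors are unsupported. A minimal sketch, with a hypothetical helper name; this is not CompressAI's actual code:

```python
import torch


def y_ctx_zero(context_prediction: torch.nn.Module, y: torch.Tensor) -> torch.Tensor:
    """Zero tensor with the shape that context_prediction(y) would produce.

    Hypothetical backward-compatible variant of CheckerboardLatentCodec._y_ctx_zero.
    """
    with torch.no_grad():
        try:
            # Shape-only "execution" on the meta device: cheap, but raises
            # NotImplementedError (or a device-mismatch RuntimeError) on
            # PyTorch versions that lack meta-device kernels for convolution.
            shape = context_prediction(y.to("meta")).shape
        except (NotImplementedError, RuntimeError):
            # Fallback: actually run the layer on a real zero tensor.
            shape = context_prediction(torch.zeros_like(y)).shape
    return y.new_zeros(shape)
```

Used with a plain 5x5, stride-1, padding-2 convolution like the one in the MWE, this returns zeros of shape (1, 8, H, W) for a (1, 4, H, W) input on either code path.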