
About the decoding speed of the ELIC model #312

Open

YunuoChen opened this issue Oct 21, 2024 · 2 comments

Comments

@YunuoChen

Thank you so much for your reimplementation of ELIC. However, ELIC's original paper reports a decoding latency well under 100 ms, whereas when I test the ELIC model from CompressAI, the latency is about 130 ms. May I ask why there is a speed gap?
Looking forward to your reply.

@YodaEmbedding
Contributor

YodaEmbedding commented Oct 21, 2024

The measured decoding latency may depend on the CPU, GPU, or other conditions.

Their setup is mentioned here:

Supplementary Material

2. Detailed experimental settings

We implement, train, and evaluate all learning-based models on PyTorch 1.8.1. We use an NVIDIA TITAN Xp to test both RD performance and inference speed. To test the speeds, we reproduce previously proposed models and evaluate them under the same running conditions for a fair comparison. Since most of the models adopt reparameterization techniques, we fix the reparameterized weights before testing the speed. We follow a common protocol to test the latency with GPU synchronization. When testing each model, we drop the latency results of the first 6 images (we do not drop them when evaluating RD performance) to get rid of the influence of device warm-up, and average the running time of the remaining images to get precise speed results.

We do not enable the deterministic inference mode (e.g. torch.backends.cudnn.deterministic) when testing the model speeds, for two reasons. First, we tend to believe that the determinism issue can be solved with engineering effort, such as using integer-only inference; thus, deterministic floating-point inference is unnecessary. Second, the deterministic mode severely slows down specific operators, such as the transposed convolutions adopted by ELIC and earlier baseline models (Ballé et al. and Minnen et al.), making the comparison somewhat unfair.

arXiv:2203.10886v2 [cs.CV] 29 Mar 2022
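For concreteness, a minimal sketch of that timing protocol (synchronize the GPU around each decode, skip the first few images as warm-up, average the rest) could look like the following. The decompress call, bitstream list, and warm-up count here are placeholders for illustration, not the paper's actual harness:

    import time
    import torch

    def measure_decode_latency(net, bitstreams, warmup=6):
        """Average per-image decode time in seconds, skipping the first `warmup` images."""
        times = []
        with torch.no_grad():
            for i, (strings, shape) in enumerate(bitstreams):
                torch.cuda.synchronize()          # finish any pending GPU work
                start = time.perf_counter()
                net.decompress(strings, shape)    # CompressAI-style decompress API
                torch.cuda.synchronize()          # wait for the decoding kernels
                if i >= warmup:                   # drop warm-up iterations
                    times.append(time.perf_counter() - start)
        return sum(times) / len(times)

    # The paper leaves torch.backends.cudnn.deterministic at its default (False),
    # since enabling it slows down transposed convolutions considerably.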

Since absolute latencies vary across machines, relative measurements may be more meaningful (a rough scaling example follows the list below). From their chart, it looks like:

  • ELIC is 20% faster than Minnen2020
  • ELIC is 300% slower than Balle2018
  • ELIC-small is 55% faster than Minnen2020
  • ELIC-small is 120% slower than Balle2018
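As a back-of-the-envelope check, you could time one of those baselines on your own GPU and scale by the ratios above; the value below is a placeholder, not a measurement:

    # Hypothetical example: "ELIC is 20% faster than Minnen2020" suggests that
    # ELIC's decode time is roughly the Minnen2020 decode time divided by 1.2.
    t_minnen = 0.150  # placeholder: Minnen2020 decode time measured on your GPU (seconds)
    t_elic_expected = t_minnen / 1.2
    print(f"expected ELIC decode time: ~{t_elic_expected * 1000:.0f} ms")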

@YodaEmbedding
Contributor

YodaEmbedding commented Nov 15, 2024

Note

The above comment still applies.

I took another look at the paper, and it says:

The architecture of $g_{\text{ch}}$ network is frankly sketched from Minnen et al. (2020). [...] The parameter aggregation network linearly reduces the dimensions to $2 M^{k}$ .

Our current implementation uses a sequential_channel_ramp for $g_{\text{ch}}$. But the paper description seems to suggest:

        # In [He2022], this is labeled "g_ch^(k)".
        channel_context = {
            f"y{k}": nn.Sequential(
                conv(sum(self.groups[:k]), 224, kernel_size=5, stride=1),
                nn.ReLU(inplace=True),
                conv(224, 128, kernel_size=5, stride=1),
                nn.ReLU(inplace=True),
                conv(128, self.groups[k] * 2, kernel_size=5, stride=1),
            )
            for k in range(1, len(self.groups))
        }

However, our current param_aggregation seems to match the paper:

        # In [He2022], this is labeled "Param Aggregation".
        param_aggregation = [
            sequential_channel_ramp(
                # Input: spatial context, channel context, and hyper params.
                self.groups[k] * 2 + (k > 0) * self.groups[k] * 2 + N * 2,
                self.groups[k] * 2,
                min_ch=N * 2,
                num_layers=3,
                interp="linear",
                make_layer=nn.Conv2d,
                make_act=lambda: nn.ReLU(inplace=True),
                kernel_size=1,
                stride=1,
                padding=0,
            )
            for k in range(len(self.groups))
        ]
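For context, sequential_channel_ramp (as called above) presumably builds num_layers convolution layers whose channel counts are interpolated linearly from the input width down to self.groups[k] * 2, never dropping below min_ch in between. A rough sketch of such a helper under those assumptions (not the actual CompressAI implementation, whose details may differ) would be:

    import torch.nn as nn

    def channel_ramp_sketch(in_ch, out_ch, *, min_ch=0, num_layers=3,
                            make_layer=nn.Conv2d, make_act=nn.ReLU, **layer_kwargs):
        # Channel counts interpolated linearly from in_ch to out_ch,
        # clamped to at least min_ch at the intermediate layers.
        chans = [
            round(in_ch + (out_ch - in_ch) * i / num_layers)
            for i in range(num_layers + 1)
        ]
        chans = [chans[0], *(max(c, min_ch) for c in chans[1:-1]), chans[-1]]
        layers = []
        for i in range(num_layers):
            layers.append(make_layer(chans[i], chans[i + 1], **layer_kwargs))
            if i < num_layers - 1:
                layers.append(make_act())
        return nn.Sequential(*layers)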
