Question about the gaussian_conditional model in Cheng2020 #316
The Discretized Gaussian Mixture Likelihoods follow the equation in the paper; see CompressAI/compressai/entropy_models/entropy_models.py, lines 735 to 751 at 743680b.
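For reference, here is a minimal sketch of how such a discretized mixture likelihood can be computed. The `gmm_likelihood` helper, its shapes, and the clamping value are my own illustration, not the exact CompressAI code:

```python
import torch

def gmm_likelihood(y_hat, scales, means, weights):
    """Discretized K-component Gaussian mixture likelihood (illustrative sketch).

    y_hat:                  quantized latent, shape (B, C, H, W)
    scales, means, weights: shape (B, K, C, H, W), weights softmax-normalized over K
    """
    likelihood = torch.zeros_like(y_hat)
    for k in range(scales.shape[1]):
        dist = torch.distributions.Normal(means[:, k], scales[:, k].clamp_min(1e-9))
        # probability mass of the unit-width bin centred at y_hat, weighted by w_k
        likelihood = likelihood + weights[:, k] * (dist.cdf(y_hat + 0.5) - dist.cdf(y_hat - 0.5))
    return likelihood
```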
Usually, the parameters of the latent codec distribution, including the weights, are the outputs of some neural networks. You can slightly modify the network's output to obtain the weights. See CompressAI/compressai/models/google.py, lines 534 to 554 at 743680b.
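As a rough sketch of what "slightly modify the network's output" can look like: widen the last 1x1 convolution of `entropy_parameters` so it emits K sets of scales, means, and weights, then split and normalize. The channel counts, K, and the reshape layout below are assumptions for illustration, not the exact CompressAI code:

```python
import torch
import torch.nn as nn

K, M = 3, 192  # number of mixture components and latent channels (assumed values)

# Same structure as the original entropy_parameters, but the last 1x1 conv
# outputs scales, means, and weights for each of the K components.
entropy_parameters = nn.Sequential(
    nn.Conv2d(M * 12 // 3, M * 10 // 3, 1),
    nn.LeakyReLU(inplace=True),
    nn.Conv2d(M * 10 // 3, M * 8 // 3, 1),
    nn.LeakyReLU(inplace=True),
    nn.Conv2d(M * 8 // 3, 3 * K * M, 1),
)

def split_gmm_params(gaussian_params, K, M):
    # gaussian_params: (B, 3*K*M, H, W) -> three (B, K, M, H, W) tensors
    scales, means, weights = gaussian_params.chunk(3, dim=1)
    B, _, H, W = weights.shape
    scales = scales.reshape(B, K, M, H, W)
    means = means.reshape(B, K, M, H, W)
    # the mixture weights must sum to 1 over the K components
    weights = torch.softmax(weights.reshape(B, K, M, H, W), dim=1)
    return scales, means, weights
```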
Thanks for your kind instruction. I tried a modification following the issue here, and made the same change of using the `
` I compared the results: the GMM is inferior to the anchor and I am unable to understand why. Thanks again for your valuable time. Best wishes.
Hi! I have the same problem. Have you found a good solution yet?
Perhaps try using STE for quantization instead of noise. Still, it's strange that GMM with K=3 performs that much worse than GC. Try setting K=1 and training: is the performance still worse?
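A minimal sketch of the STE quantization I mean (the helper name is mine); during training it replaces the additive uniform noise:

```python
import torch

def quantize_ste(y):
    # Round in the forward pass, but let gradients pass through unchanged
    # (straight-through estimator), instead of adding uniform noise.
    return y + (torch.round(y) - y).detach()
```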
Thanks for your kind reply. I will try K=1 / STE and train to see the results. But yes, it's strange that there is no STE for quantization in Cheng's original GMM paper.
Hi, I read Cheng's paper carefully again and found that the first two output channel counts of the entropy_parameters in its network structure are both 640, but the output channel count of the second convolution layer in the entropy_parameters code you rewrote is 512. Is it possible that the problem is here?
Thanks for your advice, @watwwwww.
In the paper's implementation, N is set to 128 for the two lower-rate models and to 192 for the two higher-rate models. Actually, I noticed that the
I referenced this implementation: https://github.com/leelitian/cheng2020-GMM/blob/main/model.py Intuitively, expanding the channel representation seems to gain some improvement. Here are my results (lambda is 0.015) for the GMM with the revised entropy_parameters (K=3):
=========================================================
If you have any thoughts on the question, please feel free to share them with me.
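For clarity, this is roughly what the revised entropy_parameters looks like in my experiment. The 640-channel hidden widths follow the paper figure as discussed above; the 4*N input width and the output layout are my assumptions, not a copy of the linked code:

```python
import torch.nn as nn

def revised_entropy_parameters(N=192, K=3):
    # Hidden layers fixed at 640 channels (as in the paper figure discussed above);
    # the final 1x1 conv emits scales, means, and weights for K mixture components.
    return nn.Sequential(
        nn.Conv2d(N * 4, 640, 1),  # 2N from the hyper-decoder + 2N from context_prediction (assumed)
        nn.LeakyReLU(inplace=True),
        nn.Conv2d(640, 640, 1),
        nn.LeakyReLU(inplace=True),
        nn.Conv2d(640, 3 * K * N, 1),
    )
```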
Can I ask about the dataset you used for training and the specific parameter settings of the training script? My training process was very long and produced no results while I was trying to validate my idea.
In the original paper, the author used 13,830 samples from ImageNet. I just extracted 14k images from the COCO dataset for training and the 800 DIV2K images for testing. Maybe the size of the dataset plays a role, but I compare different schemes with the same dataset, so it's fine for you to use another one. I am not sure which settings you are asking about. The training script is
Sorry, the 16h is for 300 epochs. To shorten it, I just train for 200 epochs. My aim is to validate the effectiveness of the GMM, so the baseline is cheng2020-anchor without the simplified attention.
When I ran the Cheng2020 series, I noticed that Cheng2020Anchor inherits from JointAutoregressiveHierarchicalPriors.
However, the `gaussian_conditional` model in `JointAutoregressiveHierarchicalPriors` is just a `GaussianConditional`; it is not a Gaussian mixture model.
I tried to add a line in Cheng2020Anchor:
self.gaussian_conditional = GaussianMixtureConditional()
But it failed to run.
What do the weights mean, and what should I pass to GaussianMixtureConditional()?
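Roughly, the change I tried looks like the sketch below. The K argument and the import path are my guesses, and this is not working code, which is exactly my question:

```python
# Sketch only: GaussianMixtureConditional and its K argument are assumed here.
from compressai.entropy_models import GaussianMixtureConditional
from compressai.models import Cheng2020Anchor

class Cheng2020GMM(Cheng2020Anchor):
    def __init__(self, N=192, K=3, **kwargs):
        super().__init__(N=N, **kwargs)
        # Swapping in the mixture model alone fails: entropy_parameters still
        # outputs only (scales, means), with no mixture weights.
        self.gaussian_conditional = GaussianMixtureConditional(K=K)
```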