
Why do ResNet backbones not perform well on CenterNet? #512

Closed

Epiphqny opened this issue Dec 6, 2019 · 9 comments

Labels: good first issue (Good for newcomers)

Comments

Epiphqny commented Dec 6, 2019

The AP is even lower than that of the much smaller DLA-34 backbone. Do you have any idea why?

lbin (Contributor) commented Dec 6, 2019

Which backbone do you use? Res50 or Res34?

Epiphqny (Author) commented Dec 6, 2019

As shown in the README of this project, ResNet-101 is about 3 points lower than the DLA-34 backbone: DLA-34 (37.4) vs. ResNet-101 (34.6).

lbin (Contributor) commented Dec 6, 2019

Maybe DLA has more connections? This is strange and interesting, because on ImageNet DLA's top-1 accuracy is much lower than that of Res50 or Res101.

Epiphqny (Author) commented Dec 6, 2019

Yes, that's my doubt. I want to ask to what degree CenterNet relies on the backbone.

xingyizhou (Owner) commented:

This is a very good question. We need to distinguish two concepts: "backbone" and "network". The "network" is the backbone plus the upsampling layers (the "neck" in some other papers). In our resnet_dcn and msra_resnet models the upsampling layers are light (3 upconv layers), while in our dla models the upsampling layers are large (DLA_up + IDA_up; please refer to the code and Fig. 6 in the supplementary).

I would say the upsampling layers matter a lot, and they are not easily transferable across backbones: the default DLA_up requires keeping the original channels of the backbone (for DLA, the channel counts from 4x stride to 32x stride are 64, 128, 256, 512; for ResNets they are 256, 512, 1024, 2048), and using IDA_up on ResNets would be very expensive. I could only afford to try Res18 + DLA_up; its performance is much better than Res18 + dcn upconvs and close to DLA34 + IDA_up on Pascal. I haven't tried DLA + dcn upconvs.
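To make the "light" neck concrete, here is a minimal PyTorch sketch of a 3-upconv neck on top of a ResNet stride-32 feature map. The channel counts follow the numbers in the comment above; the exact kernel sizes, norms, and the DCN layers of resnet_dcn are assumptions/omissions, so this illustrates the wiring rather than reproducing the repository's code.

```python
# Minimal sketch (assumptions noted above) of the "light" upsampling neck
# used with ResNet backbones: three upconv layers, stride 32 -> 4.
import torch
import torch.nn as nn

class LightUpconvNeck(nn.Module):
    def __init__(self, in_channels=2048, head_channels=256):
        super().__init__()
        layers = []
        c = in_channels
        for _ in range(3):  # each upconv doubles resolution: 32x -> 16x -> 8x -> 4x
            layers += [
                nn.ConvTranspose2d(c, head_channels, kernel_size=4,
                                   stride=2, padding=1, bias=False),
                nn.BatchNorm2d(head_channels),
                nn.ReLU(inplace=True),
            ]
            c = head_channels
        self.deconv = nn.Sequential(*layers)

    def forward(self, x):  # x: (N, 2048, H/32, W/32)
        return self.deconv(x)  # (N, 256, H/4, W/4)

feat = torch.randn(1, 2048, 16, 16)  # e.g. a 512x512 image at stride 32
print(LightUpconvNeck()(feat).shape)  # torch.Size([1, 256, 128, 128])
```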

xingyizhou added the "good first issue" (Good for newcomers) label on Dec 7, 2019
Epiphqny (Author) commented Dec 7, 2019

Thanks for your detailed explanation. Can I understand it as follows: keypoint estimation is more like semantic segmentation, which requires high resolution and dense prediction, so upsampling is critical to it?

Epiphqny closed this as completed on Dec 8, 2019
niaoyu commented Jan 31, 2020

Thanks for your patience. Sorry if my English is not very clear; what I am curious about is:

  1. You mentioned that "IDA_up" is more expensive than "DLA_up", which is why you chose "res + dla_up", but "DLA_up" is itself constructed from several "IDA_up" modules (see Difference between IDA_up and DLA_up, ucbdrive/dla#14 (comment)). Why then does IDA_up consume more? (A rough cost comparison is sketched after this comment.)

  2. As you said, "DLA_up" keeps the output dimensions of each level unchanged, while in the original version "IDA_up" is needed to upsample all levels to the same dimensions before the add operation. So in your "res + dla_up", are additional layers still needed after "DLA_up" and before the head layers to obtain feature maps of the same dimensions? Or is it a typo, and you actually used "res + IDA_up"?

  3. I am trying to improve the ResNet results. I originally planned to use
     3.1) resnet + fpn + dcn + upsample + add,
     but by comparison it seems that
     3.2) resnet + ida_up (which is effectively dcn + upsample + add + dcn)
     could already give a fairly good feature map. Which one would be better? Any suggestions?
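On question 1, a back-of-the-envelope comparison may help: aggregation cost is dominated by the channel widths being merged, and the ResNet widths quoted in the answer above (256 to 2048) are 4x the DLA widths (64 to 512). The sketch below uses only those channel numbers; the single 3x3 projection conv per scale is an assumption for illustration, not the actual IDA_up code.

```python
# Rough parameter count for projecting every scale down to the finest
# scale's width, as IDA-style aggregation roughly does. Channel numbers
# are from the maintainer's answer; the 3x3 projection conv is assumed.
def conv3x3_params(c_in, c_out):
    return 3 * 3 * c_in * c_out

dla_channels = [64, 128, 256, 512]        # strides 4x .. 32x
resnet_channels = [256, 512, 1024, 2048]  # strides 4x .. 32x

for name, chans in [("DLA", dla_channels), ("ResNet", resnet_channels)]:
    total = sum(conv3x3_params(c, chans[0]) for c in chans)
    print(f"{name}: ~{total / 1e6:.2f}M params for the projection convs")
# DLA:    ~0.55M
# ResNet: ~8.85M  (about 16x more for the same wiring)
```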

sisrfeng commented May 3, 2020

There are several variants that outperform ResNet. Why do some researchers still use the old ResNet rather than ResNeXt or SENet? Is it because ResNet is GPU-friendly?
What do you think of ResNeSt: Split-Attention Networks? - Zhihu
https://www.zhihu.com/question/388637660/answer/1162087825

menggui1993 commented:

@niaoyu Hi, have you tried res + fpn?
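For context on the res + fpn option (3.1 above), here is a minimal sketch of an FPN-style top-down merge: a lateral 1x1 conv per scale, then repeated upsample + add, ending in a stride-4 map for the CenterNet heads. The DCN layers and the exact channel widths are assumptions; this only illustrates the wiring being discussed, not a tested configuration.

```python
# Minimal FPN-style top-down merge over ResNet features C2..C5
# (lateral 1x1 conv + upsample + add); DCN and head layers omitted.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleFPNNeck(nn.Module):
    def __init__(self, in_channels=(256, 512, 1024, 2048), out_channels=256):
        super().__init__()
        # one lateral 1x1 conv per backbone scale, all mapping to out_channels
        self.laterals = nn.ModuleList(
            [nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels])

    def forward(self, feats):  # feats: C2..C5 at strides 4, 8, 16, 32
        laterals = [lat(f) for lat, f in zip(self.laterals, feats)]
        for i in range(len(laterals) - 1, 0, -1):  # top-down: upsample + add
            laterals[i - 1] = laterals[i - 1] + F.interpolate(
                laterals[i], scale_factor=2, mode="nearest")
        return laterals[0]  # stride-4 feature map for the detection heads

# e.g. a 256x256 input image: C2..C5 are 64, 32, 16, 8 pixels wide
feats = [torch.randn(1, c, 64 // 2**i, 64 // 2**i)
         for i, c in enumerate((256, 512, 1024, 2048))]
print(SimpleFPNNeck()(feats).shape)  # torch.Size([1, 256, 64, 64])
```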
