Intermediate structure XCLIP is used for recognition; how to use it for retrieval? #22

Lucky-Light-Sun · 2024-03-20T07:45:04Z

Hi,
I noticed that the intermediate structure XCLIP is used for the RECOGNITION task, and the official code does not cover the retrieval task. So I want to ask: how did you get the X-CLIP retrieval@1 metric? If you ran the experiment yourselves, could you please share the code? Otherwise, please point me to the paper and code you referred to.

Looking forward to your reply.

Best wishes!

farewellthree · 2024-03-21T06:42:16Z

The retrieval code for XCLIP is held by my previous company, and since I left a long time ago, it is difficult for me to access it. Additionally, that code was based on MMCV 1.0 and is incompatible with the current version. However, replicating it is simple. We did not use XCLIP's prompting and MIT modules; we only used the CCT module, which inserts message tokens into the backbone. This only requires slight modifications to the ViT block of CLIP; see the CrossFramelAttentionBlock here.
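For readers who cannot follow the link, here is a minimal sketch of the kind of block described above, assuming PyTorch. It is not the original retrieval code: the class name `CrossFrameBlockSketch`, the `num_frames` argument, and the `(batch * num_frames, tokens, d_model)` input layout are assumptions made for illustration, and it uses `nn.GELU` where CLIP uses its own QuickGELU. Each frame's CLS token is projected into a message token, the message tokens of one video attend to each other across frames, and each frame then runs ordinary spatial attention with its message token appended.

```python
import torch
import torch.nn as nn


class CrossFrameBlockSketch(nn.Module):
    """Hypothetical ViT block that adds one message token per frame."""

    def __init__(self, d_model: int, n_head: int, num_frames: int):
        super().__init__()
        self.num_frames = num_frames
        # Extra layers for the message-token path (not part of a plain ViT block).
        self.message_fc = nn.Linear(d_model, d_model)
        self.message_ln = nn.LayerNorm(d_model)
        self.message_attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)
        # Standard pre-norm ViT block layers.
        self.ln_1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)
        self.ln_2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),  # assumption: CLIP itself uses QuickGELU here
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch * num_frames, tokens, d_model); token 0 is the CLS token.
        bt, l, d = x.shape
        b = bt // self.num_frames

        # 1) Build one message token per frame from that frame's CLS token.
        msg = self.message_fc(x[:, 0, :]).view(b, self.num_frames, d)

        # 2) Message tokens of the same video attend to each other across frames.
        m = self.message_ln(msg)
        msg = msg + self.message_attn(m, m, m, need_weights=False)[0]

        # 3) Append each frame's message token to that frame's own tokens.
        x = torch.cat([x, msg.reshape(bt, 1, d)], dim=1)

        # 4) Ordinary spatial self-attention within each frame.
        h = self.ln_1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]

        # 5) Drop the message token again, then apply the MLP.
        x = x[:, :l, :]
        x = x + self.mlp(self.ln_2(x))
        return x


if __name__ == "__main__":
    block = CrossFrameBlockSketch(d_model=32, n_head=4, num_frames=8)
    frames = torch.randn(2 * 8, 50, 32)  # 2 videos, 8 frames, 50 tokens each
    print(block(frames).shape)
```

The output has the same shape as the input, so in this sketch the block could stand in for a plain ViT block inside the visual encoder; the per-frame CLS features would then still need to be pooled over frames (for example, averaged) before computing video-text similarity for retrieval.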
