Hi,
I noticed that the intermediate structure X-CLIP is designed for the recognition task, and its official code does not cover retrieval. How did you obtain the X-CLIP retrieval R@1 metric? If you ran the experiment yourself, could you please share the code? Otherwise, please point me to the paper and code you referred to.
Looking forward to your reply.
Best wishes!
The retrieval code for X-CLIP is held by my previous company, and I left a long time ago, so it is difficult for me to access it. The old code was also based on MMCV 1.0 and is incompatible with the current version. Replicating it is simple, though. We did not use X-CLIP's prompting and MIT modules; we used only the CCT module, which inserts message tokens into the backbone. This requires only slight modifications to the ViT block of CLIP; see the CrossFramelAttentionBlock here.
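For anyone trying to replicate this, here is a rough PyTorch sketch of what a ViT block with message tokens could look like. It is not the original retrieval code. The class name `CrossFrameBlockSketch`, the batch-first tensor layout, and the use of `nn.GELU` (CLIP itself uses QuickGELU) are my own assumptions for illustration, so check the details against the official CrossFramelAttentionBlock.

```python
import torch
import torch.nn as nn


class CrossFrameBlockSketch(nn.Module):
    """Hypothetical ViT block with one message token per frame (illustrative only)."""

    def __init__(self, d_model: int, n_head: int, num_frames: int):
        super().__init__()
        self.num_frames = num_frames
        # Message-token branch: frames of the same video exchange information here.
        self.message_fc = nn.Linear(d_model, d_model)
        self.message_ln = nn.LayerNorm(d_model)
        self.message_attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)
        # Standard pre-norm ViT block: spatial attention followed by an MLP.
        self.ln_1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_head, batch_first=True)
        self.ln_2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),  # assumption: CLIP uses QuickGELU instead
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch * num_frames, tokens, d_model), class token at index 0.
        bt, l, d = x.shape
        t = self.num_frames
        b = bt // t

        # Build one message token per frame from that frame's class token.
        msg = self.message_fc(x[:, 0, :]).view(b, t, d)
        # Attention over the T message tokens of each video (cross-frame step).
        h = self.message_ln(msg)
        msg = msg + self.message_attn(h, h, h, need_weights=False)[0]

        # Append each frame's message token to its own patch tokens.
        x = torch.cat([x, msg.reshape(bt, 1, d)], dim=1)
        # Spatial attention within each frame, now including the message token.
        h = self.ln_1(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]
        # Drop the message token so the sequence length is unchanged.
        x = x[:, :l, :]

        x = x + self.mlp(self.ln_2(x))
        return x
```

The block keeps the input shape, so in principle it can stand in for the regular residual attention block in CLIP's visual transformer, as long as frames of the same video are contiguous along the first dimension. For example, `CrossFrameBlockSketch(d_model=32, n_head=4, num_frames=8)` applied to a tensor of shape `(16, 50, 32)` returns a tensor of the same shape.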