In the correctness file, the Composite data usually includes only 3 or 4 human-rated captions per image. Some candidate captions read like a full paragraph. Do I need to truncate them to a single short sentence? One example is as follows:
person is pulling bow in the back.A person might be wearing helmet in the scene.person is having tattoo.The scene contains grass and well-maintained grass and garden and playhouse.
In your paper, the correlation scores are computed with the ground-truth references removed. Could you give me some guidance on how to process the Composite dataset and reproduce those scores?
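To make the truncation question concrete, here is a minimal sketch of what I mean by cutting a paragraph-like caption down to one sentence. It assumes sentences are separated by periods, sometimes without a following space, as in the example caption. The helper names `split_sentences` and `first_sentence` are hypothetical and are not taken from the paper's code; I do not know whether the original evaluation did anything like this.

```python
import re
from typing import List


def split_sentences(caption: str) -> List[str]:
    """Split a caption on periods, whether or not a space follows them.

    Assumption: a period always ends a sentence. This would also split
    decimals such as "3.5", which is acceptable for this illustration only.
    """
    parts = re.split(r"\.\s*", caption)
    # Drop the empty string left behind by the trailing period.
    return [p.strip() for p in parts if p.strip()]


def first_sentence(caption: str) -> str:
    """Return the first sentence of the caption, or "" if there is none."""
    sentences = split_sentences(caption)
    return sentences[0] if sentences else ""


# The paragraph-like candidate caption quoted in this issue.
caption = (
    "person is pulling bow in the back."
    "A person might be wearing helmet in the scene."
    "person is having tattoo."
    "The scene contains grass and well-maintained grass and garden and playhouse."
)

for sentence in split_sentences(caption):
    print(sentence)
print("truncated:", first_sentence(caption))
```

Keeping only the first sentence discards most of the candidate's content, so it may change how well the metric agrees with the human rating. That is why I would like to know which preprocessing was actually used.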
I want to reproduce the results on the Composite dataset.