About the Emu Edit Benchmark metrics. #18
Comments
Or could you point me to authoritative code for these calculations? That would be helpful.
Hi,

Thank you for bringing these issues to our attention.

Versioning for Metrics Calculation
We've noticed that the original Emu Edit paper and dataset do not specify the versions of CLIP and DINO used. To align with other benchmarks, we adopted the settings used by MagicBrush (GitHub Repository). Specifically, the versions are

Dataset Splits and Inconsistencies
Regarding the dataset split issue: we used the test set of emu_edit_test_set for our evaluations. However, because the datasets were mistakenly swapped, our reported results were based on the validation set from emu_edit_test_set_generations. There are also known issues with the benchmark quality, as discussed in this discussion thread. Some image-caption pairs seem incorrect: for example, placeholder captions (e.g., 'a train station in city') or identical source and target captions.

Evaluation Code
For the metrics evaluation, we followed the MagicBrush evaluation script (GitHub Link) closely for both benchmarks, with no major modifications. We plan to share our refined evaluation code soon; in the meantime, you can refer to the script provided in MagicBrush for immediate use.
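The caption problems mentioned in this reply (placeholder captions, identical source and target captions) can be screened for before computing any caption-based metric. The following is only a minimal sketch, not the maintainers' code: it assumes each sample is a dict with the keys `input_caption` and `output_caption`, which are assumed field names, and `has_usable_captions` is a hypothetical helper.

```python
def has_usable_captions(sample, placeholder_captions=()):
    """Return True if a sample's captions look usable for caption-based metrics.

    `sample` is assumed to be a dict with the (assumed) keys
    'input_caption' and 'output_caption'.
    """
    src = sample.get("input_caption", "").strip().lower()
    tgt = sample.get("output_caption", "").strip().lower()
    placeholders = {p.strip().lower() for p in placeholder_captions}

    if not src or not tgt:
        return False  # a caption is missing
    if src == tgt:
        return False  # identical captions describe no edit
    if src in placeholders or tgt in placeholders:
        return False  # known placeholder caption
    return True


samples = [
    {"input_caption": "a dog on grass", "output_caption": "a cat on grass"},
    {"input_caption": "a red car", "output_caption": "a red car"},
    {"input_caption": "a train station in city", "output_caption": "a bus stop"},
]
kept = [s for s in samples
        if has_usable_captions(s, ["a train station in city"])]
```

Whether to drop such samples or keep them and report the affected metrics separately is a judgment call that should be stated alongside any reported numbers.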
Given the Emu Edit benchmark issues, how did you apply this test set in your evaluation? In particular, can you tell me how clip_dir was calculated?
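For reference, CLIP directional similarity is commonly defined as the cosine similarity between the change in CLIP image embeddings (edited minus source) and the change in CLIP text embeddings (target caption minus source caption). The sketch below follows that common definition and is not confirmed to match the code used for the reported numbers. It assumes the four embeddings have already been computed by a CLIP image/text encoder and are passed in as plain lists of floats; the function names are hypothetical.

```python
import math


def _diff(a, b):
    """Element-wise a - b for two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions do not match")
    return [x - y for x, y in zip(a, b)]


def _cosine(u, v):
    """Cosine similarity, or None if either vector has zero norm."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return None
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def clip_directional_similarity(img_src, img_edit, txt_src, txt_tgt):
    """Cosine between the image-space and text-space edit directions.

    All inputs are assumed to be precomputed CLIP embeddings.
    Returns None when a direction is the zero vector, e.g. when the
    source and target captions are identical.
    """
    image_direction = _diff(img_edit, img_src)
    text_direction = _diff(txt_tgt, txt_src)
    return _cosine(image_direction, text_direction)


# Toy 2-D vectors standing in for real embeddings.
aligned = clip_directional_similarity([1.0, 0.0], [1.0, 1.0],
                                      [0.0, 0.0], [0.0, 2.0])
same_captions = clip_directional_similarity([1.0, 0.0], [1.0, 1.0],
                                            [0.5, 0.5], [0.5, 0.5])
```

Note that samples with identical source and target captions give a zero text direction, so the score is undefined for them; this sketch returns `None` in that case rather than picking a value.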
I appreciate your excellent work on instruction-based editing. Thanks for your efforts!
I have some questions about the Emu Edit Benchmark metrics.