Clarification regarding how the accuracy.txt file is generated #1861
Comments
@arjunsuresh Yes, that's correct.
I can think of a situation when an implementer refactors/integrates a reference script into their own script. For example, the reference script may hardcode using
@psyhtest Yes, I believe running the reference accuracy script standalone is fine. But it is not always straightforward, as it often requires the original dataset, so we do have some submissions where accuracy.txt is generated from the benchmark run itself without calling the reference script. We didn't see any accuracy issues when running the standalone script for those submissions, but I believe this should not be allowed.
But you admit that in some cases it may not be straightforward:
So why would we disallow it in such cases?
@psyhtest I'm not saying we should disallow running the reference accuracy script in a custom way, say from within another Python file. But I don't think it is right to allow generating the accuracy.txt file by mimicking the actions of the reference script, because that makes it hard for other people to verify. We face this issue specifically when automating DLRMv2 submissions: to generate the accuracy.txt file we need the day23 Criteo dataset, which cannot be downloaded in a non-interactive way. But if we were allowed to generate the accuracy.txt file from within the benchmark implementation, we possibly would not need this file at all.
The submission generation rules for inference say that the accuracy.txt file should be generated from the accuracy scripts. My interpretation is that one should run the reference accuracy scripts standalone, using the logs from the accuracy run, to obtain the accuracy.txt file, rather than dumping accuracy.txt from within the implementation code. Is this the correct interpretation?
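To make the "standalone" interpretation concrete, here is a minimal sketch of a wrapper that runs an accuracy script as a separate process and saves whatever it prints as accuracy.txt. The helper name `write_accuracy_txt` and its arguments are hypothetical, not part of any reference implementation; the actual script path and its command-line flags depend on the benchmark and are deliberately left to the caller.

```python
import subprocess
from pathlib import Path


def write_accuracy_txt(script_cmd, output_path):
    """Run an accuracy script as a separate process and save its stdout.

    script_cmd:  full command as a list, e.g. the Python interpreter followed
                 by the accuracy script and its arguments (hypothetical here;
                 the real script and flags depend on the benchmark).
    output_path: where to write the captured output, e.g. "accuracy.txt".
    """
    # check=True raises CalledProcessError if the script exits non-zero,
    # so a failed accuracy run never produces an accuracy.txt silently.
    result = subprocess.run(script_cmd, capture_output=True, text=True, check=True)
    Path(output_path).write_text(result.stdout)
    return result.stdout
```

With a wrapper like this, the accuracy numbers still come from the script's own output rather than from the benchmark implementation, which is the distinction being asked about.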