Only has 0.44 accuracy on GSM8K after running the provided codes #13
Comments
Hi~ Thanks for your interest in our work!
I used exactly the same dataset and trained for a whole epoch, but whether use_gt_labels is set to True or False, I cannot get the desired result.
Hi~ Sorry for the confusion. This may result from a bug (i.e. forgetting to add Consistency_LLM/cllm/cllm_trainer_global.py Line 102 in 425691e
You can try running the updated code and the results should be normal now. Let me know if you have any other questions!
But after this modification, the accuracy becomes 0.0. It seems that this modification is not correct.
My bad😥... While we do use
But it is strange that setting use_gt_labels=False still does not solve this problem.
Hi @TrueNobility303. Thanks for your patience! We have identified the problems in the training script:
Please pull the code again. With the patches applied, a newly trained model should reach much better performance, in line with what we reported in the paper.
Hi~ Thank you for your reply. So after the modification, should I set use_gt_labels to True or not?
Sorry for the earlier confusion. We have checked that the current version, with
Thanks a lot!
Dear authors,
I trained the CLLM model on GSM8K with Abel-7B-001 as the teacher model, using the cleaned_gsm8k_jacobi dataset you provided on Hugging Face. I ran train_cllm.sh with "use_gt_labels" in the file train_cllm_global.py set to False, following this previous issue. After running bash eval/gsm8k/acc.sh, the trained model only reaches an accuracy of 0.44, which is much lower than the result of the checkpoint you provided.
Could you tell me what is wrong? What are the exact hyperparameters needed to reproduce the results?
I would greatly appreciate it if you could help me.
Best regards.
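For anyone comparing numbers like 0.44 or 0.0 across runs, it can help to recompute accuracy independently of the evaluation script. The following is a minimal sketch of exact-match scoring on the final number of each answer, assuming model outputs and references are available as plain strings. It is not the repository's eval/gsm8k/acc.sh; the function names `extract_final_number` and `exact_match_accuracy` are hypothetical. It relies only on the GSM8K convention that reference solutions end with `#### <answer>`.

```python
import re


def extract_final_number(text):
    """Return the last number found in text as a string, or None.

    If the GSM8K-style marker '####' is present, only the text after
    the last marker is searched. (Hypothetical helper, not repo code.)
    """
    parts = text.rsplit("####", 1)
    candidate = parts[1] if len(parts) == 2 else text
    # Match integers or decimals, allowing thousands separators like 1,234.
    numbers = re.findall(r"-?\d[\d,]*(?:\.\d+)?", candidate)
    if not numbers:
        return None
    return numbers[-1].replace(",", "")


def exact_match_accuracy(predictions, references):
    """Fraction of predictions whose final number equals the reference's."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = 0
    for pred, ref in zip(predictions, references):
        p = extract_final_number(pred)
        r = extract_final_number(ref)
        # Compare numerically so that '18' and '18.0' count as equal.
        if p is not None and r is not None and float(p) == float(r):
            correct += 1
    return correct / len(references)


if __name__ == "__main__":
    preds = ["She has 18 apples. #### 18", "The answer is 7."]
    refs = ["#### 18", "#### 8"]
    print(exact_match_accuracy(preds, refs))  # prints 0.5
```

If a sketch like this disagrees sharply with the reported accuracy on the same generations, the gap more likely comes from answer extraction than from training. An accuracy of exactly 0.0 in particular tends to point at malformed outputs rather than a weak model.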