Hi all, I read the code and realized that the results were obtained with 3-shot demonstrations. However, some models were trained to follow instructions without demonstrations. These models may rank higher in the results table under a zero-shot setting than under the current few-shot setting. It would be great if you could add zero-shot evaluation results. Thank you.
Good point, it may be worth investigating zero-shot performance as well. We will try to add zero-shot results for MMLU in the next few weeks. For now, we have reported the zero-shot results for HumanEval in the last column of the readme table.
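For anyone who wants to try this locally in the meantime, here is a minimal sketch of how a zero-shot and a 3-shot prompt for a multiple-choice question could differ. This is not the repository's actual evaluation code: `format_question`, `build_prompt`, the `A`-`D` labels, and the `Answer:` template are all hypothetical names and formats chosen for illustration.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical answer labels; the real benchmark template may differ.
CHOICES = ("A", "B", "C", "D")

# A demonstration is (question, options, correct_label).
Demo = Tuple[str, Sequence[str], str]


def format_question(question: str, options: Sequence[str],
                    answer: Optional[str] = None) -> str:
    """Render one question; include the answer only for demonstrations."""
    lines = [question]
    for label, option in zip(CHOICES, options):
        lines.append(f"{label}. {option}")
    lines.append("Answer:" if answer is None else f"Answer: {answer}")
    return "\n".join(lines)


def build_prompt(question: str, options: Sequence[str],
                 demos: Sequence[Demo] = (), k: int = 0) -> str:
    """Prepend the first k demonstrations; k=0 gives a zero-shot prompt."""
    parts = [format_question(q, o, a) for q, o, a in demos[:k]]
    parts.append(format_question(question, options))
    return "\n\n".join(parts)


# Made-up demonstrations, for illustration only.
demos = [
    ("What is 2 + 2?", ["3", "4", "5", "6"], "B"),
    ("Which planet is closest to the Sun?", ["Mercury", "Venus", "Earth", "Mars"], "A"),
    ("What is the chemical symbol for water?", ["CO2", "O2", "H2O", "NaCl"], "C"),
]
target = ("What is 3 * 3?", ["6", "7", "8", "9"])

zero_shot = build_prompt(*target)                # no demonstrations
three_shot = build_prompt(*target, demos=demos, k=3)  # current setting
```

With a switch like `k`, the same evaluation loop can report both settings side by side, which is what the comparison in the results table would need.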