I recently discovered that Llama 1 was pretrained in fp16, but the Llama 2 family of models was pretrained in bf16. The README in this repo sets fp16 as the default; switching to bf16 fixed this.
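For context on why the dtype matters: fp16 has only a 5-bit exponent and overflows just above 65504, while bf16 keeps float32's 8-bit exponent and trades mantissa precision for range. A minimal stdlib-only sketch (the helper names `fits_fp16` and `to_bf16` are mine, not from this repo) that demonstrates the difference:

```python
import struct

def fits_fp16(x: float) -> bool:
    """Return True if x is representable in IEEE fp16 without overflowing."""
    try:
        struct.pack('<e', x)  # 'e' = half precision; raises OverflowError out of range
        return True
    except OverflowError:
        return False

def to_bf16(x: float) -> float:
    """Round-trip x through bfloat16 by truncating a float32 to its top 16 bits."""
    (bits,) = struct.unpack('<I', struct.pack('<f', x))
    return struct.unpack('<f', struct.pack('<I', bits & 0xFFFF0000))[0]

print(fits_fp16(65504.0))  # True  -- fp16's largest finite value
print(fits_fp16(70000.0))  # False -- overflows fp16, a common source of inf/nan losses
print(to_bf16(70000.0))    # 69632.0 -- bf16 loses precision but keeps the magnitude
```

This is why a model whose weights and activation statistics were shaped by bf16 pretraining can blow up when fine-tuned in fp16.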
I followed all the setup instructions given in the README.
The command I am using is:
Initially, I got the following error:
I downgraded to transformers version 4.29.2 as suggested here.
Now, training runs, but the learning rate is fixed at zero right from the start. Below are the logs:
Does anyone have any idea what I might be doing wrong?
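One thing worth ruling out: if the trainer uses a linear warmup schedule, the very first logged learning rates are expected to be (near) zero and only ramp up after the warmup steps. A pure-Python sketch of such a schedule, assuming transformers-style linear warmup plus linear decay (the function name and default values here are illustrative, not from this repo):

```python
def linear_warmup_lr(step, base_lr=2e-5, warmup_steps=100, total_steps=1000):
    """Illustrative linear warmup + linear decay schedule."""
    if step < warmup_steps:
        # ramp linearly from 0 to base_lr over the warmup phase
        return base_lr * step / warmup_steps
    # then decay linearly back toward 0 by total_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(linear_warmup_lr(0))    # 0.0 -- the first logged lr is zero under warmup
print(linear_warmup_lr(50))   # 1e-05 -- halfway through warmup
print(linear_warmup_lr(100))  # 2e-05 -- warmup complete
```

If the learning rate stays at zero well past any configured warmup, the schedule itself (e.g. warmup steps or ratio computed larger than the total training steps) would be my next suspect.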