
internlm-sft: training loss is always 0 #178

Open
C-myu opened this issue Jun 22, 2024 · 0 comments
Comments


C-myu commented Jun 22, 2024

CUDA_VISIBLE_DEVICES=0,1,2,3 train_sft.py \
    --deepspeed ds_zero2_no_offload.json \
    --model_name_or_path internlm-7b \
    --use_lora true \
    --use_deepspeed true \
    --data_path hz_sft_data_test \
    --bf16 true \
    --fp16 false \
    --output_dir output_refuse_test \
    --num_train_epochs 5 \
    --per_device_train_batch_size 3 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 8 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --save_total_limit 3 \
    --learning_rate 4e-4 \
    --logging_steps 10 \
    --tf32 False \
    --model_max_length 2048

After running this, I found that the training loss is always 0. Could it be because DeepSpeed was not actually used?
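One thing worth double-checking, offered as a sketch rather than a confirmed diagnosis: scripts that accept a --deepspeed config are normally started through the deepspeed launcher, which spawns one process per GPU; the command as posted invokes train_sft.py directly, so only a single process would run. A hedged launch line, with the flags copied from the report and only the launcher changed (abbreviated to the first few flags):

```shell
# Hypothetical launch command -- only the "deepspeed --include" launcher
# is new; train_sft.py and all flags are taken verbatim from the report.
# "--include localhost:0,1,2,3" tells the launcher which local GPUs to use,
# replacing the CUDA_VISIBLE_DEVICES environment variable.
deepspeed --include localhost:0,1,2,3 train_sft.py \
    --deepspeed ds_zero2_no_offload.json \
    --model_name_or_path internlm-7b \
    --use_lora true \
    --use_deepspeed true \
    --bf16 true
```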
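Independent of the launcher question, a loss that is exactly 0 (rather than merely small) in SFT pipelines often means every target token in the batch was masked out, for example when the prompt template consumes the whole model_max_length window and no response tokens survive. A minimal pure-Python sketch, not code from this repo, illustrating the Hugging Face convention of masking labels with -100 and how a fully masked batch can average to exactly 0 depending on how the mean is taken:

```python
import math

# Hugging Face Trainer convention: positions labeled -100 are excluded
# from the cross-entropy loss.
IGNORE_INDEX = -100

def masked_cross_entropy(logits, labels):
    """Mean cross-entropy over positions whose label != IGNORE_INDEX.

    If an implementation guards the division by a zero count by
    returning 0.0 (as sketched here), a batch whose labels are all
    masked reports a loss of exactly 0.
    """
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == IGNORE_INDEX:
            continue  # masked position: contributes nothing
        log_z = math.log(sum(math.exp(x) for x in row))  # log-partition
        total += log_z - row[label]                      # -log p(label)
        count += 1
    return total / count if count else 0.0

logits = [[2.0, 0.5], [0.1, 1.5]]
print(masked_cross_entropy(logits, [0, 1]))                        # positive loss
print(masked_cross_entropy(logits, [IGNORE_INDEX, IGNORE_INDEX]))  # 0.0
```

If this is the cause, printing one tokenized training sample and checking that its labels are not all -100 is a quick way to confirm.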
