set unique UNK token #435

Ssukriti · 2025-01-09T04:20:54Z

Description of the change

Granite models use the same token for UNK and EOS, which results in poor quality when tuning on some datasets. Setting UNK to a unique token improves the quality.
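For reviewers, the idea can be sketched roughly as below. This is a minimal illustration under assumptions, not the code in this PR: the helper name `unique_unk_update`, the `DEFAULT_UNK_TOKEN` value, and the stand-in tokenizer with its `"<eos>"` string are all hypothetical.

```python
from types import SimpleNamespace

# Hypothetical default; the actual UNK string chosen by the PR is not shown here.
DEFAULT_UNK_TOKEN = "<unk>"


def unique_unk_update(tokenizer, unk_token=DEFAULT_UNK_TOKEN):
    """Return a special-tokens dict giving UNK its own token when it collides with EOS."""
    if tokenizer.eos_token is not None and tokenizer.unk_token == tokenizer.eos_token:
        return {"unk_token": unk_token}
    # UNK is already distinct from EOS, so nothing needs to change.
    return {}


# Stand-in for a tokenizer whose UNK and EOS share one string (illustrative value).
tokenizer = SimpleNamespace(unk_token="<eos>", eos_token="<eos>")
update = unique_unk_update(tokenizer)
print(update)  # {'unk_token': '<unk>'}
```

With a real Hugging Face tokenizer, a non-empty dict like this would typically be passed to `tokenizer.add_special_tokens(update)`, followed by `model.resize_token_embeddings(len(tokenizer))` if the vocabulary grew.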

Related issue number

https://github.ibm.com/ai-foundation/watson-fm-stack-tracker/issues/1435

How to verify the PR

Tested with the tone dataset:

export MODEL_PATH="ibm-granite/granite-3.0-8b-base"
export TRAIN_DATA_PATH="/testing/tuning/input/cc_tone_sft_format_1000_train.json"

/home/tuning/.local/bin/accelerate launch \
  --num_processes=2 \
  --config_file /app/accelerate_fsdp_defaults.yaml \
  -m tuning.sft_trainer \
  --model_name_or_path $MODEL_PATH \
  --training_data_path $TRAIN_DATA_PATH \
  --torch_dtype bfloat16 \
  --output_dir $OUTPUT_PATH \
  --num_train_epochs 5 \
  --per_device_train_batch_size 4 \
  --gradient_accumulation_steps 4 \
  --learning_rate 1e-5 \
  --response_template "\n### Response:" \
  --dataset_text_field "output"

At inference, the tuned model produces repeated output without the change and proper output with it.

Was the PR tested

I have added >=1 unit test(s) for every new method I have added.
I have ensured all unit tests pass

Signed-off-by: Sukriti-Sharma4 <[email protected]>

github-actions · 2025-01-09T04:21:04Z

Thanks for making a pull request! 😃
One of the maintainers will review and advise on the next steps.

set unique UNK token

79c2684

Signed-off-by: Sukriti-Sharma4 <[email protected]>

Ssukriti requested review from anhuong, aluu317, fabianlim and kmehant as code owners January 9, 2025 04:20

Ssukriti marked this pull request as draft January 9, 2025 04:20
