
Fine-tune LLM


Initial Setup

This setup is needed only the first time.

python -m pip install --upgrade pip
apt install python3.10-venv # needed inside the Colab container
python -m venv myenv
source myenv/bin/activate 
pip install -r requirements-all.txt
pip install huggingface_hub
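
If the chosen model is gated on Hugging Face (the CodeLlama checkpoints require accepting Meta's license), you may also need to authenticate once before the download step, for example:

huggingface-cli login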

Fine-tuning a specific model using 4-bit quantization

source myenv/bin/activate 
MODEL="codellama/CodeLlama-7b-Instruct-hf"

python scripts/download.py --repo_id $MODEL
python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/$MODEL

python scripts/prepare_ui_gen_data.py --csv_path test_split.csv --destination_path $MODEL/data/csv --checkpoint_dir checkpoints/$MODEL --test_split_fraction 0 --seed 42 --mask_inputs false --ignore_index -1
* Rename the output file train.pt to tmp.pt so the test-split data is not overwritten by the next step (see the command below).
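A minimal example, assuming the prepared .pt files are written to the --destination_path directory used above:

mv $MODEL/data/csv/train.pt $MODEL/data/csv/tmp.pt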
python scripts/prepare_ui_gen_data.py --csv_path train_split.csv --destination_path $MODEL/data/csv --checkpoint_dir checkpoints/$MODEL --test_split_fraction 0 --seed 42 --mask_inputs false --ignore_index -1
* Replace the output file test.pt with the stashed test-split data by renaming tmp.pt to test.pt (see the command below).
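Again assuming the same --destination_path directory:

mv $MODEL/data/csv/tmp.pt $MODEL/data/csv/test.pt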

nohup python finetune/lora.py --precision 32-true --quantize bnb.nf4  --io.checkpoint_dir checkpoints/$MODEL --io.out_dir output/$MODEL/code-gen-ui --io.train_data_dir $MODEL/data/csv --io.val_data_dir $MODEL/data/csv &
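
Because the job is launched with nohup and sent to the background, its output should land in nohup.out by default; training progress can be followed with:

tail -f nohup.out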