Finetuning with TRL

Structure

slurm_scripts: scripts for setting up your Python virtual environment and launching Slurm jobs.

training: scripts for training and utilities.

configs: YAML files for the Accelerate configurations and the training arguments.

Getting started

Go to slurm_scripts and modify the scripts to match your own paths, then set up the virtual environment:

```bash
sbatch slurm_scripts/setup_venv
```

Dataset

Use your own dataset and convert it to the ChatML conversational format; you can take inspiration from data.

Format

{"messages": [{"role": "system", "content": "You are..."}, {"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
{"messages": [{"role": "system", "content": "You are..."}, {"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}
{"messages": [{"role": "system", "content": "You are..."}, {"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}

Or use one of the ready-made datasets in /scratch/project_462000558/TurkuNLP_workshop/data.

Training

Go to configs if you want to change the training arguments; they work the same way as Hugging Face TrainingArguments (see the sketch at the end of this section).

Then modify the launch script in slurm_scripts and submit it:

```bash
sbatch slurm_scripts/sft.sh
```

Full-weight training of a 34B model requires a minimum of 2 nodes, and at least 3 nodes are recommended. Also note that as you increase the number of nodes, training becomes more unstable and prone to NCCL crashes/hangs.
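
Under the hood, the training script boils down to a TRL SFTTrainer run, with SFTConfig subclassing the familiar TrainingArguments. A minimal single-GPU sketch, assuming a recent TRL version and placeholder model/data paths (not this repo's exact script):

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical data path; ChatML-style "messages" records are
# templated automatically by SFTTrainer.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

# SFTConfig accepts the usual TrainingArguments fields, mirroring
# what the YAML files in configs/ control.
config = SFTConfig(
    output_dir="sft-out",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=2e-5,
    num_train_epochs=1,
    bf16=True,
)

trainer = SFTTrainer(
    model="meta-llama/Llama-2-7b-hf",  # placeholder model id
    args=config,
    train_dataset=dataset,
)
trainer.train()
```

Multi-node runs then wrap a script like this with accelerate launch and the Accelerate configurations from configs.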

Useful links

This work was heavily inspired by the Hugging Face alignment-handbook.

My own fork of the alignment-handbook (work in progress) is https://github.com/Vmjkom/alignment-handbook. The Alignment Handbook implements reinforcement-learning techniques in addition to SFT, and it has more sophisticated data handling.

Documentation

TRL: https://huggingface.co/docs/trl