
Commit

update configs to match parallel PR
Optimox committed Oct 27, 2024
1 parent 6b50916 commit 54a237c
Showing 13 changed files with 37 additions and 19 deletions.
4 changes: 3 additions & 1 deletion recipes/configs/gemma2/27B_full.yaml
@@ -23,6 +23,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -53,6 +54,7 @@ loss:
   _component_: torchtune.modules.loss.CEWithChunkedOutputLoss
 max_steps_per_epoch: null
 gradient_accumulation_steps: 1
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -69,4 +71,4 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-27b-finetune
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
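
The remaining config files below receive the same three changes: a dataset packing flag, a compile flag, and peak-memory logging turned on. As a minimal illustrative sketch (these enabled values are not the committed defaults), the new knobs would be switched on like this in any of these configs:

dataset:
  _component_: torchtune.datasets.alpaca_dataset
  packed: True # pack samples into full-length sequences for a large speed up
compile: True # torch.compile the model and loss for perf/memory improvement
log_peak_memory_stats: True
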
4 changes: 3 additions & 1 deletion recipes/configs/gemma2/27B_lora.yaml
@@ -23,6 +23,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -65,6 +66,7 @@ batch_size: 4
 epochs: 3
 max_steps_per_epoch: null
 gradient_accumulation_steps: 1
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -81,4 +83,4 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-27b-lora
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
5 changes: 3 additions & 2 deletions recipes/configs/gemma2/27B_lora_single_device.yaml
@@ -22,6 +22,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -64,7 +65,7 @@ batch_size: 2
 epochs: 1
 max_steps_per_epoch: null
 gradient_accumulation_steps: 8
-compile: False
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -82,7 +83,7 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-27b-lora
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
 
 # Show case the usage of pytorch profiler
 # Set enabled to False as it's only needed for debugging training
5 changes: 3 additions & 2 deletions recipes/configs/gemma2/27B_qlora_single_device.yaml
@@ -22,6 +22,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -64,7 +65,7 @@ batch_size: 4
 epochs: 3
 max_steps_per_epoch: null
 gradient_accumulation_steps: 4
-compile: False
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -82,7 +83,7 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-27b-lora
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
 
 # Show case the usage of pytorch profiler
 # Set enabled to False as it's only needed for debugging training
4 changes: 3 additions & 1 deletion recipes/configs/gemma2/2B_full.yaml
@@ -23,6 +23,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -55,6 +56,7 @@ loss:
   _component_: torchtune.modules.loss.CEWithChunkedOutputLoss
 max_steps_per_epoch: null
 gradient_accumulation_steps: 1
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -71,4 +73,4 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-finetune
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
4 changes: 3 additions & 1 deletion recipes/configs/gemma2/2B_lora.yaml
@@ -22,6 +22,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -67,6 +68,7 @@ batch_size: 4
 epochs: 3
 max_steps_per_epoch: null
 gradient_accumulation_steps: 1
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -83,4 +85,4 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-lora
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
5 changes: 3 additions & 2 deletions recipes/configs/gemma2/2B_lora_single_device.yaml
@@ -22,6 +22,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -66,7 +67,7 @@ batch_size: 8
 epochs: 3
 max_steps_per_epoch: null
 gradient_accumulation_steps: 2
-compile: False
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -84,7 +85,7 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-lora
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
 
 # Show case the usage of pytorch profiler
 # Set enabled to False as it's only needed for debugging training
5 changes: 3 additions & 2 deletions recipes/configs/gemma2/2B_qlora_single_device.yaml
@@ -22,6 +22,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -66,7 +67,7 @@ batch_size: 4
 epochs: 3
 max_steps_per_epoch: null
 gradient_accumulation_steps: 4
-compile: False
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -84,7 +85,7 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-lora
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
 
 # Show case the usage of pytorch profiler
 # Set enabled to False as it's only needed for debugging training
4 changes: 3 additions & 1 deletion recipes/configs/gemma2/9B_full.yaml
@@ -23,6 +23,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -53,6 +54,7 @@ loss:
   _component_: torchtune.modules.loss.CEWithChunkedOutputLoss
 max_steps_per_epoch: null
 gradient_accumulation_steps: 1
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -69,4 +71,4 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-9b-finetune
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
4 changes: 3 additions & 1 deletion recipes/configs/gemma2/9B_lora.yaml
@@ -23,6 +23,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -65,6 +66,7 @@ batch_size: 4
 epochs: 3
 max_steps_per_epoch: null
 gradient_accumulation_steps: 1
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -81,4 +83,4 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-9b-lora
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
5 changes: 3 additions & 2 deletions recipes/configs/gemma2/9B_lora_single_device.yaml
@@ -22,6 +22,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -64,7 +65,7 @@ batch_size: 8
 epochs: 1
 max_steps_per_epoch: null
 gradient_accumulation_steps: 2
-compile: False
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -82,7 +83,7 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-9b-lora
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
 
 # Show case the usage of pytorch profiler
 # Set enabled to False as it's only needed for debugging training
5 changes: 3 additions & 2 deletions recipes/configs/gemma2/9B_qlora_single_device.yaml
@@ -22,6 +22,7 @@ tokenizer:
 
 # Dataset
 dataset:
+  packed: False # Set to true for great speed ups
   _component_: torchtune.datasets.alpaca_dataset
 seed: null
 shuffle: True
@@ -64,7 +65,7 @@ batch_size: 4
 epochs: 3
 max_steps_per_epoch: null
 gradient_accumulation_steps: 4
-compile: False
+compile: False # pytorch compile, set to true for perf/memory improvement
 
 # Training env
 device: cuda
@@ -82,7 +83,7 @@ metric_logger:
   log_dir: ${output_dir}
 output_dir: /tmp/alpaca-gemma2-9b-lora
 log_every_n_steps: 1
-log_peak_memory_stats: False
+log_peak_memory_stats: True
 
 # Show case the usage of pytorch profiler
 # Set enabled to False as it's only needed for debugging training
2 changes: 1 addition & 1 deletion torchtune/training/checkpointing/_utils.py
@@ -45,7 +45,7 @@ class ModelType(Enum):
     Attributes:
         GEMMA (str): Gemma family of models. See :func:`~torchtune.models.gemma.gemma`
-        GEMMA2 (str): Gemma family of models. See :func:`~torchtune.models.gemma2.gemma2`
+        GEMMA2 (str): Gemma 2 family of models. See :func:`~torchtune.models.gemma2.gemma2`
         LLAMA2 (str): Llama2 family of models. See :func:`~torchtune.models.llama2.llama2`
        LLAMA3 (str): Llama3 family of models. See :func:`~torchtune.models.llama3.llama3`
         LLAMA3_2 (str): Llama3.2 family of models. See :func:`~torchtune.models.llama3_2.llama3_2`
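
For reference, these ModelType entries are what the model_type field in each config's checkpointer block resolves to. A minimal sketch, assuming torchtune's standard HF checkpointer component and a hypothetical checkpoint directory:

checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/gemma-2-27b/ # hypothetical path
  model_type: GEMMA2 # maps to ModelType.GEMMA2, whose docstring is corrected above
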
