Commit

docs: Add findings from exploration into model tuning performance degradation (#315)

* docs: Add findings from exploration into model tuning performance degradation

Signed-off-by: Will Johnson <[email protected]>

* fix: More specifically refer to COS instead of just PVC

Signed-off-by: Will Johnson <[email protected]>

* docs: Change section name and remove numbers from README.md

Signed-off-by: Will Johnson <[email protected]>

---------

Signed-off-by: Will Johnson <[email protected]>
Signed-off-by: Anh Uong <[email protected]>
willmj authored Aug 27, 2024
1 parent 2c56c30 commit 474e539
Showing 1 changed file with 7 additions and 0 deletions.
7 changes: 7 additions & 0 deletions README.md
@@ -270,6 +270,13 @@ generation_config.json model-00005-of-00006.safetensors tokenizer.model

</details>

#### Optimizing writing checkpoints
Writing models to Cloud Object Storage (COS) is an expensive operation. Saving model checkpoints to a local directory results in much faster training than writing them to COS. Use `output_dir` and `save_model_dir` to control which type of storage your checkpoints and final model are written to.

For example, set `output_dir` to a local directory and `save_model_dir` to COS. Checkpoints are still saved, but the intermediate writes go to fast local storage and only the final model is written to COS.

To achieve the fastest training time, set `save_strategy="no"`. Saving no checkpoints except for the final model removes intermediate write operations altogether.
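
The two options above can be sketched as a small helper that builds only the storage-related part of a tuning config. This is an illustrative sketch, not the project's own tooling: the helper name `checkpoint_config` and both paths are hypothetical, and only the keys `output_dir`, `save_model_dir`, and `save_strategy` come from the text above.

```python
import json


def checkpoint_config(local_dir, cos_dir, keep_checkpoints=True):
    """Build the storage-related keys of a tuning config (hypothetical helper)."""
    config = {
        "output_dir": local_dir,    # intermediate checkpoints go to local storage
        "save_model_dir": cos_dir,  # final model goes to COS
    }
    if not keep_checkpoints:
        # Skip intermediate checkpoints entirely; only the final model is written.
        config["save_strategy"] = "no"
    return config


# Hypothetical paths: a local scratch directory and a COS-backed mount.
with_checkpoints = checkpoint_config("/tmp/checkpoints", "/cos/my-bucket/final-model")
fastest = checkpoint_config(
    "/tmp/checkpoints", "/cos/my-bucket/final-model", keep_checkpoints=False
)

print(json.dumps(fastest, indent=2))
```

The printed JSON could then be merged into whatever config file or arguments you already pass to the tuning job. Whether to keep intermediate checkpoints is a trade-off between training time and the ability to resume or inspect a run partway through.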

## Tuning Techniques:

### LoRA Tuning Example