doc: Add qLoRA README #322

Merged
merged 12 commits into main from qLora_README on Sep 13, 2024
Conversation

@aluu317 aluu317 (Collaborator) commented Aug 30, 2024

Description of the change

We now support the QLoRA tuning technique, so this PR adds documentation for it.

Related issue number

How to verify the PR

Was the PR tested

  • I have added >=1 unit test(s) for every new method I have added.
  • I have ensured all unit tests pass

@anhuong anhuong (Collaborator) left a comment:

Thanks for getting the docs started, Angel! It would be good to clarify which flags we can pass and what is currently supported.

README.md Outdated
@@ -9,6 +9,7 @@
- [Tips on Parameters to Set](#tips-on-parameters-to-set)
- [Tuning Techniques](#tuning-techniques)
- [LoRA Tuning Example](#lora-tuning-example)
- [qLoRA Tuning Example](#qlora-tuning-example)
Collaborator:

nit: it's technically QLoRA

Suggested change
- [qLoRA Tuning Example](#qlora-tuning-example)
- [QLoRA Tuning Example](#qlora-tuning-example)

README.md Outdated
@@ -432,6 +433,79 @@ Example 3:

_________________________


### qLoRA Tuning Example
Collaborator:

same nit:

Suggested change
### qLoRA Tuning Example
### QLoRA Tuning Example

Collaborator:

I actually think we want to be more specific here. If these docs are just for auto_gptq and not bnb_qlora, then we should note that this is 4-bit GPTQ-LoRA with AutoGPTQ, whereas bnb_qlora is 4-bit QLoRA with bitsandbytes; both are commonly referred to as QLoRA.
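For context, the contrast looks roughly like this with the plain Hugging Face transformers/peft APIs rather than this repo's tuning script. This is only a sketch; the model name, calibration dataset, and LoRA hyperparameters below are placeholders for illustration, not values from this project.

```py
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig, GPTQConfig

BASE_MODEL = "facebook/opt-125m"  # placeholder checkpoint, for illustration only

# (1) 4-bit QLoRA with bitsandbytes: an ordinary fp16/bf16 checkpoint is
#     quantized on the fly when it is loaded.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# (2) 4-bit GPTQ-LoRA with AutoGPTQ: the base model is a GPTQ-quantized
#     checkpoint (or is quantized once up front with a calibration dataset).
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=BASE_MODEL)

# In both cases the trainable part is the same: LoRA adapters attached on top
# of the frozen, quantized base weights.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)

# Either quantization config would then be passed as `quantization_config` to
# AutoModelForCausalLM.from_pretrained(BASE_MODEL, ...) before attaching LoRA.
```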

README.md Outdated

### qLoRA Tuning Example

This method is similar to LoRA Tuning, but the base model is a quantized model.
Collaborator:

It would be useful to add a note pointing to the fms-acceleration section and to note that this is how QLoRA is enabled: https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/README.md#fms-acceleration

README.md Outdated
### qLoRA Tuning Example

This method is similar to LoRA Tuning, but the base model is a quantized model.
Set `peft_method` to `"lora"`. You can pass any of the LoraConfig parameters; see the section on the [LoRA Tuning Example](#lora-tuning-example).
Collaborator:

It would be good to note the additional QLoRA flags first and then note that the LoRA params are the same. It would also be useful to have more detail on what the LoRA quantization config flags mean and which kernels are supported; right now only the triton kernel is supported.
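One possible way for the docs to make that split explicit is sketched below. The dataclass and field names mirror the snippet under review and are assumptions for illustration only, not the project's public API; only the triton_v2 kernel is noted as supported.

```py
from dataclasses import dataclass, field
from typing import List

@dataclass
class LoraArgs:
    # Same flags as the existing LoRA tuning example; nothing changes here.
    r: int = 8
    lora_alpha: int = 16
    lora_dropout: float = 0.05
    target_modules: List[str] = field(default_factory=lambda: ["q_proj", "v_proj"])

@dataclass
class AutoGPTQArgs:
    # QLoRA-specific flags layered on top of the LoRA flags.
    kernel: str = "triton_v2"    # only the triton_v2 kernel is supported today
    from_quantized: bool = True  # the base checkpoint must already be GPTQ-quantized

qlora_args = (LoraArgs(), AutoGPTQArgs())
print(qlora_args)
```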

README.md Outdated
Comment on lines 450 to 460
```py
class AutoGPTQLoraConfig:

    # auto_gptq supports various kernels; this selects the kernel to use.
    kernel: str = "triton_v2"

    # allow auto_gptq to quantize a model before training commences.
    # NOTE: currently this is not allowed.
    from_quantized: bool = True
```
Collaborator:

I would not include this, as it causes more confusion. Right now we only support the triton kernel, and from_quantized carries the note that it is currently not allowed, so we should not expose it to users.


As with LoRA, `target_modules` are the names of the modules to apply the adapter to. See the LoRA [section](#lora-tuning-example) on `target_modules` for more info.

Collaborator Author:

This link doesn't take me to the target modules section. It takes me to the FMS Acceleration section.


Collaborator:

But I see you have the lora-tuning-example link and note the target modules section, so that works.
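On the `target_modules` line quoted above, one quick way to check which module names a given base model exposes is to list its linear layers. This is only an illustrative sketch; the checkpoint name is a placeholder, and the module names are typically the same for the quantized and unquantized versions of a model.

```py
from torch import nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")  # placeholder checkpoint

# Collect the leaf names of all linear layers; these are valid `target_modules` values.
linear_names = sorted({
    name.split(".")[-1]
    for name, module in model.named_modules()
    if isinstance(module, nn.Linear)
})
print(linear_names)  # for OPT this includes 'q_proj', 'k_proj', 'v_proj', 'out_proj', ...
```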

@anhuong anhuong (Collaborator) left a comment:

LGTM, thanks Angel! Although I see the DCO check failed because your latest commit isn't signed. I think this can also be brought out of the review phase.

@aluu317 aluu317 marked this pull request as ready for review September 13, 2024 16:57
aluu317 and others added 11 commits September 13, 2024 11:02
Signed-off-by: Angel Luu <[email protected]>
…o breaking change in HF SFTTrainer (foundation-model-stack#326)

* fix: need to pass skip_prepare_dataset for pretokenized dataset due to breaking change in HF SFTTrainer

Signed-off-by: Harikrishnan Balagopal <[email protected]>

* fix: wrong dataset paths, was using non-tokenized data in pre-tokenized dataset tests

Signed-off-by: Harikrishnan Balagopal <[email protected]>

---------

Signed-off-by: Harikrishnan Balagopal <[email protected]>
Signed-off-by: Angel Luu <[email protected]>
…k#284)

* add fms-acceleration deps and pytorch layer with cuda

Signed-off-by: Anh-Uong <[email protected]>

* add build args needed

Signed-off-by: Anh-Uong <[email protected]>

* allow transformers v4.40 for fms-acceleration

Signed-off-by: Anh-Uong <[email protected]>

* set wider transformers version

Signed-off-by: Anh-Uong <[email protected]>

* remove nvidia stage

Signed-off-by: Anh-Uong <[email protected]>

* add gcc and dev tools

Signed-off-by: Anh-Uong <[email protected]>

* install c compiler and python deps

Signed-off-by: Anh-Uong <[email protected]>

* remove transformers lower bound and dev deps

Signed-off-by: Anh-Uong <[email protected]>

* install python-devel by version

Signed-off-by: Anh Uong <[email protected]>

* update python installations

Signed-off-by: Anh Uong <[email protected]>

---------

Signed-off-by: Anh-Uong <[email protected]>
Signed-off-by: Anh Uong <[email protected]>
Signed-off-by: Angel Luu <[email protected]>
Signed-off-by: Angel Luu <[email protected]>
…odel-stack#309)

* fix: Migrate transformer logging to python logging

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Migrate transformer logging to python logging

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Removed unwanted file

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Log levels obtained from reversing the dictionary

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Format issues

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Variable names made meaningful

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Removed unwanted log line

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Added name to getLogger

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Added default logging level to DEBUG

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Added default logging level to DEBUG

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Added default logging level to DEBUG

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Removed setLevel() calls from the packages

Signed-off-by: Padmanabha V Seshadri <[email protected]>

* fix: Format issues resolved

Signed-off-by: Padmanabha V Seshadri <[email protected]>

---------

Signed-off-by: Padmanabha V Seshadri <[email protected]>
Signed-off-by: Angel Luu <[email protected]>
Signed-off-by: Mehant Kammakomati <[email protected]>
Signed-off-by: Harikrishnan Balagopal <[email protected]>
Signed-off-by: Anh Uong <[email protected]>
Co-authored-by: Mehant Kammakomati <[email protected]>
Signed-off-by: Angel Luu <[email protected]>
- FSDP bug in accelerate v0.34

Signed-off-by: Anh Uong <[email protected]>
Signed-off-by: Angel Luu <[email protected]>
Signed-off-by: Angel Luu <[email protected]>
Signed-off-by: Anh Uong <[email protected]>
Signed-off-by: Angel Luu <[email protected]>
* fix: Removal of lm head hack

Signed-off-by: Abhishek <[email protected]>

* set fms_accelerate to true by default

Signed-off-by: Anh Uong <[email protected]>

---------

Signed-off-by: Abhishek <[email protected]>
Signed-off-by: Anh Uong <[email protected]>
Co-authored-by: Anh Uong <[email protected]>
Signed-off-by: Angel Luu <[email protected]>
@aluu317 aluu317 merged commit 1fe73b4 into foundation-model-stack:main Sep 13, 2024
7 checks passed
@aluu317 aluu317 deleted the qLora_README branch September 13, 2024 17:19
Labels: none yet
Projects: none yet
6 participants