QLoRA Inference #1020

Open
jeff52415 opened this issue May 25, 2024 · 1 comment
@jeff52415
Can I load QLoRA fine-tuning weights into a Hugging Face model as shown below?

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# 4-bit NF4 quantization with double quantization, matching the QLoRA setup
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type='nf4'
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    pretrained_model_name_or_path=model_id,
    quantization_config=quantization_config,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    device_map='auto'
)

# Attach the QLoRA adapter weights saved during fine-tuning
model = PeftModel.from_pretrained(model, "qlora_finetune_folder/")

I have changed the checkpointer to FullModelHFCheckpointer.
The resulting checkpoint is loadable and runnable (see the generation sketch below), but I am curious whether it reflects the same structure as qlora_llama3_8b. Thanks.
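For reference, a minimal generation sketch for the setup above (illustrative only; it skips the Llama 3 chat template and reuses the model and tokenizer from the snippet):

import torch

inputs = tokenizer("What is QLoRA?", return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))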

@ebsmothers
Contributor

Hi @jeff52415, thanks for opening this issue; this is a really good question. One possible source of discrepancy is the different implementations of NF4 quantization used by torchtune and Hugging Face: torchtune's QLoRA relies on the NF4Tensor class from torchao, whereas Hugging Face uses the bitsandbytes implementation. I need to verify that quantizing a torchtune checkpoint with bitsandbytes yields the same result as quantizing with torchao. Let me look into it and get back to you. Also cc @rohan-varma, who may have some insights here.
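
For anyone who wants to probe this locally, here is a rough, unverified sketch of the comparison: round-trip the same weight through torchao's NF4Tensor and through bitsandbytes, then diff the results. It assumes to_nf4 / get_original_weight in torchao and quantize_nf4 / dequantize_nf4 in bitsandbytes behave as described, that CUDA is available (bitsandbytes' 4-bit kernels run on GPU), and that the block sizes shown are comparable; treat it as a starting point, not a definitive check.

import torch
import bitsandbytes.functional as bnbf
from torchao.dtypes.nf4tensor import to_nf4

# A single weight tensor standing in for one linear layer of the model
weight = torch.randn(4096, 4096, dtype=torch.bfloat16, device="cuda")

# torchao / torchtune path: quantize to NF4, then dequantize back
ao_nf4 = to_nf4(weight, block_size=64, scaler_block_size=256)
ao_dequant = ao_nf4.get_original_weight()

# bitsandbytes / Hugging Face path: quantize to NF4, then dequantize back
bnb_quant, bnb_state = bnbf.quantize_nf4(weight, blocksize=64)
bnb_dequant = bnbf.dequantize_nf4(bnb_quant, quant_state=bnb_state)

# If the two implementations agree, the round-tripped weights should be (near) identical
print((ao_dequant.float() - bnb_dequant.float()).abs().max())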

@joecummings added the triage review label (This issue should be discussed in weekly review), then removed the triage review and high-priority labels on Dec 13, 2024.