Describe the bug
The option "lora_post_process_for_vllm" does not seem to have any effect. It is described in https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/build/README.md#configuration as "If tuning for inference on vLLM, set lora_post_process_for_vllm to true. Post process LoRA adapters to allow inferencing on vLLM. vLLM needs new token embedding weights added during tuning to be moved to a new file new_embeddings.safetensors."
When fine-tuning mistralai/Mixtral-8x7B-v0.1, setting "lora_post_process_for_vllm": true does not result in the creation of the new file new_embeddings.safetensors. Later, when the fine-tuned model is served by vLLM, the following error occurs:
raise ValueError(f"{name} is unsupported LoRA weight")
ValueError: base_model.model.lm_head.weight is unsupported LoRA weight
Platform
Interpreter version: Python 3.12.7
Library version: main branch as of December 9, 4:27 pm ET
Sample Code
where the content of config.json is as follows:
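For illustration only: a minimal LoRA tuning config of roughly this shape would exercise the option. Every field except lora_post_process_for_vllm is an assumption based on the project README, and the model and paths are placeholders rather than the actual values used in this run.

```json
{
  "model_name_or_path": "mistralai/Mixtral-8x7B-v0.1",
  "training_data_path": "/data/train.jsonl",
  "output_dir": "/output/mixtral-lora",
  "num_train_epochs": 1,
  "per_device_train_batch_size": 1,
  "learning_rate": 1e-4,
  "peft_method": "lora",
  "r": 8,
  "lora_alpha": 16,
  "target_modules": ["q_proj", "v_proj"],
  "lora_post_process_for_vllm": true
}
```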
Expected behavior
The expected behavior is described in https://github.com/foundation-model-stack/fms-hf-tuning/blob/main/build/README.md#configuration: "If tuning for inference on vLLM, set lora_post_process_for_vllm to true. Post process LoRA adapters to allow inferencing on vLLM. vLLM needs new token embedding weights added during tuning to be moved to a new file new_embeddings.safetensors."
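To make the expected post-processing concrete, the sketch below shows the general idea of moving the new token embedding weights out of the saved adapter into a separate new_embeddings.safetensors file. This is only an illustration under assumptions: the weight-name filter, file names, and directory layout are guesses based on the quoted README description and the vLLM error above, not the project's actual implementation.

```python
# Illustration only (not fms-hf-tuning's actual code): split embedding/lm_head
# tensors saved alongside a LoRA adapter into new_embeddings.safetensors.
import os
from safetensors.torch import load_file, save_file

def split_new_token_embeddings(adapter_dir: str) -> None:
    adapter_path = os.path.join(adapter_dir, "adapter_model.safetensors")
    weights = load_file(adapter_path)

    # Assumption: the tensors vLLM rejects are the full embedding / lm_head
    # weights that get saved with the adapter when new tokens are added.
    embeddings = {k: v for k, v in weights.items()
                  if "lm_head" in k or "embed_tokens" in k}
    remaining = {k: v for k, v in weights.items() if k not in embeddings}

    if embeddings:
        save_file(embeddings, os.path.join(adapter_dir, "new_embeddings.safetensors"))
        save_file(remaining, adapter_path)

split_new_token_embeddings("/output/mixtral-lora")
```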
Observed behavior
When fine-tuning mistralai/Mixtral-8x7B-v0.1, setting "lora_post_process_for_vllm": true does not result in the creation of the new file new_embeddings.safetensors. Later, when the fine-tuned model is served by vLLM, the following error occurs:
raise ValueError(f"{name} is unsupported LoRA weight")
ValueError: base_model.model.lm_head.weight is unsupported LoRA weight
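The error is raised while vLLM loads the adapter weights for serving. As a rough way to reproduce that load step, one can load the adapter through vLLM's offline API; the adapter path, request name/id, and single-process setup below are placeholders and may not match the serving configuration actually used here.

```python
# Illustration only: load the tuned adapter with vLLM's offline API; the paths
# and ids are placeholders, and Mixtral-8x7B realistically needs multiple GPUs.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

llm = LLM(model="mistralai/Mixtral-8x7B-v0.1", enable_lora=True)
outputs = llm.generate(
    ["Hello"],
    SamplingParams(max_tokens=16),
    lora_request=LoRARequest("mixtral-lora", 1, "/output/mixtral-lora"),
)
```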
Additional context
Add any other context about the problem here.