
FP8 support #171

Open
markoarnauto opened this issue Jul 24, 2024 · 3 comments

Comments

@markoarnauto

No description provided.

btakeya commented Dec 2, 2024

Hi CoreWeave team, could you please take a look at this? I've just hit an unknown quantization type error, shown below:

ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet']

It would help a lot if fp8 were supported. Thanks in advance.

sangstar (Contributor) commented Dec 3, 2024

Hi @btakeya

Where did you get that error from? Can you send the full traceback? This looks like it might be an error from transformers.

btakeya commented Dec 3, 2024

@sangstar Thanks for your attention!
I tried with this code, using my private fp8-quantized model (I replaced L16 with its HF repo name).
The full traceback is attached as well:

Traceback (most recent call last):
  File "tensorize.py", line 40, in <module>
    model = original_model(model_ref)
  File "tensorize.py", line 20, in original_model
    return AutoModelForCausalLM.from_pretrained(ref)
  File "/home/juhwan/venv/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 564, in from_pretrained
    return model_class.from_pretrained(
  File "/home/juhwan/venv/lib/python3.8/site-packages/transformers/modeling_utils.py", line 3647, in from_pretrained
    config.quantization_config = AutoHfQuantizer.merge_quantization_configs(
  File "/home/juhwan/venv/lib/python3.8/site-packages/transformers/quantizers/auto.py", line 173, in merge_quantization_configs
    quantization_config = AutoQuantizationConfig.from_dict(quantization_config)
  File "/home/juhwan/venv/lib/python3.8/site-packages/transformers/quantizers/auto.py", line 97, in from_dict
    raise ValueError(
ValueError: Unknown quantization type, got fp8 - supported types are: ['awq', 'bitsandbytes_4bit', 'bitsandbytes_8bit', 'gptq', 'aqlm', 'quanto', 'eetq', 'hqq', 'compressed-tensors', 'fbgemm_fp8', 'torchao', 'bitnet']

(It does seem to come from transformers, as you expected.)
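For reference, a minimal sketch of the failing call, assuming a hypothetical repo id in place of the private fp8-quantized model above:

from transformers import AutoModelForCausalLM

# Hypothetical repo id; substitute the fp8-quantized checkpoint's HF repo name.
model_ref = "my-org/my-fp8-model"

# from_pretrained reads the checkpoint's quantization_config; presumably because
# this transformers version has no handler for quant_method "fp8", it raises:
# ValueError: Unknown quantization type, got fp8 - supported types are: [...]
model = AutoModelForCausalLM.from_pretrained(model_ref)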
