Here is the code I see in modeling_utils.py (transformers 4.39.3):

```python
if load_in_4bit or load_in_8bit:
    if quantization_config is not None:
        raise ValueError(
            ...
        )
    config_dict = {k: v for k, v in kwargs.items() if k in inspect.signature(BitsAndBytesConfig).parameters}
    config_dict = {**config_dict, "load_in_4bit": load_in_4bit, "load_in_8bit": load_in_8bit}
    quantization_config, kwargs = BitsAndBytesConfig.from_dict(
        config_dict=config_dict, return_unused_kwargs=True, **kwargs
    )
```
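As far as I can tell, this path just wraps the legacy load_in_4bit/load_in_8bit flags into a BitsAndBytesConfig, so the two calls below should be equivalent (a minimal sketch; the model id is a placeholder):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Legacy flag: from_pretrained builds the BitsAndBytesConfig internally,
# via the snippet quoted above.
model_a = AutoModelForCausalLM.from_pretrained("some/model-id", load_in_4bit=True)

# Explicit config: the equivalent (and newer) spelling.
model_b = AutoModelForCausalLM.from_pretrained(
    "some/model-id",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
```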
And this is the error I run into in the forward pass (more context below):

```
FP4 quantization state not initialized. Please call .cuda() or .to(device) on the LinearFP4 layer first.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  ...
  in forward
    x = self.linear(x)
  File "myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "myenv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "myenv/lib/python3.11/site-packages/bitsandbytes/nn/modules.py", line 256, in forward
    out = bnb.matmul_4bit(x, self.weight.t(), bias=bias, quant_state=self.weight.quant_state)
  File "myenv/lib/python3.11/site-packages/bitsandbytes/autograd/_functions.py", line 566, in matmul_4bit
    assert quant_state is not None
AssertionError
```
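If I'm reading the warning on the first line correctly, a bnb Linear4bit/LinearFP4 layer only quantizes its weight, and only then fills in weight.quant_state, when the parameters are moved to a CUDA device; on CPU, quant_state stays None and matmul_4bit trips the assertion above. A minimal sketch of that behavior (my assumption, written against recent bitsandbytes APIs):

```python
import torch
import bitsandbytes as bnb

layer = bnb.nn.Linear4bit(
    1024, 1024, bias=False,
    compute_dtype=torch.float16, quant_type="fp4",
)
print(layer.weight.quant_state)  # None: the weight is still unquantized on CPU

layer = layer.to("cuda")         # Params4bit quantizes the weight during the move
print(layer.weight.quant_state)  # now populated

x = torch.randn(1, 1024, dtype=torch.float16, device="cuda")
out = layer(x)                   # forward succeeds once quant_state is set
```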
I've looked at the transformers source; the version in my environment is 4.39.3. In modeling_utils.py I can't find any call to replace_with_bnb_linear — what I see is the snippet at the top of this issue, not the replace_with_bnb_linear usage you describe in the article. If it's convenient, could you clarify whether this is a version difference or whether I've misread something?

Second, I want to use replace_with_bnb_linear to swap some linear layers in an LLM for a structure of my own design, and to quantize those parts in 4/8-bit so that they stay consistent with load_in_4bit/load_in_8bit — ideally, the replaced modules should follow the same quantization strategy as the modules they replace. But after calling replace_with_bnb_linear directly, the forward pass fails with the traceback shown above. Since I couldn't find how transformers itself calls replace_with_bnb_linear, I don't know whether quant_state has to be set manually, or how it should be provided. Thanks in advance for taking a look.
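(One thing I noticed while digging: after the HfQuantizer refactor, the replace_with_bnb_linear call seems to have moved out of modeling_utils.py into the quantizer classes — transformers/quantizers/quantizer_bnb_4bit.py in 4.39 — which may be why I couldn't find it there. Happy to be corrected.)

For reference, here is the per-layer workflow I'm aiming at, as a sketch rather than my exact code. My assumption, based on the warning above, is that quant_state never needs to be set by hand: it is populated when the layer is moved to CUDA.

```python
import torch
import torch.nn as nn
import bitsandbytes as bnb

def quantize_linear_fp4(linear: nn.Linear) -> bnb.nn.Linear4bit:
    """Swap one nn.Linear for a 4-bit bnb layer, matching load_in_4bit's fp4 default."""
    qlinear = bnb.nn.Linear4bit(
        linear.in_features, linear.out_features,
        bias=linear.bias is not None,
        compute_dtype=torch.float16, quant_type="fp4",
    )
    # Hand the original fp16 weight to a Params4bit; nothing is quantized yet.
    qlinear.weight = bnb.nn.Params4bit(
        linear.weight.data.to(torch.float16), requires_grad=False, quant_type="fp4",
    )
    if linear.bias is not None:
        qlinear.bias = nn.Parameter(linear.bias.data.to(torch.float16), requires_grad=False)
    # Moving to CUDA is what quantizes the weight and populates quant_state.
    return qlinear.to("cuda")

# e.g. model.layers[0].mlp.down_proj = quantize_linear_fp4(model.layers[0].mlp.down_proj)
```

If I wanted the 8-bit path instead, my understanding is that the analogue is bnb.nn.Linear8bitLt with has_fp16_weights=False, which likewise quantizes its weights when moved to the GPU.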