The tokenizer cannot be loaded. #36

wkfdb · 2024-12-05T02:16:47Z

I created the environment as follows:

conda create -n fastv-hf python=3.10
conda activate fastv-hf
cd ./src/FastV/llava-hf/transformers
pip install -e .
pip install pillow torch accelerate

I then downloaded "llava-hf/llava-1.5-13b-hf" from Hugging Face and changed model_id in demo_hf.py to my local save path.

The model loaded successfully; however, loading the tokenizer raises the following error:

Traceback (most recent call last):
  File "/mnt/data/group/wangkuo/FastV-main/demo-hf.py", line 31, in <module>
    processor = AutoProcessor.from_pretrained(model_id) # always fails at this line
  File "/mnt/data/group/wangkuo/FastV-main/src/FastV/llava-hf/transformers/src/transformers/models/auto/processing_auto.py", line 312, in from_pretrained
    return processor_class.from_pretrained(
  File "/mnt/data/group/wangkuo/FastV-main/src/FastV/llava-hf/transformers/src/transformers/processing_utils.py", line 465, in from_pretrained
    args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/mnt/data/group/wangkuo/FastV-main/src/FastV/llava-hf/transformers/src/transformers/processing_utils.py", line 511, in _get_arguments_from_pretrained
    args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
  File "/mnt/data/group/wangkuo/FastV-main/src/FastV/llava-hf/transformers/src/transformers/tokenization_utils_base.py", line 2086, in from_pretrained
    return cls._from_pretrained(
  File "/mnt/data/group/wangkuo/FastV-main/src/FastV/llava-hf/transformers/src/transformers/tokenization_utils_base.py", line 2325, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/mnt/data/group/wangkuo/FastV-main/src/FastV/llava-hf/transformers/src/transformers/models/llama/tokenization_llama_fast.py", line 133, in __init__
    super().__init__(
  File "/mnt/data/group/wangkuo/FastV-main/src/FastV/llava-hf/transformers/src/transformers/tokenization_utils_fast.py", line 111, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: data did not match any variant of untagged enum ModelWrapper at line 277156 column 3

Any clues on how to solve this problem?

wkfdb · 2024-12-05T02:36:03Z

It seems that the llava-hf checkpoint needs tokenizers>=0.20 and no longer supports older versions. The transformers version in your source tree is 4.39.0.dev0, which requires tokenizers>=0.14,<0.19, so it conflicts with the llava-hf requirement above.
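
To confirm the mismatch in a given environment, a quick check like the one below may help. It is only a diagnostic sketch: the range 0.14 to 0.19 is the one quoted above for transformers 4.39.0.dev0, the helper names (parse_release, in_range, installed) are made up for illustration, and the version parsing is deliberately simplified (it compares numeric release parts only and ignores dev/rc suffixes).

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(v: str) -> tuple:
    """Turn '0.15.2' or '4.39.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric piece such as 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def in_range(v: str, lower: str, upper: str) -> bool:
    """True if lower <= v < upper, comparing numeric release parts only."""
    return parse_release(lower) <= parse_release(v) < parse_release(upper)


def installed(pkg: str):
    """Return the installed version string of pkg, or None if it is missing."""
    try:
        return version(pkg)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    tok = installed("tokenizers")
    print("transformers:", installed("transformers"))
    print("tokenizers:", tok)
    if tok is not None:
        # Range taken from the comment above (tokenizers>=0.14,<0.19).
        print("within >=0.14,<0.19:", in_range(tok, "0.14", "0.19"))
        # Assumed threshold from the comment above (tokenizers>=0.20).
        print("at least 0.20:", parse_release(tok) >= parse_release("0.20"))
```

If the first check prints True and the second prints False, the environment matches the situation described here: the installed tokenizers satisfies the bundled transformers but is older than what the downloaded checkpoint appears to expect.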
