
sp_token=32110 #28

Open

xie-qiang opened this issue Jan 23, 2024 · 6 comments
@xie-qiang

Hello, I found that SP_TOKEN should be set to 32110 in the demo; otherwise the image token cannot be replaced, which leads to poor results. Thank you!

```python
outputs = model.generate(
    pixel_values=inputs['pixel_values'],
    input_ids=inputs['input_ids'],
    attention_mask=inputs['attention_mask'],
    img_mask=inputs['img_mask'],
    do_sample=False,
    max_length=50,
    min_length=1,
    set_min_padding_size=False,
    sp_token=32110,
)
```
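To see why the `sp_token` value matters, here is a minimal pure-Python sketch (made-up token ids, not the repo's actual code): the generate path locates image slots by comparing `input_ids` against `sp_token`, so a wrong id yields zero positions to fill.

```python
# Sketch with hypothetical token ids: the model finds image placeholder
# slots by matching input_ids against sp_token; a wrong id finds none.
def placeholder_positions(input_ids, sp_token):
    """Indices where image embeddings would be inserted."""
    return [i for i, tok in enumerate(input_ids) if tok == sp_token]

ids = [1, 50, 32110, 32110, 7, 2]         # prompt with two image placeholders
print(placeholder_positions(ids, 32110))  # [2, 3] -> two slots to fill
print(placeholder_positions(ids, 32100))  # []     -> nothing gets replaced
```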
@Jianzhao-Huang

Hello, I tried the method you mentioned, but encountered an error. Do you have any suggestions? Thank you very much!

Here is the complete error information.

```
shape mismatch leads to truncate. insert embedding tensor of shape torch.Size([96, 4096]) cannot be broadcast to replace placeholder of shape torch.Size([0, 4096])
```

```
{
	"name": "RuntimeError",
	"message": "torch.cat(): expected a non-empty list of Tensors",
	"stack": "---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[10], line 17
     14 inputs['pixel_values'] = inputs['pixel_values'].unsqueeze(0)
     16 inputs = inputs.to('cuda:0')
---> 17 outputs = model.generate(
     18         pixel_values = inputs['pixel_values'],
     19         input_ids = inputs['input_ids'],
     20         attention_mask = inputs['attention_mask'],
     21         img_mask = inputs['img_mask'],
     22         do_sample=False,
     23         max_length=50,
     24         min_length=1,
     25         set_min_padding_size =False,
     26         sp_token = 32110
     27 )
     28 generated_text = processor.batch_decode(outputs, skip_special_tokens=True)[0].strip()
     29 print(generated_text)

File ~/anaconda3/envs/mmicl/lib/python3.8/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/hjz/harmful_meme_detection/mmicl/model/instructblip/modeling_instructblip.py:2129, in InstructBlipForConditionalGeneration.generate(self, pixel_values, qformer_input_ids, qformer_attention_mask, input_ids, attention_mask, img_mask, set_min_padding_size, sp_token, **generate_kwargs)
   2126         index+= i_count*img_token_szie
   2127     img_idx +=1
-> 2129 insert_embeds = torch.concat(insert_embeds_list, dim=0)
   2130 try:
   2131     inputs_embeds[image_embeds_index] = insert_embeds

RuntimeError: torch.cat(): expected a non-empty list of Tensors"
}
```
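The `torch.cat()` failure above occurs because `insert_embeds_list` is empty when no token in `input_ids` matches `sp_token`. A small defensive check (a sketch with a hypothetical helper name, not MMICL's actual code) would surface the real cause before the opaque `torch.cat` error:

```python
# Hypothetical fail-fast guard: report a missing/mismatched sp_token
# directly instead of letting torch.cat() choke on an empty list.
def count_placeholders(input_ids, sp_token):
    """Return the number of image-placeholder tokens, or raise if zero."""
    n = sum(1 for tok in input_ids if tok == sp_token)
    if n == 0:
        raise ValueError(
            f"sp_token={sp_token} does not appear in input_ids; "
            "image embeddings have nowhere to be inserted")
    return n

print(count_placeholders([1, 32110, 32110, 2], 32110))  # 2
```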

@mhd0528

mhd0528 commented Oct 5, 2024

Hi, did you solve the empty tensor issue? Thanks in advance!

@Sunzz1996

I met the same issue. Did you solve the empty tensor issue? Thanks in advance!
```
shape mismatch leads to truncate. insert embedding tensor of shape torch.Size([160, 4096]) cannot be broadcast to replace placeholder of shape torch.Size([0, 4096])
```

@mhd0528

mhd0528 commented Jan 2, 2025

Hi, the problem for me was that Hugging Face updated their InstructBLIP model to support both images and videos; as part of that change they added two new special tokens, one for image and one for video.
With this change, the `sp_token` for the customized model should be 32102 instead of the earlier 32100. I added the following code to get the correct special token id from the processor:

```python
sp_token_id = processor.tokenizer.convert_tokens_to_ids(image_placeholder)
processor.tokenizer.img_place_token_id = sp_token_id
print(f"Special tokens id for '{image_placeholder}': {sp_token_id}. Add to processor.")
```

Hope this helps!
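To illustrate the id shift described here, a toy sketch (a made-up vocabulary, not the real InstructBLIP tokenizer): inserting `<image>` and `<video>` ahead of the custom placeholder pushes its id from 32100 to 32102, which is exactly why a hard-coded `sp_token` goes stale.

```python
# Toy model of how newly added special tokens receive consecutive ids;
# the real tokenizer's vocabulary is much larger, but the shift is the same.
def assign_ids(extra_tokens, base=32100):
    """Assign consecutive ids to added special tokens, starting at base."""
    return {tok: base + i for i, tok in enumerate(extra_tokens)}

before = assign_ids(["图"])                        # 图 -> 32100
after = assign_ids(["<image>", "<video>", "图"])   # 图 -> 32102
print(before["图"], after["图"])
```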

@Sunzz1996

Sunzz1996 commented Jan 3, 2025

Thank you so much for your warm help. I added your code to the example.ipynb file, but it still fails with the same error.

```python
# For T5 based model
from model.instructblip import InstructBlipConfig, InstructBlipModel, InstructBlipPreTrainedModel, InstructBlipForConditionalGeneration, InstructBlipProcessor
import datasets
import json
import transformers
from PIL import Image
import torch

model_type = "instructblip"
model_ckpt = "/data/llm-models/MMICL-Instructblip-T5-xxl"
processor_ckpt = "/data/llm-models/instructblip-flan-t5-xxl"
config = InstructBlipConfig.from_pretrained(model_ckpt)

if 'instructblip' in model_type:
    model = InstructBlipForConditionalGeneration.from_pretrained(
        model_ckpt,
        config=config).to('cuda:6', dtype=torch.bfloat16)

image_palceholder = "图"
sp = [image_palceholder] + [f"<image{i}>" for i in range(20)]
processor = InstructBlipProcessor.from_pretrained(
    processor_ckpt
)

## modify the sp_token_id
sp_token_id = processor.tokenizer.convert_tokens_to_ids(image_palceholder)
processor.tokenizer.img_place_token_id = sp_token_id
print(f"Special tokens id for '{image_palceholder}': {sp_token_id}. Add to processor.")

sp = sp + processor.tokenizer.additional_special_tokens[len(sp):]
processor.tokenizer.add_special_tokens({'additional_special_tokens': sp})

if model.qformer.embeddings.word_embeddings.weight.shape[0] != len(processor.qformer_tokenizer):
    model.qformer.resize_token_embeddings(len(processor.qformer_tokenizer))
replace_token = "".join(32 * [image_palceholder])
```

@mhd0528

mhd0528 commented Jan 3, 2025

I remember another possible cause is the size of the image, but if you're using the example notebook, that shouldn't be the issue here. I can't run the model right now, but I'll post an update if it stops working for me again or I find another fix.
