@HaozheZhao
Hi HaozheZhao, thanks for your great work on MLLM. I spent a lot of time trying to run your MIC code, but I found some errors in it.
First, as described below:
I can't even find the implementation of the function save_pred_label in dataset.py.
Besides, there are two additional problems.
When training with the jsonl data format downloaded from MIC_full and setting done_preprocess==False, your code doesn't work. It reports that IterableDataset has no len method.
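To make the first problem concrete, here is a minimal sketch in plain Python rather than the actual MIC code. `StreamingRecords`, `has_len`, and `steps_per_epoch` are hypothetical names I made up for illustration. I am assuming the failure comes from something calling `len()` on a streamed dataset, and that passing the record count in explicitly would be one way around it.

```python
class StreamingRecords:
    """Hypothetical stand-in for a streamed jsonl dataset: iterable, but no __len__."""

    def __init__(self, records):
        self._records = records

    def __iter__(self):
        return iter(self._records)


def has_len(obj):
    """Return True if len(obj) works, False if the object is only iterable."""
    try:
        len(obj)
        return True
    except TypeError:
        return False


def steps_per_epoch(dataset, batch_size, num_records=None):
    """Compute the number of batches per epoch.

    For sized datasets the length is read directly. For streamed datasets
    the caller has to supply num_records (for example, from counting the
    lines of the jsonl file once beforehand).
    """
    if has_len(dataset):
        num_records = len(dataset)
    if num_records is None:
        raise ValueError("streamed dataset has no length; pass num_records")
    return -(-num_records // batch_size)  # ceiling division
```

With something like this, a sized dataset keeps working as before, and a streamed one only needs the record count once.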
When training with the arrow data format that I generated with data_preprocess.py, using Flan-T5 as the language model, I got a dimension mismatch error. T5's tokenizer.model_max_length is 512, but one sample with few-shot examples is much longer than 512 tokens. Your code truncates input_ids, so any image_placeholder (T5's image_placeholder is 图) that falls after the first 512 tokens is truncated as well. But the number of images in the pixel values is not truncated.
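Here is a small sketch of the mismatch and one possible fix, again in plain Python and not taken from the MIC code. `truncate_with_images` is a hypothetical helper. I am assuming there is exactly one placeholder token per image and that the images are stored in the same order as their placeholders appear in input_ids. If the real layout is different, the counting would need to change.

```python
def truncate_with_images(input_ids, images, placeholder_id, max_length):
    """Truncate token ids and drop images whose placeholder was cut off.

    Assumes one placeholder token per image, in matching order
    (an assumption, not confirmed from the MIC code).
    """
    kept_ids = input_ids[:max_length]
    # Count how many image placeholders survived the truncation.
    num_kept_images = kept_ids.count(placeholder_id)
    # Keep only the images that still have a placeholder in the text.
    kept_images = images[:num_kept_images]
    return kept_ids, kept_images


# Toy example: 99 plays the role of the image placeholder token id.
input_ids = [1, 99, 2, 3, 99, 4, 99, 5]
images = ["img0", "img1", "img2"]
kept_ids, kept_images = truncate_with_images(input_ids, images, 99, max_length=5)
```

In the toy example the third placeholder sits past the cut, so only two images should remain after truncation. Without the second step, three images would be passed along with only two placeholders, which is the mismatch I ran into.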
So I think you may not have uploaded the latest code. Am I right, or am I misunderstanding something?
Any suggestions would help me.