
IncompatibleKeys error when loading a pre-trained model for fine-tuning #14

Open
websitefingerprinting opened this issue Apr 23, 2024 · 4 comments


@websitefingerprinting

Nice work! I have a question here:

I am trying to pre-train and finetune a model on my own datasets. However, some warnings were raised when loading the pre-trained model during finetuning:

```
loading pretrained model: checkpoints/ALL_task_UniTS_pretrain_x64_bs1024_UniTS_All_dm64_el3_Exp_0/pretrain_checkpoint.pth
_IncompatibleKeys(missing_keys=['category_tokens.CLS_dataset1', 'category_tokens.CLS_dataset2'], unexpected_keys=['pretrain_head.proj_in.weight', 'pretrain_head.proj_in.bias', 'pretrain_head.mlp.fc1.weight', 'pretrain_head.mlp.fc1.bias', 'pretrain_head.mlp.fc2.weight', 'pretrain_head.mlp.fc2.bias', 'pretrain_head.proj_out.weight', 'pretrain_head.proj_out.bias', 'pretrain_head.pos_proj.weights', 'pretrain_head.pos_proj.bias'])
```

Is everything correct here?

Thank you for your help!

@gasvn
Member

gasvn commented Apr 23, 2024

It's fine; there is an extra head used during pretraining, which is not needed for finetuning.
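To double-check on your side, here is a minimal toy sketch (illustrative module names mirroring your log, not the repo's actual code) of why `strict=False` loading reports exactly these two kinds of keys and nothing else:

```python
import torch
import torch.nn as nn

class Backbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(64, 64)  # shared weights, present in both models

class PretrainModel(Backbone):
    def __init__(self):
        super().__init__()
        self.pretrain_head = nn.Linear(64, 64)  # exists only during pretraining

class FinetuneModel(Backbone):
    def __init__(self):
        super().__init__()
        # new per-dataset CLS tokens, created fresh for finetuning
        self.category_tokens = nn.ParameterDict(
            {"CLS_dataset1": nn.Parameter(torch.zeros(1, 64))}
        )

ckpt = PretrainModel().state_dict()
result = FinetuneModel().load_state_dict(ckpt, strict=False)

# Only the pretraining head should be "unexpected" and only the new CLS
# tokens "missing"; any other key in either list would signal a real problem.
assert all(k.startswith("pretrain_head.") for k in result.unexpected_keys)
assert all(k.startswith("category_tokens.") for k in result.missing_keys)
print(result)
```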

@websitefingerprinting
Author

Thank you for your prompt response. One more question, in case you have any idea about it:

Few-shot finetuning didn't yield good results for my classification task. I pre-trained and finetuned using my own datasets. With only 5% of the data, accuracy was very low, but it improved with 100% of the data. However, pretraining didn't offer any advantage compared to fully supervised training.

I attach my pretraining loss curve below.

[figure: pretraining loss curve]

Does the pretrain loss look correct?

Many thanks! (It's fine if you're not sure; I may have made a mistake somewhere in this process, or my problem may simply not be a good fit.)

@gasvn
Member

gasvn commented Apr 23, 2024

I am not so sure. The loss looks pretty large. Have you tried only doing prompt tuning with the pretrained model? If that doesn't reach reasonable performance, it means the pretraining is not working well.
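A minimal sketch of what prompt tuning could look like (the substring filters are illustrative; check `model.named_parameters()` for the real parameter names in your model, and note the repo's own prompt-tuning mode may differ):

```python
import torch.nn as nn

def freeze_for_prompt_tuning(model: nn.Module) -> None:
    # Freeze the pretrained backbone; leave only prompt/CLS-token
    # parameters trainable. Adjust the filters to your model's names.
    tunable = ("prompt", "category_tokens", "cls")
    for name, param in model.named_parameters():
        param.requires_grad = any(key in name.lower() for key in tunable)

# Then build the optimizer over the trainable parameters only, e.g.:
# optimizer = torch.optim.Adam(
#     (p for p in model.parameters() if p.requires_grad), lr=1e-3)
```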

@websitefingerprinting
Author

Thank you again for your suggestion.

My data is very different from the datasets in your paper. My input is a sequence of unnormalized integers (i.e., $x \in \mathbb{N}^{\text{Length} \times \text{Dim}}$). I guess that may be why the reconstruction loss is large. So, should I normalize the data before feeding it to the transformer?
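For example, something like per-instance z-score normalization (a common choice in time-series models, in the spirit of RevIN; this is my own sketch, not code from the repo):

```python
import torch

def instance_normalize(x: torch.Tensor, eps: float = 1e-5):
    # Per-instance z-score normalization over the time axis.
    # x: (batch, length, dim) raw series; also returns the statistics
    # needed to de-normalize reconstructions later.
    mean = x.mean(dim=1, keepdim=True)
    std = x.std(dim=1, keepdim=True)
    return (x - mean) / (std + eps), mean, std

x = torch.randint(0, 1000, (8, 128, 4)).float()  # toy unnormalized integers
x_norm, mean, std = instance_normalize(x)
x_rec = x_norm * (std + 1e-5) + mean  # invert the transform on model outputs
```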
