
IncompatibleKeys error when loading a pre-trained model for fine-tuning #14

Open
websitefingerprinting opened this issue Apr 23, 2024 · 4 comments


@websitefingerprinting

Nice work! I have a question here:

I am trying to pre-train and finetune a model on my own datasets. However, some warnings were raised when loading the pre-trained model during finetuning:

```
loading pretrained model: checkpoints/ALL_task_UniTS_pretrain_x64_bs1024_UniTS_All_dm64_el3_Exp_0/pretrain_checkpoint.pth
_IncompatibleKeys(missing_keys=['category_tokens.CLS_dataset1', 'category_tokens.CLS_dataset2'], unexpected_keys=['pretrain_head.proj_in.weight', 'pretrain_head.proj_in.bias', 'pretrain_head.mlp.fc1.weight', 'pretrain_head.mlp.fc1.bias', 'pretrain_head.mlp.fc2.weight', 'pretrain_head.mlp.fc2.bias', 'pretrain_head.proj_out.weight', 'pretrain_head.proj_out.bias', 'pretrain_head.pos_proj.weights', 'pretrain_head.pos_proj.bias'])
```

Is everything correct here?

Thank you for your help!

@gasvn
Member

gasvn commented Apr 23, 2024

It's fine; there is an extra head used during pretraining, which is not needed for finetuning.
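To double-check on your side, here is a minimal toy sketch (illustrative module names mirroring your log, not the repo's actual code) of why `strict=False` loading reports exactly these two kinds of keys and nothing else:

```python
import torch
import torch.nn as nn

class Backbone(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(64, 64)  # shared weights, present in both models

class PretrainModel(Backbone):
    def __init__(self):
        super().__init__()
        self.pretrain_head = nn.Linear(64, 64)  # exists only during pretraining

class FinetuneModel(Backbone):
    def __init__(self):
        super().__init__()
        # new per-dataset CLS tokens, created fresh for finetuning
        self.category_tokens = nn.ParameterDict(
            {"CLS_dataset1": nn.Parameter(torch.zeros(1, 64))}
        )

ckpt = PretrainModel().state_dict()
result = FinetuneModel().load_state_dict(ckpt, strict=False)

# Only the pretraining head should be "unexpected" and only the new CLS
# tokens "missing"; any other key in either list would signal a real problem.
assert all(k.startswith("pretrain_head.") for k in result.unexpected_keys)
assert all(k.startswith("category_tokens.") for k in result.missing_keys)
print(result)
```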

@websitefingerprinting
Author

Thank you for your prompt response. One more question, in case you have any idea about it:

Few-shot finetuning didn't yield good results for my classification task. I pre-trained and finetuned using my own datasets. With only 5% of the data, accuracy was very low, but it improved with 100% of the data. However, pretraining didn't offer any advantage compared to fully supervised training.

I attach my pretraining loss curve below.

[figure: pretraining loss curve]

Does the pretrain loss look correct?

Many thanks! (It's fine if you're not sure; I may have made a mistake somewhere in this process, or my problem may simply not be a good fit.)

@gasvn
Member

gasvn commented Apr 23, 2024

I am not so sure. The loss looks pretty large. Have you tried only doing prompt tuning with the pretrained model? If that doesn't reach reasonable performance, it means the pretraining is not working well.
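A minimal sketch of what prompt tuning could look like (the substring filters are illustrative; check `model.named_parameters()` for the real parameter names in your model, and note the repo's own prompt-tuning mode may differ):

```python
import torch.nn as nn

def freeze_for_prompt_tuning(model: nn.Module) -> None:
    # Freeze the pretrained backbone; leave only prompt/CLS-token
    # parameters trainable. Adjust the filters to your model's names.
    tunable = ("prompt", "category_tokens", "cls")
    for name, param in model.named_parameters():
        param.requires_grad = any(key in name.lower() for key in tunable)

# Then build the optimizer over the trainable parameters only, e.g.:
# optimizer = torch.optim.Adam(
#     (p for p in model.parameters() if p.requires_grad), lr=1e-3)
```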

@websitefingerprinting
Author

Thank you again for your suggestion.

My data is very different from the datasets in your paper. My input is a sequence of unnormalized integers (i.e., $x \in \mathbb{N}^{\text{Length} \times \text{Dim}}$). I guess that may be why the reconstruction loss is large. So, should I normalize the data before feeding it to the transformer?
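For example, something like per-instance z-score normalization (a common choice in time-series models, in the spirit of RevIN; this is my own sketch, not code from the repo):

```python
import torch

def instance_normalize(x: torch.Tensor, eps: float = 1e-5):
    # Per-instance z-score normalization over the time axis.
    # x: (batch, length, dim) raw series; also returns the statistics
    # needed to de-normalize reconstructions later.
    mean = x.mean(dim=1, keepdim=True)
    std = x.std(dim=1, keepdim=True)
    return (x - mean) / (std + eps), mean, std

x = torch.randint(0, 1000, (8, 128, 4)).float()  # toy unnormalized integers
x_norm, mean, std = instance_normalize(x)
x_rec = x_norm * (std + 1e-5) + mean  # invert the transform on model outputs
```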
