Does it support long-text recognition and space recognition? #25

Open

wzg722 opened this issue Sep 12, 2024 · 1 comment

wzg722 commented Sep 12, 2024

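Does it support recognizing long text and spaces? I used the large model, and images get resized to 224*224. It performs very poorly on handwriting, and it cannot recognize long text either.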

Contributor

mzhaoshuai commented Sep 12, 2024

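Spaces are probably not supported.
Long text probably won't work well either, since the texts in the training set are mostly short. We use 224*224 mainly to keep the CLIP pretrained model intact. When we experimented with other sizes, performance on the datasets dropped somewhat, so we mostly kept CLIP's original patch division and input image size. You could try changing these and retraining; the performance should also be decent.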
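
As a rough illustration of why long text is hard at a fixed 224*224 input, here is a minimal sketch assuming a ViT-style encoder with square patches. The helper names `patch_grid` and `pixels_per_char` and the patch sizes 14 and 16 are assumptions for illustration, not values taken from this repository.

```python
def patch_grid(image_size: int, patch_size: int) -> tuple[int, int]:
    """Return (patches per side, total patches) for a square input image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


def pixels_per_char(num_chars: int, image_size: int = 224) -> float:
    """Average horizontal pixels per character once a single text line
    has been resized to image_size pixels wide."""
    if num_chars <= 0:
        raise ValueError("num_chars must be positive")
    return image_size / num_chars


if __name__ == "__main__":
    # Patch sizes here are assumed examples, not read from the model config.
    for patch_size in (14, 16):
        print(patch_size, patch_grid(224, patch_size))
    # A short word keeps many pixels per character; a long line keeps few.
    for num_chars in (8, 56):
        print(num_chars, pixels_per_char(num_chars))
```

The token grid is fixed by the input size and patch size, so a longer line only squeezes each character into fewer pixels. Changing the input size changes the patch grid too, which is why it goes together with retraining.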
