https://arxiv.org/abs/2208.05688

Semi-supervised Vision Transformers at Scale (Zhaowei Cai, Avinash Ravichandran, Paolo Favaro, Manchen Wang, Davide Modolo, Rahul Bhotika, Zhuowen Tu, Stefano Soatto)

Reaches 80% top-1 accuracy on ImageNet using only 1% of the labels. The recipe combines MAE pretraining + FixMatch + an EMA teacher + mixup with pseudo labels + ViT-Huge.
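To make the three semi-supervised ingredients concrete, here is a minimal pure-Python sketch of an EMA teacher update, FixMatch-style confidence-thresholded pseudo-labeling, and mixup applied to pseudo-labeled samples. It is not the paper's implementation: the function names, the `momentum=0.999` and `threshold=0.7` defaults, and the use of plain lists in place of tensors are all assumptions made for illustration.

```python
def ema_update(teacher, student, momentum=0.999):
    """Move each teacher weight a small step toward the student weight.

    momentum is a hypothetical default, not a value taken from the paper.
    """
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def pseudo_label(teacher_probs, threshold=0.7):
    """Return the teacher's argmax class, or None if it is not confident enough.

    threshold is a hypothetical default, not a value taken from the paper.
    """
    confidence = max(teacher_probs)
    if confidence < threshold:
        return None  # sample is dropped from the unlabeled loss
    return teacher_probs.index(confidence)


def one_hot(label, num_classes):
    """Turn a class index into a one-hot label vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def mixup(x_a, y_a, x_b, y_b, lam):
    """Convexly combine two inputs and their (pseudo-)label vectors."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


# Toy usage: two unlabeled samples, three classes.
teacher = ema_update([1.0, 0.0], [0.0, 1.0], momentum=0.9)
label_a = pseudo_label([0.1, 0.8, 0.1])   # confident, class 1
label_b = pseudo_label([0.4, 0.3, 0.3])   # not confident, dropped
mixed_x, mixed_y = mixup([1.0, 0.0], one_hot(0, 3), [0.0, 1.0], one_hot(1, 3), lam=0.5)
```

In a real training loop the teacher would produce the pseudo labels on weakly augmented images, and the student would be trained on the mixed, strongly augmented ones; the sketch only shows the arithmetic of each step.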

I wanted to see how performance changes as each component is added, so I collected the numbers. Roughly:

ViT-Base, ImageNet 1%:

- MAE, finetune: 57.4
- MAE, EMA teacher: 65.3
- MAE, EMA teacher, mixup: 71.0

ViT-Base, ImageNet 10%:

- Scratch, EMA teacher: 68.9
- MAE, finetune: 73.7
- MAE, EMA teacher: 78.1
- MAE, EMA teacher, mixup: 79.7
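The gain from each step can be read off with a few subtractions. The snippet below only redoes that arithmetic on the ViT-Base numbers listed above. The dictionary keys are hypothetical names chosen here, and treating the rows as a cumulative ablation is my reading of the numbers, not a claim from the paper.

```python
# Top-1 accuracies (%) copied from the ViT-Base numbers in this note.
results = {
    "1%": {"mae_finetune": 57.4, "mae_ema": 65.3, "mae_ema_mixup": 71.0},
    "10%": {
        "scratch_ema": 68.9,
        "mae_finetune": 73.7,
        "mae_ema": 78.1,
        "mae_ema_mixup": 79.7,
    },
}


def gain(split, before, after):
    """Accuracy difference in points between two settings of one label split."""
    return round(results[split][after] - results[split][before], 1)


gains = {
    "1%: + EMA teacher": gain("1%", "mae_finetune", "mae_ema"),
    "1%: + mixup": gain("1%", "mae_ema", "mae_ema_mixup"),
    "10%: + MAE pretraining": gain("10%", "scratch_ema", "mae_ema"),
    "10%: + EMA teacher": gain("10%", "mae_finetune", "mae_ema"),
    "10%: + mixup": gain("10%", "mae_ema", "mae_ema_mixup"),
}

for name, points in gains.items():
    print(f"{name}: +{points} points")
```

On these numbers the EMA teacher and mixup both help more in the 1% setting (+7.9 and +5.7 points) than in the 10% setting (+4.4 and +1.6 points).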

#semi_supervised_learning