Emu: Generative Multimodal Models from BAAI

Emu1 (ICLR 2024, 2023/07) - Generative Pretraining in Multimodality
Emu2 (CVPR 2024, 2023/12) - Generative Multimodal Models are In-Context Learners
Emu3 (arXiv 2024, 2024/09) - Next-Token Prediction is All You Need 🔥🔥🔥

News

2024.9 We introduce Emu3, a new suite of state-of-the-art multimodal models trained solely with next-token prediction. 🔥🔥🔥
2024.2 Emu1 and Emu2 are accepted by ICLR 2024 and CVPR 2024 respectively! 🎉
2023.12 Inference code, model and demo of Emu2 are available. Enjoy the demo.
2023.12 We have released Emu2, the largest open generative multimodal models, which achieve a new state of the art on multimodal understanding and generation tasks.
2023.7 Inference code and model of Emu are available.
2023.7 We have released Emu, a multimodal generalist that can seamlessly generate images and text in a multimodal context.

Highlights

State-of-the-art performance
Next-generation capabilities
A base model for diverse tasks

We hope to foster the growth of our community through open-sourcing and promoting collaboration👬. Let's step towards multimodal intelligence together🍻.

Contact

We are hiring at all levels on the BAAI Vision Team, including full-time researchers, engineers, and interns. If you are interested in working with us on foundation models, visual perception, and multimodal learning, please contact Xinlong Wang (wangxinlong@baai.ac.cn).
