Repositories list
68 repositories
InternVL
Public
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
GUI-Odyssey
Public
Vision-RWKV
Public
PIIP
Public
[NeurIPS 2024 Spotlight ⭐️] Parameter-Inverted Image Pyramid Networks (PIIP)
OV-OAD
Public
- Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed
PhyGenBench
Public
MM-NIAH
Public
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.
VisionLLM
Public
VisionLLM Series
VideoMAEv2
Public
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
EfficientQAT
Public
OmniQuant
Public
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
STM-Evaluation
Public
InternVideo
Public
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
MMIU
Public
ChartAst
Public
Ask-Anything
Public
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! And many more supported LMs such as miniGPT4, StableLM, and MOSS.
EgoExoLearn
Public
InternGPT
Public
InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
all-seeing
Public
[ICLR 2024 & ECCV 2024] The All-Seeing Projects: Towards Panoptic Visual Recognition & Understanding and General Relation Comprehension of the Open World
Diffree
Public
InternImage
Public
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
MMT-Bench
Public
ControlLLM
Public
HumanBench
Public
VideoMamba
Public
LORIS
Public
Long-Term Rhythmic Video Soundtracker, ICML2023
EgoVideo
Public