Awesome Segment Anything

Segment Anything has led to a new breakthrough in the field of Computer Vision (CV). This repository tracks and summarizes research progress on Segment Anything across various fields, including papers and projects.

If you find this repository helpful, please consider starring ⭐ or sharing ⬆️ it. Thanks.

News

- 2024.8.16: Add Segment Anything 2 and SaLIP.
- 2023.8.29: Update some recent works.
- 2023.5.20: Update document structure and add a robotic-related article. Happy 520 Day!
- 2023.5.4: Add SEEM.
- 2023.4.18: Add Inpaint Anything and SAM-Track.
- 2023.4.12: An initial version of recent papers or projects.

Contents

Papers/Projects

Basemodel Papers

Title Presentation Paper page Project page Code base Affiliation Description
CLIP img arXiv Colab Code OpenAI Contrastive Language-Image Pre-Training.
OWL-ViT img ECCV2022 - Code Google An open-vocabulary object detector.
OvSeg img CVPR2023 Project Code Meta Segment an image into semantic regions according to text descriptions.
Painter img CVPR2023 - Code BAAI A Generalist Painter for In-Context Visual Learning.
Grounding DINO img arXiv Colab & Huggingface Code IDEA A stronger open-set object detector.
Segment Anything imgimg arXiv Project page Code Meta A strong large model that can generate masks for all objects in an image.
SegGPT img arXiv Project page Code BAAI Segmenting Everything In Context based on Painter.
Segment Everything Everywhere All at Once (SEEM) img arXiv Project Page Code Microsoft Semantic Segmentation with various prompt types.
Segment Anything 2 img Paper Project Page Code Meta A foundation model towards solving promptable visual segmentation in images and videos.

Derivative Papers

Analysis and Expansion of SAM

Title Presentation Paper page Project page Code base Affiliation Description
CLIP_Surgery img arXiv Demo Code HKUST Uses CLIP's explainability to achieve text-to-mask with SAM without manual points.
GenSAM img arXiv Project Page Code QMUL This work relaxes the requirement for instance-specific prompts in SAM.
Segment Anything Is Not Always Perfect img arXiv - - Samsung This paper analyzes and discusses the benefits and limitations of SAM.
PerSAM img arXiv Project Page Code - Segment Anything with specific concepts.
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching img1 arXiv - Code - One shot semantic segmentation by integrating an all-purpose feature extraction model and a class-agnostic segmentation model.
Segment Anything in High Quality img arXiv Project Page - ETH Zürich & HKUST HQ-SAM: improve segmentation quality of SAM using learnable High-Quality Output Token.
Detect Any Shadow: Segment Anything for Video Shadow Detection img arXiv - Code University of Science and Technology of China Use SAM to detect initial frames then use an LSTM network for subsequent frames.
Fast Segment Anything img arXiv Project Page Code - Reformulate the architecture and improve the speed of SAM.
MobileSAM (Faster Segment Anything) img arXiv Project Page Code Kyung Hee University Makes SAM mobile-friendly by replacing the heavyweight image encoder with a lightweight one.
FoodSAM (Any Food Segmentation) img arXiv Project Page Code UCAS Semantic, instance, panoptic, and interactive segmentation on food images.
DefectSAM img arXiv - Code ZJU, Westlake, UESTC, etc. Defect detection in infrared thermal images.
SlimSAM img arXiv - Code NUS 0.1% Data Makes Segment Anything Slim.

Medical Image Segmentation

Title Presentation Paper page Project page Code base Affiliation Description
Segment Anything Model (SAM) for Digital Pathology img arXiv - - - SAM + Tumor segmentation/Tissue segmentation/Cell nuclei segmentation.
Segment Anything in Medical Images img1 arXiv - Code - A step-by-step tutorial with a small dataset to help you quickly utilize SAM.
SAM Fails to Segment Anything? img1 arXiv - Code - SAM-adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More.
Segment Anything Model for Medical Image Analysis: an Experimental Study img1 arXiv - - - Thorough experiments evaluating how SAM performs on 19 medical image datasets.
Medical-SAM-Adapter img1 arXiv - Code - A project to fine-tune SAM using adaptation for medical imaging.
SAM-Med2d img1 arXiv - Code Sichuan University & Shanghai AI Laboratory The most comprehensive studies on applying SAM to medical 2D images.
ScribblePrompt-SAM img1 arXiv Project Page Code MIT & MGH Fine-tuned SAM on 65 biomedical imaging datasets with scribble, click, and bounding box inputs.
SaLIP - arXiv Project Page Code - Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero-shot Medical Image Segmentation.

Bioimage Analysis

Title Presentation Paper page Project page Code base Affiliation Description
Segment Anything for Microscopy img bioRxiv Demo Code University of Göttingen, Germany Segment Anything for Microscopy implements automatic and interactive annotation for microscopy data. It is built on top of Segment Anything and specializes it for microscopy and other bio-imaging data. Its core components are:
  • The micro_sam tools for interactive data annotation with napari.
  • The micro_sam library to apply Segment Anything to 2d and 3d data or fine-tune it on your data.
  • The micro_sam models that are fine-tuned on publicly available microscopy data.
  • Our goal is to build fast and interactive annotation tools for microscopy data

Inpainting

Title Presentation Paper page Project page Code base Affiliation Description
Inpaint Anything img1 arXiv - Code USTC & EIT SAM + Inpainting, which is able to remove the object smoothly.
SAM + Stable Diffusion for Text-to-Image Inpainting img1 - Project Code comet Grounding DINO + SAM + Stable Diffusion

Camouflaged Object Detection

Title Presentation Paper page Project page Code base Affiliation Description
SAMCOD - arXiv - Code - SAM + Camouflaged object detection (COD) task.

Video Frame Interpolation

Title Presentation Paper page Project page Code base Affiliation Description
Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation img arXiv Project Page & Interactive Demo Code Shanghai AI Laboratory & Snap Inc. Editable video frame interpolation with SAM.

Low Level Vision

Title Presentation Paper page Project page Code base Affiliation Description
Segment Anything in Video Super-resolution img1 arXiv - - - A first step toward using SAM for low-level vision.
SAM-IQA img1 arXiv - Code Megvii The first work to introduce SAM into IQA and demonstrate its strong generalization ability in this domain.

Image Matting

Title Presentation Paper page Project page Code base Affiliation Description
Matte Anything imgimg arXiv - Code HUST Vision Lab An interactive natural image matting system with excellent performance for both opaque and transparent objects.
Matting Anything img1 arXiv Project page Code SHI Labs Leverages feature maps from SAM and adopts a Mask-to-Matte module to predict the alpha matte.

Robotic

Title Presentation Paper page Project page Code base Affiliation Description
Instruct2Act img1 arXiv - Code OpenGVLab A SAM application in the robotics field.

Bioinformatics

Title Presentation Paper page Project page Code base Affiliation Description
IAMSAM img1 bioRxiv - Code Portrai Inc. A SAM application for the analysis of Spatial Transcriptomics.

3D

Title Presentation Paper page Project page Code base Affiliation Description
Point-SAM img arXiv Page Code UCSD An open-world 3D native promptable point-cloud segmentation method.
SAMPro3D img2 arXiv Page Code CUHKSZ, MSRA A novel method to segment any 3D indoor scene by applying SAM to 2D frames, without needing any training, tuning, distillation, or 3D pretrained networks.
Seal img1 arXiv Page Code - A framework capable of leveraging 2D vision foundation models for self-supervised learning on large-scale 3D point clouds.
TomoSAM img arXiv Video Tutorial Code - An extension of 3D Slicer using the SAM to aid the segmentation of 3D data from tomography or other imaging techniques.
SegmentAnythingin3D img arXiv Project Code - A novel framework to Segment Anything in 3D, named SA3D.

Remote Sensing

Title Presentation Paper page Project page Code base Affiliation Description
RSPrompter img arXiv Project Page Code Beihang University An automated instance segmentation approach for remote sensing images based on the SAM.
SAM-CD img arXiv - Code PLA Information Engineering University A sample-efficient change detection framework that employs SAM as the visual encoder.
SAM-Road: Segment Anything Model for Road Network Graph Extraction img arXiv - Code Carnegie Mellon University A simple and fast method applying SAM for vectorized large-scale road network graph extraction. It reaches state-of-the-art accuracy while being 40 times faster.

Tracking

Title Presentation Paper page Project page Code base Affiliation Description
Follow Anything img1 arXiv Page Code MIT, Harvard University An open-vocabulary, multimodal model that detects, tracks, and follows any object in real time.
Track-Anything Video arXiv - Code MIT, Harvard University An open-vocabulary, multimodal model that detects, tracks, and follows any object in real time.
SAM-Track Video arXiv - Code Zhejiang University A framework called Segment And Track Anything (SAM-Track) that allows users to precisely and effectively segment and track any object in a video.

Audio-visual Localization and Segmentation

Title Presentation Paper page Project page Code base Affiliation Description
AV-SAM img1 arXiv - Code CMU A simple yet effective audio-visual localization and segmentation framework based on the SAM.

Adversarial Attacks

Title Presentation Paper page Project page Code base Affiliation Description
Attack-SAM - arXiv - - KAIST The first work to conduct a comprehensive investigation of how to attack SAM with adversarial examples.

Derivative Projects

Image Segmentation task

Title Presentation Project page Code base Affiliation Description
Grounded Segment Anything img Colab & Huggingface Code - Combining Grounding DINO and Segment Anything.
GroundedSAM Anomaly Detection img - Code - Grounding DINO + SAM to segment any anomaly.
Semantic Segment Anything img - Code Fudan A dense category annotation engine.
Magic Copy img - Code - Magic Copy is a Chrome extension that uses SAM.
YOLO-World + EfficientViT SAM img 🤗 HuggingFace Space Code - Efficient open-vocabulary object detection and segmentation with YOLO-World + EfficientViT SAM.
Segment Anything with Clip img 🤗 HuggingFace Space Code - SAM + CLIP.
SAM-Clip img - Code - SAM + CLIP.
Prompt Segment Anything img - Code - SAM + Zero-shot Instance Segmentation.
RefSAM - - Code - Evaluating the basic performance of SAM on the Referring Image segmentation task.
SAM-RBox img - Code - An implementation of SAM for generating rotated bounding boxes with MMRotate.
Open Vocabulary Segment Anything img1 - Code - An interesting demo by combining OWL-ViT of Google and SAM.
SegDrawer img1img - Code - Simple static web-based mask drawer, supporting semantic drawing with SAM.
AnyLabeling YoutubeDemo Code - SAM + Labelme + LabelImg + Auto-labeling.
ISAT with segment anything YoutubeDemo BiliBili Demo Code - A labeling tool built on SAM (Segment Anything Model); supports SAM, sam-hq, MobileSAM, EdgeSAM, etc.
Annotation Anything Pipeline img - Code - GPT + SAM.
Roboflow Annotate roboflow-sam-optimized-faster App Blog Roboflow SAM-assisted labeling for training computer vision models.
SALT img - Code - A tool that adds a basic interface for image labeling and saves the generated masks in COCO format.
SAM U Specify img - Code - Uses SAM and CLIP to segment the specific instances you want.
SAM web UI img App Code - A new web interface for SAM.
Finetune Anything img - Code - A class-aware one-stage tool for training fine-tuning models based on SAM.
NanoSAM img - Code NVIDIA A distilled Segment Anything (SAM) model capable of running real-time with NVIDIA TensorRT.

Video Segmentation task

Title Presentation Project page Code base Affiliation Description
MetaSeg img HuggingFace Code - SAM + Video.
SAM-Track Video YoutubeDemo Code Zhejiang University This project, which is based on SAM and DeAOT, focuses on segmenting and tracking objects in videos.

Medical image Segmentation task

Title Presentation Project page Code base Affiliation Description
SAM in Napari Video - Code - Segment anything with Napari integration of SAM.
SAM Medical Imaging img - Code - SAM for Medical Imaging.

Inpainting task

Title Presentation Project page Code base Affiliation Description
SegAnythingPro img - Code - SAM + Inpainting/Replacing.

3D task

Title Presentation Project page Code base Affiliation Description
3D-Box img - Code - SAM is extended to 3D perception by combining it with VoxelNeXt.
Anything 3D Novel View img - Code - SAM + Zero 1-to-3.
Any 3D Face imgimg - Code - SAM + HRN.
Segment Anything 3D img - Code Pointcept Extending Segment Anything to 3D perception by transferring the segmentation information of 2D images to 3D space.

Image Generation task

Title Presentation Project page Code base Affiliation Description
Edit Anything img - Code - Edit and Generate Anything in an image.
Image Edit Anything img - Code - Stable Diffusion + SAM.
SAM for Stable Diffusion Webui img - Code - Stable Diffusion + SAM.

Remote Sensing task

Title Presentation Project page Code base Affiliation Description
Earth Observation Tools img Colab Code - SAM + Remote Sensing.

Moving Object Detection task

Title Presentation Project page Code base Affiliation Description
Moving Object Detection img - Code - SAM + Moving Object Detection.

OCR task

Title Presentation Project page Code base Affiliation Description
OCR-SAM img Blog Code - Optical Character Recognition with SAM.

Front-end framework

SAMJS

Title Presentation Project page Code base Affiliation Description
SAMJS samjs demo Code - JS SDK for SAM; supports remote sensing data segmentation and vectorization.

Acknowledgement

Some of the presentations in this repository are borrowed from the original authors, and we are very thankful for their contributions.

License

This project is released under the MIT license. Please see the LICENSE file for more information.