
models microsoft swinv2 base patch4 window12 192 22k

github-actions[bot] edited this page Oct 21, 2023 · 28 revisions

microsoft-swinv2-base-patch4-window12-192-22k

Overview

The Swin Transformer is a type of Vision Transformer used for both image classification and dense recognition tasks. It builds hierarchical feature maps by merging image patches in deeper layers, and its computational complexity is linear in the input image size because self-attention is computed only within each local window. In contrast, previous Vision Transformers produce feature maps of a single low resolution and have computational complexity quadratic in the input image size because self-attention is computed globally. Swin Transformer v2 adds three main improvements: a residual-post-norm method combined with cosine attention, which improves training stability; a log-spaced continuous position bias method; and a self-supervised pre-training method, SimMIM, which reduces the need for large amounts of labeled images.
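The linear versus quadratic scaling described above can be illustrated with the per-layer attention cost estimates given in the Swin Transformer paper. The sketch below is only illustrative: the 48 x 48 token grid follows from the 192-pixel input and patch size 4 in the model name, the window size 12 also comes from the model name, and the channel width of 128 is an assumption for the base variant. The function names are hypothetical.

```python
def global_attention_cost(h: int, w: int, c: int) -> int:
    """Approximate multiply-adds for global multi-head self-attention
    on an h x w token grid with c channels: 4hwC^2 + 2(hw)^2 C."""
    return 4 * h * w * c**2 + 2 * (h * w) ** 2 * c


def window_attention_cost(h: int, w: int, c: int, m: int) -> int:
    """Approximate multiply-adds for window attention with m x m windows:
    4hwC^2 + 2 M^2 hw C. Both terms grow linearly with the token count hw."""
    return 4 * h * w * c**2 + 2 * m**2 * h * w * c


# Assumed first-stage shape: 192 / 4 = 48 tokens per side, 128 channels, window 12.
h = w = 48
c = 128  # assumption: embedding width of the base variant
m = 12

global_cost = global_attention_cost(h, w, c)
window_cost = window_attention_cost(h, w, c, m)
print(f"global: {global_cost:,}  windowed: {window_cost:,}")

# Doubling the image side quadruples the token count.
print(window_attention_cost(2 * h, 2 * w, c, m) / window_cost)  # grows with hw
print(global_attention_cost(2 * h, 2 * w, c) / global_cost)     # grows faster than hw
```

These estimates ignore softmax, normalization, and the MLP blocks, so they indicate the scaling trend rather than actual runtime.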

The above summary was generated using ChatGPT. Review the original-model-card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.

Inference samples

| Inference type | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- |
| Real time | image-classification-online-endpoint.ipynb | image-classification-online-endpoint.sh |
| Batch | image-classification-batch-endpoint.ipynb | image-classification-batch-endpoint.sh |

Finetuning samples

| Task | Use case | Dataset | Python sample (Notebook) | CLI with YAML |
| --- | --- | --- | --- | --- |
| Image Multi-class classification | Image Multi-class classification | fridgeObjects | fridgeobjects-multiclass-classification.ipynb | fridgeobjects-multiclass-classification.sh |
| Image Multi-label classification | Image Multi-label classification | multilabel fridgeObjects | fridgeobjects-multilabel-classification.ipynb | fridgeobjects-multilabel-classification.sh |

Model Evaluation

| Task | Use case | Dataset | Python sample (Notebook) |
| --- | --- | --- | --- |
| Image Multi-class classification | Image Multi-class classification | fridgeObjects | image-multiclass-classification.ipynb |
| Image Multi-label classification | Image Multi-label classification | multilabel fridgeObjects | image-multilabel-classification.ipynb |

Sample inputs and outputs (for real-time inference)

Sample input

{
  "input_data": {
    "columns": [
      "image"
    ],
    "index": [0, 1],
    "data": ["image1", "image2"]
  }
}

Note: the "image1" and "image2" strings should be base64-encoded images or publicly accessible URLs.
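A request body in the shape of the sample input can be assembled from raw image bytes as in the minimal sketch below. The helper name `build_request` is hypothetical, and the byte strings in the usage lines are placeholders standing in for real image file contents.

```python
import base64
import json


def build_request(images: list[bytes]) -> dict:
    """Build a payload shaped like the sample input from raw image bytes.

    Each image is base64-encoded and stored as a UTF-8 string. The index
    simply numbers the rows starting from 0, as in the sample.
    """
    encoded = [base64.b64encode(raw).decode("utf-8") for raw in images]
    return {
        "input_data": {
            "columns": ["image"],
            "index": list(range(len(encoded))),
            "data": encoded,
        }
    }


# Placeholder bytes; in practice these would come from open(path, "rb").read().
payload = build_request([b"abc", b"de"])
print(json.dumps(payload, indent=2))
```

Publicly accessible URLs can be placed in the "data" list directly as strings, without encoding.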

Sample output

[
    {
        "probs": [0.91, 0.09],
        "labels": ["can", "carton"]
    },
    {
        "probs": [0.1, 0.9],
        "labels": ["can", "carton"]
    }
]
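To turn a response like the sample output into one predicted label per image, the highest-probability entry can be selected from each item. This is a minimal sketch that assumes the response has already been parsed from JSON into Python lists and dictionaries. The helper name `top_labels` is hypothetical.

```python
def top_labels(response: list[dict]) -> list[tuple[str, float]]:
    """Return (label, probability) for the most likely class of each image.

    Assumes each item has parallel "probs" and "labels" lists, as in the
    sample output.
    """
    results = []
    for item in response:
        probs = item["probs"]
        # Position of the largest probability in this item's list.
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((item["labels"][best], probs[best]))
    return results


sample_response = [
    {"probs": [0.91, 0.09], "labels": ["can", "carton"]},
    {"probs": [0.1, 0.9], "labels": ["can", "carton"]},
]
print(top_labels(sample_response))  # [('can', 0.91), ('carton', 0.9)]
```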

Model inference - visualization for a sample image

[Image: mc visualization]

Version: 10

Tags

Preview

license: apache-2.0

model_specific_defaults: ordereddict([('apply_deepspeed', 'true'), ('apply_ort', 'true')])

task: image-classification

View in Studio: https://ml.azure.com/registries/azureml/models/microsoft-swinv2-base-patch4-window12-192-22k/version/10

License: apache-2.0

Properties

SHA: 787136395d17f54db4265d71143193d68107bf49

datasets: imagenet-1k

evaluation-min-sku-spec: 4|1|28|176

evaluation-recommended-sku: Standard_NC6s_v3

finetune-min-sku-spec: 4|1|28|176

finetune-recommended-sku: Standard_NC6s_v3

finetuning-tasks: image-classification

inference-min-sku-spec: 4|0|14|28

inference-recommended-sku: Standard_DS3_v2, Standard_D4a_v4, Standard_D4as_v4, Standard_DS4_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_DS5_v2, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4, Standard_ND40rs_v2

model_id: microsoft/swinv2-base-patch4-window12-192-22k
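The `*-min-sku-spec` properties above are pipe-separated integers. This page does not define the fields, so the sketch below assumes the order is CPU cores, GPUs, memory in GB, and storage in GB. The `SkuSpec` type and `parse_sku_spec` helper are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class SkuSpec(NamedTuple):
    # Field order is an assumption; the page does not document it.
    cpu_cores: int
    gpus: int
    memory_gb: int
    storage_gb: int


def parse_sku_spec(spec: str) -> SkuSpec:
    """Parse a value such as "4|1|28|176" into named integer fields."""
    parts = spec.split("|")
    if len(parts) != 4:
        raise ValueError(f"expected 4 pipe-separated fields, got {len(parts)}")
    return SkuSpec(*(int(p) for p in parts))


inference = parse_sku_spec("4|0|14|28")
finetune = parse_sku_spec("4|1|28|176")
print(inference)
print(finetune)
```

Under this reading, the inference minimum needs no GPU, while the fine-tuning and evaluation minimums each require one.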
