Skip to content

models microsoft deberta large mnli

github-actions[bot] edited this page Jan 12, 2024 · 22 revisions

microsoft-deberta-large-mnli

Overview

DeBERTa (Decoding-enhanced BERT with Disentangled Attention) improves the BERT and RoBERTa models using disentangled attention and enhanced mask decoder. It outperforms BERT and RoBERTa on majority of NLU tasks with 80GB training data.

Please check the official repository for more details and updates.

This is the DeBERTa large model fine-tuned with MNLI task.

Evaluation Results

We present the dev results on SQuAD 1.1/2.0 and several GLUE benchmark tasks.

Model SQuAD 1.1 SQuAD 2.0 MNLI-m/mm SST-2 QNLI CoLA RTE MRPC QQP STS-B
F1/EM F1/EM Acc Acc Acc MCC Acc Acc/F1 Acc/F1 P/S
BERT-Large 90.9/84.1 81.8/79.0 86.6/- 93.2 92.3 60.6 70.4 88.0/- 91.3/- 90.0/-
RoBERTa-Large 94.6/88.9 89.4/86.5 90.2/- 96.4 93.9 68.0 86.6 90.9/- 92.2/- 92.4/-
XLNet-Large 95.1/89.7 90.6/87.9 90.8/- 97.0 94.9 69.0 85.9 90.8/- 92.3/- 92.5/-
DeBERTa-Large1 95.5/90.1 90.7/88.0 91.3/91.1 96.5 95.3 69.5 91.0 92.6/94.6 92.3/- 92.8/92.5
DeBERTa-XLarge1 -/- -/- 91.5/91.2 97.0 - - 93.1 92.1/94.3 - 92.9/92.7
DeBERTa-V2-XLarge1 95.8/90.8 91.4/88.9 91.7/91.6 97.5 95.8 71.1 93.9 92.0/94.2 92.3/89.8 92.9/92.9
DeBERTa-V2-XXLarge1,2 96.1/91.4 92.2/89.7 91.7/91.9 97.2 96.0 72.0 93.5 93.1/94.9 92.7/90.3 93.2/93.1

Inference samples

Inference type Python sample (Notebook)
Real time sdk-example.ipynb
Real time text-classification-online-endpoint.ipynb

Sample inputs and outputs

Sample input

{
    "input_data": [
        "Today was an amazing day!",
        "It was an unfortunate series of events."
    ]
}

Sample output

[
  {
    "label": "NEUTRAL",
    "score": 0.9605958461761475
  },
  {
    "label": "NEUTRAL",
    "score": 0.98270583152771
  }
]

Version: 13

Tags

license : mit SharedComputeCapacityEnabled task : text-classification huggingface_model_id : microsoft/deberta-large-mnli inference_compute_allow_list : ['Standard_DS3_v2', 'Standard_D4a_v4', 'Standard_D4as_v4', 'Standard_DS4_v2', 'Standard_D8a_v4', 'Standard_D8as_v4', 'Standard_DS5_v2', 'Standard_D16a_v4', 'Standard_D16as_v4', 'Standard_D32a_v4', 'Standard_D32as_v4', 'Standard_D48a_v4', 'Standard_D48as_v4', 'Standard_D64a_v4', 'Standard_D64as_v4', 'Standard_D96a_v4', 'Standard_D96as_v4', 'Standard_F4s_v2', 'Standard_FX4mds', 'Standard_F8s_v2', 'Standard_FX12mds', 'Standard_F16s_v2', 'Standard_F32s_v2', 'Standard_F48s_v2', 'Standard_F64s_v2', 'Standard_F72s_v2', 'Standard_FX24mds', 'Standard_FX36mds', 'Standard_FX48mds', 'Standard_E2s_v3', 'Standard_E4s_v3', 'Standard_E8s_v3', 'Standard_E16s_v3', 'Standard_E32s_v3', 'Standard_E48s_v3', 'Standard_E64s_v3', 'Standard_NC4as_T4_v3', 'Standard_NC6s_v3', 'Standard_NC8as_T4_v3', 'Standard_NC12s_v3', 'Standard_NC16as_T4_v3', 'Standard_NC24s_v3', 'Standard_NC64as_T4_v3', 'Standard_ND40rs_v2', 'Standard_NC24ads_A100_v4', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4'] evaluation_compute_allow_list : ['Standard_DS4_v2', 'Standard_D8a_v4', 'Standard_D8as_v4', 'Standard_DS5_v2', 'Standard_DS12_v2', 'Standard_D16a_v4', 'Standard_D16as_v4', 'Standard_D32a_v4', 'Standard_D32as_v4', 'Standard_D48a_v4', 'Standard_D48as_v4', 'Standard_D64a_v4', 'Standard_D64as_v4', 'Standard_D96a_v4', 'Standard_D96as_v4', 'Standard_FX4mds', 'Standard_FX12mds', 'Standard_F16s_v2', 'Standard_F32s_v2', 'Standard_F48s_v2', 'Standard_F64s_v2', 'Standard_F72s_v2', 'Standard_FX24mds', 'Standard_FX36mds', 'Standard_FX48mds', 'Standard_E4s_v3', 'Standard_E8s_v3', 'Standard_E16s_v3', 'Standard_E32s_v3', 'Standard_E48s_v3', 'Standard_E64s_v3', 'Standard_NC4as_T4_v3', 'Standard_NC6s_v3', 'Standard_NC8as_T4_v3', 'Standard_NC12s_v3', 'Standard_NC16as_T4_v3', 'Standard_NC24s_v3', 'Standard_NC64as_T4_v3', 'Standard_NC24ads_A100_v4', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4', 'Standard_ND40rs_v2']

View in Studio: https://ml.azure.com/registries/azureml/models/microsoft-deberta-large-mnli/version/13

License: mit

Properties

SharedComputeCapacityEnabled: True

SHA: 7296194b9009373def4f7c5dad292651e4b5cf4e

evaluation-min-sku-spec: 4|0|28|56

evaluation-recommended-sku: Standard_DS4_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_DS5_v2, Standard_DS12_v2, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_FX4mds, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_ND40rs_v2, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4

inference-min-sku-spec: 2|0|8|28

inference-recommended-sku: Standard_DS3_v2, Standard_D4a_v4, Standard_D4as_v4, Standard_DS4_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_DS5_v2, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_F4s_v2, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E2s_v3, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_ND40rs_v2, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4

languages: en

Clone this wiki locally