models microsoft deberta base

microsoft-deberta-base

Overview

Description: DeBERTa is a version of the BERT model that has been improved through the use of disentangled attention and enhanced mask decoders. It outperforms BERT and RoBERTa on a majority of NLU tasks using 80GB of training data. It has been fine-tuned on NLU tasks and has achieved dev results on SQuAD 1.1/2.0 and MNLI tasks.If you find the model useful please cite the paper. Please check the official repository for more detailed updates.
Please Note: This model accepts masks in [mask] format. See Sample input for reference. > The above summary was generated using ChatGPT. Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model. ### Inference samples Inference type|Python sample (Notebook)|CLI with YAML |--|--|--| Real time|fill-mask-online-endpoint.ipynb|fill-mask-online-endpoint.sh Batch |fill-mask-batch-endpoint.ipynb| coming soon ### Finetuning samples Task|Use case|Dataset|Python sample (Notebook)|CLI with YAML |--|--|--|--|--| Text Classification|Emotion Detection|Emotion|emotion-detection.ipynb|emotion-detection.sh Token Classification|Named Entity Recognition|Conll2003|named-entity-recognition.ipynb|named-entity-recognition.sh Question Answering|Extractive Q&A|SQUAD (Wikipedia)|extractive-qa.ipynb|extractive-qa.sh ### Model Evaluation Task| Use case| Python sample (Notebook)| CLI with YAML |--|--|--|--| Fill Mask | Fill Mask | rcds/wikipedia-for-mask-filling | evaluate-model-fill-mask.ipynb | evaluate-model-fill-mask.yml ### Sample inputs and outputs (for real-time inference) #### Sample input json { "inputs": { "input_string": ["Paris is the [MASK] of France.", "Today is a [MASK] day!"] } } #### Sample output json [ { "0": "capital" }, { "0": "beautiful" } ]

Version: 11

Tags

Preview computes_allow_list : ['Standard_NV12s_v3', 'Standard_NV24s_v3', 'Standard_NV48s_v3', 'Standard_NC6s_v3', 'Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_NC24rs_v3', 'Standard_NC6s_v2', 'Standard_NC12s_v2', 'Standard_NC24s_v2', 'Standard_NC24rs_v2', 'Standard_NC4as_T4_v3', 'Standard_NC8as_T4_v3', 'Standard_NC16as_T4_v3', 'Standard_NC64as_T4_v3', 'Standard_ND6s', 'Standard_ND12s', 'Standard_ND24s', 'Standard_ND24rs', 'Standard_ND40rs_v2', 'Standard_ND96asr_v4'] license : mit model_specific_defaults : ordereddict([('apply_deepspeed', 'true'), ('apply_lora', 'true'), ('apply_ort', 'true')]) task : fill-mask

View in Studio: https://ml.azure.com/registries/azureml/models/microsoft-deberta-base/version/11

License: mit

Properties

SHA: 0d1b43ccf21b5acd9f4e5f7b077fa698f05cf195

datasets:

evaluation-min-sku-spec: 8|0|28|56

evaluation-recommended-sku: Standard_DS4_v2

finetune-min-sku-spec: 4|1|28|176

finetune-recommended-sku: Standard_NC24rs_v3

finetuning-tasks: text-classification, token-classification, question-answering

inference-min-sku-spec: 2|0|14|28

inference-recommended-sku: Standard_DS3_v2, Standard_D4a_v4, Standard_D4as_v4, Standard_DS4_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_DS5_v2, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E2s_v3, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4, Standard_ND40rs_v2

languages: en

Wiki menu

Home
Reference Documentation
- Components
- Data
- Environments
- Models
Contributing

Provide feedback

Saved searches