Skip to content

models gpt2

github-actions[bot] edited this page Nov 17, 2023 · 25 revisions

gpt2

Overview

GPT-2 is a transformer-based language model intended for AI researchers and practitioners. It was trained on unfiltered content from Reddit and may have biases. It is best used for text generation, but the training data has not been publicly released. It has several limitations and should be used with caution in situations that require truth and in systems that interact with humans. There are different versions of the model, including GPT-Large, GPT-Medium, and GPT-Xl, available for different use cases. The information provided by the OpenAI team is to complete and give specific examples of bias in their model.

The above summary was generated using ChatGPT. Review the original model card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.

Inference samples

Inference type Python sample (Notebook) CLI with YAML
Real time text-generation-online-endpoint.ipynb text-generation-online-endpoint.sh
Batch text-generation-batch-endpoint.ipynb coming soon

Finetuning samples

Task Use case Dataset Python sample (Notebook) CLI with YAML
Text Classification Emotion Detection Emotion emotion-detection.ipynb emotion-detection.sh
Token Classification Named Entity Recognition Conll2003 named-entity-recognition.ipynb named-entity-recognition.sh

Model Evaluation

Task Use case Dataset Python sample (Notebook) CLI with YAML
Text generation Text generation cnn_dailymail evaluate-model-text-generation.ipynb evaluate-model-text-generation.yml

Sample inputs and outputs (for real-time inference)

Sample input

{
    "input_data": {
        "input_string": ["My name is John and I am", "Once upon a time,"]
    }
}

Sample output

[
    {
        "0": "My name is John and I am a student at UC Berkeley. It is my main interest to do research in the humanities. I am going to share"
    },
    {
        "0": "Once upon a time, they were just another small family, only three. She says one day that her father was getting a new license"
    }
]

Version: 10

Tags

Preview computes_allow_list : ['Standard_NV12s_v3', 'Standard_NV24s_v3', 'Standard_NV48s_v3', 'Standard_NC6s_v3', 'Standard_NC12s_v3', 'Standard_NC24s_v3', 'Standard_NC24rs_v3', 'Standard_NC6s_v2', 'Standard_NC12s_v2', 'Standard_NC24s_v2', 'Standard_NC24rs_v2', 'Standard_NC4as_T4_v3', 'Standard_NC8as_T4_v3', 'Standard_NC16as_T4_v3', 'Standard_NC64as_T4_v3', 'Standard_ND6s', 'Standard_ND12s', 'Standard_ND24s', 'Standard_ND24rs', 'Standard_ND40rs_v2', 'Standard_ND96asr_v4'] license : mit model_specific_defaults : ordereddict({'apply_deepspeed': 'true', 'apply_lora': 'true', 'apply_ort': 'true'}) task : text-generation

View in Studio: https://ml.azure.com/registries/azureml/models/gpt2/version/10

License: mit

Properties

SHA: 11c5a3d5811f50298f278a704980280950aedb10

datasets:

evaluation-min-sku-spec: 8|0|28|56

evaluation-recommended-sku: Standard_DS4_v2

finetune-min-sku-spec: 4|1|28|176

finetune-recommended-sku: Standard_NC12s_v3

finetuning-tasks: text-classification, token-classification

inference-min-sku-spec: 2|0|7|14

inference-recommended-sku: Standard_DS2_v2, Standard_D2a_v4, Standard_D2as_v4, Standard_DS3_v2, Standard_D4a_v4, Standard_D4as_v4, Standard_DS4_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_DS5_v2, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_F4s_v2, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E2s_v3, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4, Standard_ND40rs_v2

languages: en

Clone this wiki locally