
models runwayml stable diffusion v1 5

github-actions[bot] edited this page Oct 23, 2023 · 21 revisions

runwayml-stable-diffusion-v1-5

Overview

runwayml/stable-diffusion-v1-5 is a text-to-image latent diffusion model that generates photo-realistic images from text prompts. It uses a fixed, pretrained text encoder (CLIP ViT-L/14), as suggested in the Imagen paper. The Stable-Diffusion-v1-5 checkpoint was fine-tuned from an earlier checkpoint, stable-diffusion-v1-2, on the laion-aesthetics v2 5+ dataset. During training, images and text prompts are encoded, and the model is optimized with a reconstruction objective between the noise added to the image latent and the noise the model predicts.

The model is intended for research, art, education, and creative tools. Its use is restricted to prevent misuse and malicious activity: it must not be used to create harmful, offensive, or discriminatory content. To enhance safety, a Safety Checker is recommended for use with this model.

The model has known limitations. It does not achieve perfect photorealism, it cannot reliably render legible text, and it struggles with complex compositions. Its training data, the LAION-2B (en) dataset, consists primarily of images with English descriptions, which introduces biases and reduces output quality for non-English prompts.

The above summary was generated using ChatGPT. Review the original-model-card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.

Inference samples

Inference type | Python sample (Notebook)            | CLI with YAML
Real time      | text-to-image-online-endpoint.ipynb | text-to-image-online-endpoint.sh
Batch          | text-to-image-batch-endpoint.ipynb  | text-to-image-batch-endpoint.sh

Inference with Azure AI Content Safety (AACS) samples

Inference type | Python sample (Notebook)
Real time      | safe-text-to-image-online-deployment.ipynb
Batch          | safe-text-to-image-batch-endpoint.ipynb

Sample inputs and outputs (for real-time inference)

Sample input

{
    "input_data": {
        "columns": ["prompt"],
        "data": ["a photograph of an astronaut riding a horse", "lion holding hunted deer in grass fields"],
        "index": [0, 1]
    }
}
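
The request layout above can be assembled programmatically. The following is a minimal sketch, not the method used by the linked notebooks: `build_payload` and `build_request` are hypothetical helper names, `scoring_uri` and `api_key` are placeholders for your own deployment's values, and the `Authorization: Bearer` header assumes the endpoint uses key-based authentication. The request is only prepared, never sent.

```python
import json
import urllib.request


def build_payload(prompts):
    """Wrap a list of prompt strings in the request layout shown above."""
    prompts = list(prompts)
    return {
        "input_data": {
            "columns": ["prompt"],
            "data": prompts,
            # One index entry per prompt: 0, 1, 2, ...
            "index": list(range(len(prompts))),
        }
    }


def build_request(scoring_uri, api_key, prompts):
    """Prepare (but do not send) a POST request for a real-time endpoint.

    scoring_uri and api_key are placeholders (assumptions), not values
    taken from this page.
    """
    body = json.dumps(build_payload(prompts)).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumes key-based authentication on the endpoint.
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )


if __name__ == "__main__":
    payload = build_payload(["a photograph of an astronaut riding a horse"])
    print(json.dumps(payload, indent=4))
```

To actually send the request, the object returned by `build_request` could be passed to `urllib.request.urlopen`; generating the `index` list from the prompt count keeps it in step with `data` when prompts are added or removed.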

Sample output

[
    {
        "prompt": "a photograph of an astronaut riding a horse",
        "generated_image": "image1",
        "nsfw_content_detected": false
    },
    {
        "prompt": "lion holding hunted deer in grass fields",
        "generated_image": "image2",
        "nsfw_content_detected": true
    }
]

Note:

  • "image1" and "image2" are placeholders for base64-encoded image strings.
  • If "nsfw_content_detected" is true, the generated image is entirely black.
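
The two notes above suggest how a client might post-process the response. The sketch below assumes the endpoint's response body is a JSON array with the three fields shown in the sample output and standard JSON booleans; `decode_results` is a hypothetical helper name, not part of any SDK.

```python
import base64
import json


def decode_results(response_text):
    """Return a list of (prompt, image_bytes_or_None) pairs.

    Entries flagged by the safety checker are returned as None,
    because their image is entirely black and carries no content.
    """
    results = []
    for item in json.loads(response_text):
        if item["nsfw_content_detected"]:
            results.append((item["prompt"], None))
        else:
            # The image field holds a base64 string; decode it to raw bytes.
            image_bytes = base64.b64decode(item["generated_image"])
            results.append((item["prompt"], image_bytes))
    return results
```

The returned bytes can be written to disk with `open(path, "wb")`. This page does not state the image format, so the file extension to use is left to the caller.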

Model inference: visualization for the prompt - "a photograph of an astronaut riding a horse"

[Image: runwayml_stable_diffusion_v1_5 visualization]

Version: 4

Tags

Preview
license : creativeml-openrail-m
task : text-to-image

View in Studio: https://ml.azure.com/registries/azureml/models/runwayml-stable-diffusion-v1-5/version/4

License: creativeml-openrail-m

Properties

SHA: 1d0c4ebf6ff58a5caecab40fa1406526bca4b5b9

datasets: LAION-2B (en)

inference-min-sku-spec: 4|1|28|176

inference-recommended-sku: Standard_NC6s_v3, Standard_NC12s_v3, Standard_NC24s_v3, Standard_NC24rs_v3, Standard_NC16as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC4as_T4_v3, Standard_NC64as_T4_v3, Standard_NC8as_T4_v3, Standard_NC96ads_A100_v4, Standard_ND40rs_v2, Standard_ND96amsr_A100_v4, Standard_ND96asr_v4

model_id: runwayml/stable-diffusion-v1-5
