models Salesforce BLIP 2 opt 2 7b image to text

github-actions[bot] edited this page Dec 23, 2023 · 12 revisions

Salesforce-BLIP-2-opt-2-7b-image-to-text

Overview

BLIP-2 is a model consisting of three components: a CLIP-like image encoder, a Querying Transformer (Q-Former), and a large language model. The image encoder and language model are initialized from pre-trained checkpoints and kept frozen while the Querying Transformer is trained. The model is trained to predict the next text token given the query embeddings and the previous text, which makes it useful for tasks such as image captioning, visual question answering, and chat-like conversations. However, the model inherits the risks and limitations of the off-the-shelf OPT language model it uses, including bias, safety issues, generation diversity issues, and potential vulnerability to inappropriate content or inherent biases in the underlying data. Researchers should carefully assess the safety and fairness of the model before deploying it in any real-world application.

The above summary was generated using ChatGPT. Review the original model card to understand the training data, evaluation metrics, license, intended uses, limitations, and bias before using the model.

Inference samples

Inference type   Python sample (Notebook)              CLI with YAML
Real time        image-to-text-online-endpoint.ipynb   image-to-text-online-endpoint.sh
Batch            image-to-text-batch-endpoint.ipynb    image-to-text-batch-endpoint.sh

Sample inputs and outputs (for real-time inference)

Sample input

{
   "input_data":{
      "columns":[
         "image"
      ],
      "index":[0, 1],
      "data":[
         ["image1"],
         ["image2"]
      ]
   }
}

Note:

  • "image1" and "image2" are placeholders. Each must be either a publicly accessible URL or a base64-encoded string.
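
The request body above can be assembled programmatically. The following is a minimal sketch, assuming each input is either a URL string or the raw bytes of an image file. The helper name `build_payload`, the example URL, and the placeholder bytes are illustrative and not part of the model's samples.

```python
import base64
import json


def build_payload(images):
    """Build the request body for a list of image inputs.

    Each item is either a URL string (passed through unchanged) or raw
    image bytes (base64-encoded here). Hypothetical helper for illustration.
    """
    rows = []
    for item in images:
        if isinstance(item, bytes):
            # Raw image bytes become a base64 string, as the note requires.
            item = base64.b64encode(item).decode("ascii")
        rows.append([item])
    return {
        "input_data": {
            "columns": ["image"],
            "index": list(range(len(rows))),
            "data": rows,
        }
    }


payload = build_payload([
    "https://example.com/image1.jpg",  # placeholder URL (assumption)
    b"\x89PNG placeholder bytes",      # stands in for real image file contents
])
body = json.dumps(payload)  # JSON string to send as the request body
```

In practice the bytes would come from reading an image file in binary mode. How the body is sent to the endpoint is covered by the notebook and CLI samples listed above.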

Sample output

[
   {
      "text": "a stream running through a forest with rocks and trees"
   },
   {
      "text": "a grassy hillside with trees and a sunset"
   }
]
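
A response of this shape can be reduced to a plain list of captions. This is a small sketch that assumes the response arrives as a JSON string with one object per input image, in input order, as in the sample output above. The function name `extract_captions` is illustrative.

```python
import json

# Sample response text, copied from the sample output above.
response_text = """
[
   {"text": "a stream running through a forest with rocks and trees"},
   {"text": "a grassy hillside with trees and a sunset"}
]
"""


def extract_captions(raw):
    """Return the "text" field of each result object (hypothetical helper)."""
    results = json.loads(raw)
    return [result["text"] for result in results]


captions = extract_captions(response_text)
for index, caption in enumerate(captions):
    print(index, caption)
```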

Model inference - image to text

For the sample image below, the output text is "a grassy hillside with trees and a sunset".

[Image: blip2-opt-2.7b image-to-text]

Version: 2

Tags

  • Preview
  • license : mit
  • task : image-to-text
  • SharedComputeCapacityEnabled
  • huggingface_model_id : Salesforce/blip2-opt-2.7b
  • author : Salesforce
  • inference_compute_allow_list : ['Standard_DS5_v2', 'Standard_D8a_v4', 'Standard_D8as_v4', 'Standard_D16a_v4', 'Standard_D16as_v4', 'Standard_D32a_v4', 'Standard_D32as_v4', 'Standard_D48a_v4', 'Standard_D48as_v4', 'Standard_D64a_v4', 'Standard_D64as_v4', 'Standard_D96a_v4', 'Standard_D96as_v4', 'Standard_FX4mds', 'Standard_F8s_v2', 'Standard_FX12mds', 'Standard_F16s_v2', 'Standard_F32s_v2', 'Standard_F48s_v2', 'Standard_F64s_v2', 'Standard_F72s_v2', 'Standard_FX24mds', 'Standard_FX36mds', 'Standard_FX48mds', 'Standard_E4s_v3', 'Standard_E8s_v3', 'Standard_E16s_v3', 'Standard_E32s_v3', 'Standard_E48s_v3', 'Standard_E64s_v3', 'Standard_NC4as_T4_v3', 'Standard_NC6s_v3', 'Standard_NC8as_T4_v3', 'Standard_NC12s_v3', 'Standard_NC16as_T4_v3', 'Standard_NC24s_v3', 'Standard_NC64as_T4_v3', 'Standard_NC24ads_A100_v4', 'Standard_NC48ads_A100_v4', 'Standard_NC96ads_A100_v4', 'Standard_ND96asr_v4', 'Standard_ND96amsr_A100_v4', 'Standard_ND40rs_v2']

View in Studio: https://ml.azure.com/registries/azureml/models/Salesforce-BLIP-2-opt-2-7b-image-to-text/version/2

License: mit

Properties

SharedComputeCapacityEnabled: True

SHA: 6e723d92ee91ebcee4ba74d7017632f11ff4217b

inference-min-sku-spec: 4|0|32|64

inference-recommended-sku: Standard_DS5_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4, Standard_ND40rs_v2

model_id: Salesforce/blip2-opt-2.7b
