-
Notifications
You must be signed in to change notification settings - Fork 128
models Salesforce BLIP 2 opt 2 7b vqa
BLIP-2
is a model consisting of three components: a CLIP-like image encoder, a Querying Transformer (Q-Former), and a large language model. The image encoder and language model are initialized from pre-trained checkpoints and kept frozen while training the Querying Transformer. The model's goal is to predict the next text token given query embeddings and previous text, making it useful for tasks such as image captioning, visual question answering, and chat-like conversations. However, the model inherits the same risks and limitations as the off-the-shelf OPT language model it uses, including bias, safety issues, generation diversity issues, and potential vulnerability to inappropriate content or inherent biases in the underlying data. Researchers should carefully assess the safety and fairness of the model before deploying it in any real-world applications.
The above summary was generated using ChatGPT. Review the original-model-card to understand the data used to train the model, evaluation metrics, license, intended uses, limitations and bias before using the model.
Inference type | Python sample (Notebook) | CLI with YAML |
---|---|---|
Real time | visual-question-answering-online-endpoint.ipynb | visual-question-answering-online-endpoint.sh |
Batch | visual-question-answering-batch-endpoint.ipynb | visual-question-answering-batch-endpoint.sh |
{
"input_data":{
"columns":[
"image",
"text"
],
"index":[0, 1],
"data":[
["image1", "What is in the picture? Answer: "],
["image2", "what are people doing? Answer: "]
]
}
}
Note:
- "image1" and "image2" should be publicly accessible urls or strings in
base64
format.
[
{
"text": "a stream in the desert"
},
{
"text": "they're buying coffee"
}
]
For sample image below and text prompt "what are people doing? Answer: ", the output text is "they're buying coffee".
Version: 2
Preview
license : mit
task : visual-question-answering
View in Studio: https://ml.azure.com/registries/azureml/models/Salesforce-BLIP-2-opt-2-7b-vqa/version/2
License: mit
SHA: 6e723d92ee91ebcee4ba74d7017632f11ff4217b
inference-min-sku-spec: 4|0|32|64
inference-recommended-sku: Standard_DS5_v2, Standard_D8a_v4, Standard_D8as_v4, Standard_D16a_v4, Standard_D16as_v4, Standard_D32a_v4, Standard_D32as_v4, Standard_D48a_v4, Standard_D48as_v4, Standard_D64a_v4, Standard_D64as_v4, Standard_D96a_v4, Standard_D96as_v4, Standard_FX4mds, Standard_F8s_v2, Standard_FX12mds, Standard_F16s_v2, Standard_F32s_v2, Standard_F48s_v2, Standard_F64s_v2, Standard_F72s_v2, Standard_FX24mds, Standard_FX36mds, Standard_FX48mds, Standard_E4s_v3, Standard_E8s_v3, Standard_E16s_v3, Standard_E32s_v3, Standard_E48s_v3, Standard_E64s_v3, Standard_NC4as_T4_v3, Standard_NC6s_v3, Standard_NC8as_T4_v3, Standard_NC12s_v3, Standard_NC16as_T4_v3, Standard_NC24s_v3, Standard_NC64as_T4_v3, Standard_NC24ads_A100_v4, Standard_NC48ads_A100_v4, Standard_NC96ads_A100_v4, Standard_ND96asr_v4, Standard_ND96amsr_A100_v4, Standard_ND40rs_v2
model_id: Salesforce/blip2-opt-2.7b