
[FEATURE] Embedding types support in neural search #1006

Open
zane-neo opened this issue Dec 9, 2024 · 4 comments
zane-neo commented Dec 9, 2024

Is your feature request related to a problem?

The Bedrock Titan V2 model and the Cohere v2 API now support multiple embedding types: you can specify the desired embedding types as a list in the request, and the model returns results for each requested type in the response, e.g.:

import cohere
import requests
import base64

co = cohere.ClientV2()

# Fetch an image and encode it as a base64 data URI, as the embed API expects.
image = requests.get("https://cohere.com/favicon-32x32.png")
stringified_buffer = base64.b64encode(image.content).decode("utf-8")
content_type = image.headers["Content-Type"]
image_base64 = f"data:{content_type};base64,{stringified_buffer}"

# Request three embedding types in a single call.
response = co.embed(
    model="embed-english-v3.0",
    input_type="image",
    embedding_types=["float", "binary", "int8"],
    images=[image_base64],
)

print(response)

Response:

{
  "id": "5807ee2e-0cda-445a-9ec8-864c60a06606",
  "embeddings": {
    "float": [
      [
        -0.007247925,
        -0.041229248,
        -0.023223877,
       ......
      ]
    ],
    "binary": [
        [
            12,
            -13,
            14
        ]
    ],
    "int8": [
        [
            19,
            20,
            21
        ]
    ]
  },
  "texts": [],
  "images": [
    {
      "width": 400,
      "height": 400,
      "format": "jpeg",
      "bit_depth": 24
    }
  ],
  "meta": {
    "api_version": {
      "version": "2"
    },
    "billed_units": {
      "images": 1
    }
  }
}

But neural-search hard-codes the embedding result type to FLOAT; we need to support additional number types in model responses.
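For illustration, here is a minimal Python sketch of the desired type-preserving behavior (the actual neural-search code is Java; the converter table and function name below are hypothetical):

# Hypothetical sketch: keep each embedding type's native numbers instead of
# casting everything to float.
EMBEDDING_TYPE_CONVERTERS = {
    "float": float,   # 32-bit floats
    "int8": int,      # signed 8-bit integers
    "binary": int,    # packed bits, returned as signed bytes
}

def parse_embeddings(response: dict) -> dict:
    """Return {embedding_type: [[numbers...]]} without forcing a float cast."""
    parsed = {}
    for emb_type, vectors in response["embeddings"].items():
        convert = EMBEDDING_TYPE_CONVERTERS.get(emb_type)
        if convert is None:
            raise ValueError(f"Unsupported embedding type: {emb_type}")
        parsed[emb_type] = [[convert(v) for v in vec] for vec in vectors]
    return parsed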

What solution would you like?

A clear and concise description of what you want to happen.

What alternatives have you considered?

A clear and concise description of any alternative solutions or features you've considered.

Do you have any additional context?

Add any other context or screenshots about the feature request here.

@martin-gaievski
Member

How will those different data types be used later in OpenSearch? In the case of data ingestion, are we going to store embeddings using the type returned by the model?

@zane-neo
Collaborator Author

> How will those different data types be used later in OpenSearch? In the case of data ingestion, are we going to store embeddings using the type returned by the model?

Yes. InferenceProcessor already treats the result as a generic type, so the only change needed is in the underlying code, to ensure the number types are not cast to float.
For now, k-NN supports the float and byte types, so with this change we can at least support the byte case; if k-NN supports more types in the future, no code changes will be needed in neural-search.
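For context, k-NN's byte support is declared in the index mapping. Below is a minimal sketch of creating a byte-typed knn_vector index over the REST API; the index name, field name, and dimension are made up for illustration:

import requests

# Hypothetical index/field names; dimension must match the model's output.
index_body = {
    "settings": {"index.knn": True},
    "mappings": {
        "properties": {
            "passage_embedding": {
                "type": "knn_vector",
                "dimension": 1024,
                "data_type": "byte",  # store int8 embeddings instead of float
                "method": {"name": "hnsw", "engine": "lucene", "space_type": "l2"},
            }
        }
    },
}

resp = requests.put("http://localhost:9200/my-byte-index", json=index_body)
print(resp.json())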

@martin-gaievski
Member

@zane-neo Understood, can you give me an example of an end-to-end use case? It looks like everything should be aligned for this new data type of model embeddings: index mapping, model metadata/remote connector, and anything else I'm missing. Today, when the type is only float, we either work or fail; are we going to add default fallback logic if one of the pieces is misconfigured?

@zane-neo
Collaborator Author

An example can be found here: https://opensearch.org/docs/latest/search-plugins/semantic-search/. Users should make sure ingestion and query match by using the same model and the same embedding types, which can be done by adding the embedding-types configuration to the model's connector (see the sketch below). We don't cover index mapping and the like, since users should already be familiar with it if they want to use text embedding in an index. Because the request body template for a remote model is currently predefined, using the same model for ingestion and query guarantees correct behavior, so we don't need default fallback logic.
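For illustration, here is a rough sketch of what pinning the embedding types in a remote connector could look like, following the existing ml-commons connector blueprint format; treating `embedding_types` as a connector parameter here is a hypothetical extension of that format, not a finalized design:

import requests

# Hypothetical connector blueprint: embedding_types is fixed in the request
# body template so ingestion and query always use the same types.
connector_body = {
    "name": "Cohere embed connector (int8)",
    "protocol": "http",
    "parameters": {"model": "embed-english-v3.0", "embedding_types": ["int8"]},
    "credential": {"cohere_key": "<YOUR_COHERE_API_KEY>"},
    "actions": [
        {
            "action_type": "predict",
            "method": "POST",
            "url": "https://api.cohere.com/v2/embed",
            "headers": {"Authorization": "Bearer ${credential.cohere_key}"},
            "request_body": "{ \"texts\": ${parameters.texts}, \"model\": \"${parameters.model}\", \"input_type\": \"search_document\", \"embedding_types\": ${parameters.embedding_types} }",
        }
    ],
}

resp = requests.post(
    "http://localhost:9200/_plugins/_ml/connectors/_create", json=connector_body
)
print(resp.json())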
