.Net: Using the VectorStoreGenericDataModel when the Key data type is unknown at compile time #9701

Open
f2bo opened this issue Nov 14, 2024 · 4 comments
Labels
memory connector .NET Issue or Pull requests regarding .NET code

Comments

@f2bo

f2bo commented Nov 14, 2024

Assume a scenario where the vector store record definitions are loaded from a configuration file. For example:

{
    "collections": {
        "articles": {
            "Key": "string",
            "Name": "string",
            "Title": "string",
            "Body": "string",
            "BodyEmbedding": "float[384]"
        },
        "glossary": {
            "Key": "int",
            "Term": "string",
            "Definition": "string",
            "DefinitionEmbedding": "float[1536]"
        }
    }
}

This file is read at runtime to create a VectorStoreRecordDefinition for a given collection. Notice that the Key property has different data types for each collection, string and int respectively.

string collectionName = "articles";

// Load the definitions from configuration and define a schema for the specified collection
VectorStoreRecordDefinition vectorStoreRecordDefinition = LoadVectorDefinitionForCollection(collectionName);
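
For illustration, here is a minimal sketch of how the type names in the configuration file might be mapped to CLR types before building the record definition. ConfigTypeParser is a hypothetical helper, not part of Semantic Kernel, and using ReadOnlyMemory<float> for the "float[N]" entries is an assumption about how the embeddings would be stored.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Semantic Kernel: maps the type names used in
// the configuration file to CLR types.
public static class ConfigTypeParser
{
    // Matches vector declarations such as "float[384]".
    private static readonly Regex VectorPattern =
        new Regex(@"^float\[(\d+)\]$", RegexOptions.Compiled);

    // Returns the CLR type for a configured type name. For vector types the
    // dimension count is returned through 'dimensions'; otherwise it is null.
    public static Type Parse(string typeName, out int? dimensions)
    {
        dimensions = null;

        switch (typeName)
        {
            case "string": return typeof(string);
            case "int": return typeof(int);
        }

        Match match = VectorPattern.Match(typeName);
        if (match.Success)
        {
            dimensions = int.Parse(match.Groups[1].Value);
            // Assumption: embeddings are represented as ReadOnlyMemory<float>.
            return typeof(ReadOnlyMemory<float>);
        }

        throw new NotSupportedException(
            $"Unsupported type name '{typeName}' in configuration.");
    }
}
```

The Type returned for the "Key" entry is exactly the piece of information that is only available at runtime, which is what leads to the problem described next.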

Once a record definition has been created, it's time to operate on the corresponding collection. The generic data model is meant to be used in scenarios where the database schema is unknown at compile time. However, its key data type still needs to be known at compile time.

// get a reference to the collection
var collection = vectorStore.GetCollection<string, VectorStoreGenericDataModel<??????>>(collectionName, vectorStoreRecordDefinition);

How do you use it when the key data type can only be specified at runtime? Is there a pattern that you recommend in such a scenario?

@markwallace-microsoft markwallace-microsoft added .NET Issue or Pull requests regarding .NET code triage labels Nov 14, 2024
@github-actions github-actions bot changed the title Using the VectorStoreGenericDataModel when the Key data type is unknown at compile time .Net: Using the VectorStoreGenericDataModel when the Key data type is unknown at compile time Nov 14, 2024
@markwallace-microsoft markwallace-microsoft added question Further information is requested and removed triage labels Nov 14, 2024
@westey-m
Contributor

Thanks for the scenario @f2bo, we haven't really considered it before.
A possible solution would be to support object as a key type with the VectorStoreGenericDataModel, casting back and forth to the key type defined in the VectorStoreRecordDefinition. E.g.

var collection = vectorStore.GetCollection<object, VectorStoreGenericDataModel<object>>(collectionName, vectorStoreRecordDefinition);

Note that if the VectorStoreRecordDefinition says that the key type is int, an int would have to be supplied as the key for methods such as GetAsync even though the signature would accept an object. E.g.

object key = 5;
await collection.GetAsync(key);

We would need to do some prototyping on this, and add support in each vector store implementation, but let us know if this would work for your use case.
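
Since the compiler could no longer catch a mismatched key under this proposal, a caller might want to check the key against the declared key type before calling into the collection. The following is only a sketch of such a guard. RuntimeKey is a hypothetical helper, not part of Semantic Kernel, and it assumes the declared key type is available as a System.Type.

```csharp
using System;

// Hypothetical guard, not part of Semantic Kernel.
public static class RuntimeKey
{
    // Returns 'key' unchanged if its runtime type matches 'declaredKeyType';
    // otherwise throws, so a mismatch is reported before reaching the store.
    public static object Require(object key, Type declaredKeyType)
    {
        if (key is null)
        {
            throw new ArgumentNullException(nameof(key));
        }

        if (!declaredKeyType.IsInstanceOfType(key))
        {
            throw new ArgumentException(
                $"Key of type {key.GetType().Name} cannot be used with a " +
                $"collection whose key type is {declaredKeyType.Name}.",
                nameof(key));
        }

        return key;
    }
}
```

With a definition whose key type is int, RuntimeKey.Require(5, typeof(int)) returns the boxed 5, while RuntimeKey.Require("5", typeof(int)) throws an ArgumentException.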

@westey-m westey-m moved this to Backlog in Semantic Kernel Nov 14, 2024
@westey-m westey-m added memory connector and removed question Further information is requested labels Nov 14, 2024
@f2bo
Author

f2bo commented Nov 14, 2024

> let us know if this would work for your use case.

I hadn't used the generic data model before and only just started experimenting with an idea when I noticed this problem, so it's too early to tell. It does feel less robust but I imagine that it would work.

I'll report back if I find that using object is impractical.

Thank you!

@westey-m
Contributor

> It does feel less robust

@f2bo, did you also have another solution in mind or do you just mean object is less robust than using the strongly typed key types?

@f2bo
Author

f2bo commented Nov 14, 2024

Sorry. Don't attach too much weight to my comment. I did mean it felt less robust than using a strongly typed key type but I suppose that given that the data type is determined at runtime, you probably can't do better than this. As I said, I don't yet have a complete picture of where I'm going with this.
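
One way to limit how far object has to travel is to resolve the key type once and keep everything behind that point strongly typed. The sketch below shows the general shape of that idea using stand-in types only. IRuntimeKeyedStore, KeyedStore<TKey> and KeyedStoreFactory are all hypothetical and are not Semantic Kernel APIs; a dictionary stands in for the actual collection.

```csharp
using System;
using System.Collections.Generic;

// All names below are hypothetical; none of them are Semantic Kernel types.

// Non-generic surface for code that only learns the key type at runtime.
public interface IRuntimeKeyedStore
{
    Type KeyType { get; }
    void Upsert(object key, string value);
    bool TryGet(object key, out string value);
}

// Strongly typed implementation. The conversion from object to TKey happens
// in one place (Unbox), so the rest of the class works with TKey only.
public sealed class KeyedStore<TKey> : IRuntimeKeyedStore where TKey : notnull
{
    private readonly Dictionary<TKey, string> _items = new Dictionary<TKey, string>();

    public Type KeyType => typeof(TKey);

    public void Upsert(object key, string value) => _items[Unbox(key)] = value;

    public bool TryGet(object key, out string value) =>
        _items.TryGetValue(Unbox(key), out value);

    private static TKey Unbox(object key) =>
        key is TKey typed
            ? typed
            : throw new ArgumentException(
                $"Expected a key of type {typeof(TKey).Name}.", nameof(key));
}

public static class KeyedStoreFactory
{
    // Picks the closed generic type once, based on the key type read from
    // configuration. Only string and int are handled in this sketch.
    public static IRuntimeKeyedStore Create(Type keyType)
    {
        if (keyType == typeof(string)) return new KeyedStore<string>();
        if (keyType == typeof(int)) return new KeyedStore<int>();
        throw new NotSupportedException($"Unsupported key type {keyType.Name}.");
    }
}
```

The trade-off is the same one discussed above: callers still pass object, but a wrong key type fails immediately at the boundary with a clear message rather than somewhere inside the store.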

Projects
Status: Backlog

3 participants