How to get dataset size in vector db client #417

Open
hust-xing opened this issue Dec 9, 2024 · 1 comment

Comments

@hust-xing

I'm adding a new vector DB client and I need to get the dataset size inside the optimize function. How can I get it?

@alwayslove2013
Collaborator

alwayslove2013 commented Dec 10, 2024

@hust-xing Since the optimize function is executed after the insert, I strongly recommend querying the amount of inserted data directly from the vector database server. This way, you won't need to modify the VDBBench testing framework.
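
As a rough illustration of this first approach, the sketch below asks the server for its row count inside optimize(). Everything here is a hypothetical stand-in: `FakeServerClient`, its `count()` method, and `MyVectorDB` are not part of VDBBench or any real SDK, so substitute whatever count or stats call your database actually exposes.

```python
from contextlib import contextmanager


class FakeServerClient:
    """Hypothetical stand-in for a real vector database SDK client."""

    def __init__(self):
        self._rows = []

    def insert(self, vectors):
        self._rows.extend(vectors)

    def count(self):
        # A real SDK would expose its own count/stats call here.
        return len(self._rows)


class MyVectorDB:
    """Hypothetical client that follows the init()/optimize() shape."""

    def __init__(self):
        self._server = FakeServerClient()
        self.client = None

    @contextmanager
    def init(self):
        # Open the connection for the duration of the with-block.
        self.client = self._server
        try:
            yield
        finally:
            self.client = None

    def insert_embeddings(self, embeddings):
        self.client.insert(embeddings)
        return len(embeddings)

    def optimize(self):
        # Ask the server how much data was inserted, so the framework
        # does not have to pass the dataset size in.
        data_size = self.client.count()
        # ... tune index parameters based on data_size ...
        return data_size
```

Because the size comes from the server, the framework's call to optimize() stays unchanged.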

The second approach is to modify the VDBBench test code by passing the current dataset size self.ca.dataset.data.size to self.db.optimize(). However, this modification may impact the testing of other vector database clients.

@utils.time_it
def _task(self) -> None:
    with self.db.init():
        self.db.optimize()
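
A minimal sketch of what that second approach could look like, using stand-in classes rather than the real VDBBench code. The `data_size` keyword, `SizeAwareDB`, `LegacyDB`, and `Runner` are all assumptions made for illustration. The signature check is one possible way to avoid breaking clients whose optimize() takes no arguments. It is not something the framework is known to do.

```python
import inspect
from contextlib import contextmanager
from typing import Optional


class SizeAwareDB:
    """Hypothetical new client whose optimize() accepts the dataset size."""

    def __init__(self):
        self.seen_size: Optional[int] = None

    @contextmanager
    def init(self):
        yield

    def optimize(self, data_size: Optional[int] = None) -> None:
        self.seen_size = data_size


class LegacyDB:
    """Hypothetical existing client with the original optimize() signature."""

    def __init__(self):
        self.called = False

    @contextmanager
    def init(self):
        yield

    def optimize(self) -> None:
        self.called = True


class Runner:
    """Stand-in for the runner; dataset_size plays the role of
    self.ca.dataset.data.size in the real code."""

    def __init__(self, db, dataset_size: int):
        self.db = db
        self.dataset_size = dataset_size

    def _task(self) -> None:
        with self.db.init():
            # Pass the size only to clients that declare the parameter,
            # so clients with the old signature keep working.
            params = inspect.signature(self.db.optimize).parameters
            if "data_size" in params:
                self.db.optimize(data_size=self.dataset_size)
            else:
                self.db.optimize()
```

Without a guard like this, every existing client's optimize() would need to be updated to accept the extra argument.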
