pgvector: ensure vector is sent in binary representation
PostgreSQL supports two methods of passing data from client to server: text and binary. For many data types the difference may not be noticeable, but converting a vector from binary => text => binary representation has a significant performance impact. See the previous explanation here[1].

The pgvector loading code accounts for this, but the query code did not. This is due to the use of a list[float] type, which the pgvector-python adapter currently doesn't support. However, the adapter does support direct binary transfer if the data is represented as a NumPy array[2].

Testing shows that moving to a direct binary representation has a significant impact on query performance (my tests show a 3x difference) and provides a more accurate representation of how this workload would execute.

[1] erikbern/ann-benchmarks#488
[2] https://github.com/pgvector/pgvector-python?tab=readme-ov-file#psycopg-3
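
As a rough illustration of the change described above, here is a minimal sketch of converting the query vector to a NumPy array before handing it to the driver. It is not the actual diff from this commit. The helper names, the `items` table, and the `id` and `embedding` columns are hypothetical. The cursor is assumed to be a psycopg 3 cursor on a connection where `pgvector.psycopg.register_vector(conn)` has already been called.

```python
import numpy as np


def to_query_vector(v):
    """Return v as a 1-D float32 NumPy array.

    Hypothetical helper: the point is only that the adapter receives a
    NumPy array rather than a plain list[float].
    """
    arr = np.asarray(v, dtype=np.float32)
    if arr.ndim != 1:
        raise ValueError("expected a one-dimensional vector")
    return arr


def knn_query(cur, v, k):
    """Run a k-nearest-neighbour query, passing the vector as a NumPy array.

    Assumptions: `cur` is a psycopg 3 cursor whose connection has had
    pgvector.psycopg.register_vector() applied; the table `items` and the
    columns `id` and `embedding` are made up for this example.
    """
    cur.execute(
        # `<->` is pgvector's L2 distance operator.
        "SELECT id FROM items ORDER BY embedding <-> %s LIMIT %s",
        (to_query_vector(v), k),
        binary=True,  # psycopg 3 option: ask for results in binary format
    )
    return [row[0] for row in cur.fetchall()]
```

A caller that currently passes `v.tolist()` or a plain Python list would instead pass the list through `to_query_vector` (or pass the NumPy array directly), leaving the rest of the query path unchanged.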