
Commit

Fix wording, example with two mapped dimensions does not support NN/HNSW
Jo Kristian Bergum committed Sep 23, 2024
1 parent 2d380fe commit 5e3d513
Showing 1 changed file with 33 additions and 72 deletions.
@@ -18,11 +18,11 @@
"This is a guide on how to use the [ColBERT](https://github.com/stanford-futuredata/ColBERT) package to produce token-level\n",
"vectors. This as an alternative for using the native Vespa [colbert embedder](https://docs.vespa.ai/en/embedding.html#colbert-embedder).\n",
"\n",
"This guide illustrates how to feed multiple passages per document (long-context)\n",
"This guide illustrates how to feed multiple passages per Vespa document (long-context)\n",
"\n",
"- Compress token vectors using binarization compatible with Vespa unpackbits\n",
"- Use Vespa hex feed format for binary vectors with mixed vespa tensors\n",
"- How to query\n",
"- How to query Vespa with the colbert query tensor representation\n",
"\n",
"Read more about [Vespa Long-Context ColBERT](https://blog.vespa.ai/announcing-long-context-colbert-in-vespa/).\n",
"\n",
@@ -51,19 +51,10 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": null,
"id": "9ad221c5",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/homebrew/lib/python3.11/site-packages/torch/cuda/amp/grad_scaler.py:126: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available. Disabling.\n",
" warnings.warn(\n"
]
}
],
"outputs": [],
"source": [
"from colbert.modeling.checkpoint import Checkpoint\n",
"from colbert.infra import ColBERTConfig\n",
@@ -73,6 +64,14 @@
")"
]
},
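The collapsed part of the cell above constructs the ColBERT checkpoint. For readers following along, a typical construction looks roughly like the sketch below; the model name and config are assumptions, and the notebook's exact arguments may differ.

```python
from colbert.infra import ColBERTConfig
from colbert.modeling.checkpoint import Checkpoint

# Load a ColBERT checkpoint (the model name here is an assumption, not from the diff)
ckpt = Checkpoint(
    "colbert-ir/colbertv2.0",
    colbert_config=ColBERTConfig(root="experiments"),
)
```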
{
"cell_type": "markdown",
"id": "93efc596",
"metadata": {},
"source": [
"A few sample documents:"
]
},
{
"cell_type": "code",
"execution_count": 50,
@@ -90,23 +89,22 @@
},
{
"cell_type": "code",
"execution_count": 51,
"execution_count": null,
"id": "4b8154eb",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/homebrew/lib/python3.11/site-packages/torch/amp/autocast_mode.py:250: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling\n",
" warnings.warn(\n"
]
}
],
"outputs": [],
"source": [
"document_token_vectors = ckpt.docFromText(document_passages)"
]
},
{
"cell_type": "markdown",
"id": "23b2e1f4",
"metadata": {},
"source": [
"See the shape of the colbert document embeddings:"
]
},
{
"cell_type": "code",
"execution_count": 52,
@@ -185,7 +183,7 @@
" values = str(hexlify(token_vectors[token_index].tobytes()), \"utf-8\")\n",
" if (\n",
" values == \"00000000000000000000000000000000\"\n",
" ): # skip empty vectors due to padding of batch\n",
" ): # skip empty vectors due to padding with batch of passages\n",
" continue\n",
" vespa_tensor_cell = {\n",
" \"address\": {\"context\": chunk_index, \"token\": token_index},\n",
@@ -228,9 +226,7 @@
"[PyVespa](https://pyvespa.readthedocs.io/en/latest/) helps us build the [Vespa application package](https://docs.vespa.ai/en/application-packages.html).\n",
"A Vespa application package consists of configuration files, schemas, models, and code (plugins).\n",
"\n",
"First, we define a [Vespa schema](https://docs.vespa.ai/en/schemas.html) with the fields we want to store and their type.\n",
"\n",
"We use HNSW with hamming distance for retrieval\n"
"First, we define a [Vespa schema](https://docs.vespa.ai/en/schemas.html) with the fields we want to store and their type.\n"
]
},
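A minimal pyvespa sketch of such a schema is shown below. Field names, the schema name `doc` (used later when feeding), and the tensor type are assumptions based on what is visible in this diff; because the colbert tensor has two mapped dimensions (`context{}`, `token{}`), it is stored as a plain attribute, without an HNSW index.

```python
from vespa.package import ApplicationPackage, Document, Field, Schema

colbert_schema = Schema(
    name="doc",
    document=Document(
        fields=[
            Field(name="id", type="string", indexing=["summary"]),
            Field(
                name="passages",
                type="array<string>",
                indexing=["summary", "index"],
                index="enable-bm25",
            ),
            # Binarized ColBERT token vectors: one 128-bit (16 x int8) vector
            # per (context, token) pair. Two mapped dimensions -> no HNSW.
            Field(
                name="colbert",
                type="tensor<int8>(context{}, token{}, v[16])",
                indexing=["attribute", "summary"],
            ),
        ]
    ),
)
app_package = ApplicationPackage(name="colbertlong", schema=[colbert_schema])
```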
{
@@ -287,7 +283,9 @@
"id": "5ea4ff0d",
"metadata": {},
"source": [
"Note that we just use max sim in the first phase ranking over all the hits that are retrieved by the query\n"
"Note that we use max sim in the first phase ranking over all \n",
"the hits that are retrieved by the query logic. Also note that asymmetric MaxSim where we \n",
"use `unpack_bits` to obtain a 128-d float vector representation from the binary vector representation. \n"
]
},
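Continuing the schema sketch above, a rank profile along these lines would express that asymmetric MaxSim; the exact expression in the notebook may differ, and the input name `qt` and profile name `default` are assumptions.

```python
from vespa.package import RankProfile

colbert_schema.add_rank_profile(
    RankProfile(
        name="default",
        inputs=[("query(qt)", "tensor<float>(querytoken{}, v[128])")],
        first_phase=(
            # Asymmetric MaxSim: float query token vectors against document
            # token vectors unpacked from int8 bits to 128-d floats,
            # max over (context, token) per query token, summed over query tokens.
            "sum(reduce(sum(query(qt) * unpack_bits(attribute(colbert)), v), "
            "max, context, token), querytoken)"
        ),
    )
)
```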
{
@@ -392,7 +390,7 @@
},
{
"cell_type": "code",
"execution_count": 64,
"execution_count": null,
"id": "fe954dc4",
"metadata": {
"colab": {
@@ -401,45 +399,7 @@
"id": "fe954dc4",
"outputId": "a0764bd3-98c2-492a-b8d9-b99ecacf4bdb"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Deployment started in run 3 of dev-aws-us-east-1c for samples.colbertlong. This may take a few minutes the first time.\n",
"INFO [19:49:37] Deploying platform version 8.324.16 and application dev build 3 for dev-aws-us-east-1c of default ...\n",
"INFO [19:49:37] Using CA signed certificate version 0\n",
"INFO [19:49:46] Using 1 nodes in container cluster 'colbertlong_container'\n",
"INFO [19:49:51] Session 2737 for tenant 'samples' prepared and activated.\n",
"INFO [19:49:52] ######## Details for all nodes ########\n",
"INFO [19:49:52] h88976a.dev.aws-us-east-1c.vespa-external.aws.oath.cloud: expected to be UP\n",
"INFO [19:49:52] --- platform vespa/cloud-tenant-rhel8:8.324.16\n",
"INFO [19:49:52] --- logserver-container on port 4080 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- metricsproxy-container on port 19092 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] h88976b.dev.aws-us-east-1c.vespa-external.aws.oath.cloud: expected to be UP\n",
"INFO [19:49:52] --- platform vespa/cloud-tenant-rhel8:8.324.16\n",
"INFO [19:49:52] --- container-clustercontroller on port 19050 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- metricsproxy-container on port 19092 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] h90246b.dev.aws-us-east-1c.vespa-external.aws.oath.cloud: expected to be UP\n",
"INFO [19:49:52] --- platform vespa/cloud-tenant-rhel8:8.324.16\n",
"INFO [19:49:52] --- storagenode on port 19102 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- searchnode on port 19107 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- distributor on port 19111 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- metricsproxy-container on port 19092 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] h91714a.dev.aws-us-east-1c.vespa-external.aws.oath.cloud: expected to be UP\n",
"INFO [19:49:52] --- platform vespa/cloud-tenant-rhel8:8.324.16\n",
"INFO [19:49:52] --- container on port 4080 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- metricsproxy-container on port 19092 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] Found endpoints:\n",
"INFO [19:49:52] - dev.aws-us-east-1c\n",
"INFO [19:49:52] |-- https://a40c1bad.ab8eabb6.z.vespa-app.cloud/ (cluster 'colbertlong_container')\n",
"INFO [19:49:52] Installation succeeded!\n",
"Using mTLS (key,cert) Authentication against endpoint https://a40c1bad.ab8eabb6.z.vespa-app.cloud//ApplicationStatus\n",
"Application is up!\n",
"Finished deployment.\n"
]
}
],
"outputs": [],
"source": [
"from vespa.application import Vespa\n",
"\n",
@@ -468,6 +428,7 @@
" \"passages\": document_passages,\n",
" \"colbert\": {\"blocks\": binarize_token_vectors_hex(document_token_vectors)},\n",
"}\n",
"# synchrounous feed (this is blocking and slow, but few docs..)\n",
"with app.syncio() as sync:\n",
" response: VespaResponse = sync.feed_data_point(\n",
" data_id=1, fields=vespa_feed_format, schema=\"doc\"\n",
@@ -479,7 +440,7 @@
"id": "cebada8d",
"metadata": {},
"source": [
"## Querying\n"
"### Querying Vespa with colbert tensors \n"
]
},
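Query-side, the query text is encoded with the same ColBERT checkpoint and passed to Vespa as the `query(qt)` tensor. A rough sketch follows; the YQL, rank profile name, and exact tensor wire format are assumptions (see Vespa's tensor JSON format documentation and the rest of this notebook for the precise form).

```python
query = "what is colbert?"

# Encode the query into a (num_query_tokens, 128) float matrix
query_token_vectors = ckpt.queryFromText([query])[0]

# Mapped dimension label -> 128-d float vector (short "blocks" form)
query_tensor = {str(i): vec.tolist() for i, vec in enumerate(query_token_vectors)}

response = app.query(
    body={
        "yql": "select * from doc where userQuery()",
        "query": query,
        "ranking": "default",
        "input.query(qt)": query_tensor,
        "hits": 1,
    }
)
print(response.hits[0]["relevance"])
```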
{
@@ -3061,7 +3022,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.12.4"
},
"vscode": {
"interpreter": {
@@ -3071,4 +3032,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
