
Commit

Fix wording, example with two mapped dimensions does not support NN/HNSW
Jo Kristian Bergum committed Sep 23, 2024
1 parent 2d380fe commit 5e3d513
Showing 1 changed file with 33 additions and 72 deletions.
@@ -18,11 +18,11 @@
"This is a guide on how to use the [ColBERT](https://github.com/stanford-futuredata/ColBERT) package to produce token-level\n",
"vectors. This as an alternative for using the native Vespa [colbert embedder](https://docs.vespa.ai/en/embedding.html#colbert-embedder).\n",
"\n",
"This guide illustrates how to feed multiple passages per document (long-context)\n",
"This guide illustrates how to feed multiple passages per Vespa document (long-context)\n",
"\n",
"- Compress token vectors using binarization compatible with Vespa unpackbits\n",
"- Use Vespa hex feed format for binary vectors with mixed vespa tensors\n",
"- How to query\n",
"- How to query Vespa with the colbert query tensor representation\n",
"\n",
"Read more about [Vespa Long-Context ColBERT](https://blog.vespa.ai/announcing-long-context-colbert-in-vespa/).\n",
"\n",
@@ -51,19 +51,10 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": null,
"id": "9ad221c5",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/homebrew/lib/python3.11/site-packages/torch/cuda/amp/grad_scaler.py:126: UserWarning: torch.cuda.amp.GradScaler is enabled, but CUDA is not available. Disabling.\n",
" warnings.warn(\n"
]
}
],
"outputs": [],
"source": [
"from colbert.modeling.checkpoint import Checkpoint\n",
"from colbert.infra import ColBERTConfig\n",
@@ -73,6 +64,14 @@
")"
]
},
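The collapsed part of the cell above constructs the ColBERT checkpoint. For readers following along, a typical construction looks roughly like the sketch below; the model name and config are assumptions, and the notebook's exact arguments may differ.

```python
from colbert.infra import ColBERTConfig
from colbert.modeling.checkpoint import Checkpoint

# Load a ColBERT checkpoint (the model name here is an assumption, not from the diff)
ckpt = Checkpoint(
    "colbert-ir/colbertv2.0",
    colbert_config=ColBERTConfig(root="experiments"),
)
```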
{
"cell_type": "markdown",
"id": "93efc596",
"metadata": {},
"source": [
"A few sample documents:"
]
},
{
"cell_type": "code",
"execution_count": 50,
@@ -90,23 +89,22 @@
},
{
"cell_type": "code",
"execution_count": 51,
"execution_count": null,
"id": "4b8154eb",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/homebrew/lib/python3.11/site-packages/torch/amp/autocast_mode.py:250: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling\n",
" warnings.warn(\n"
]
}
],
"outputs": [],
"source": [
"document_token_vectors = ckpt.docFromText(document_passages)"
]
},
{
"cell_type": "markdown",
"id": "23b2e1f4",
"metadata": {},
"source": [
"See the shape of the colbert document embeddings:"
]
},
{
"cell_type": "code",
"execution_count": 52,
@@ -185,7 +183,7 @@
" values = str(hexlify(token_vectors[token_index].tobytes()), \"utf-8\")\n",
" if (\n",
" values == \"00000000000000000000000000000000\"\n",
" ): # skip empty vectors due to padding of batch\n",
" ): # skip empty vectors due to padding with batch of passages\n",
" continue\n",
" vespa_tensor_cell = {\n",
" \"address\": {\"context\": chunk_index, \"token\": token_index},\n",
@@ -228,9 +226,7 @@
"[PyVespa](https://pyvespa.readthedocs.io/en/latest/) helps us build the [Vespa application package](https://docs.vespa.ai/en/application-packages.html).\n",
"A Vespa application package consists of configuration files, schemas, models, and code (plugins).\n",
"\n",
"First, we define a [Vespa schema](https://docs.vespa.ai/en/schemas.html) with the fields we want to store and their type.\n",
"\n",
"We use HNSW with hamming distance for retrieval\n"
"First, we define a [Vespa schema](https://docs.vespa.ai/en/schemas.html) with the fields we want to store and their type.\n"
]
},
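A minimal pyvespa sketch of such a schema is shown below. Field names, the schema name `doc` (used later when feeding), and the tensor type are assumptions based on what is visible in this diff; because the colbert tensor has two mapped dimensions (`context{}`, `token{}`), it is stored as a plain attribute, without an HNSW index.

```python
from vespa.package import ApplicationPackage, Document, Field, Schema

colbert_schema = Schema(
    name="doc",
    document=Document(
        fields=[
            Field(name="id", type="string", indexing=["summary"]),
            Field(
                name="passages",
                type="array<string>",
                indexing=["summary", "index"],
                index="enable-bm25",
            ),
            # Binarized ColBERT token vectors: one 128-bit (16 x int8) vector
            # per (context, token) pair. Two mapped dimensions -> no HNSW.
            Field(
                name="colbert",
                type="tensor<int8>(context{}, token{}, v[16])",
                indexing=["attribute", "summary"],
            ),
        ]
    ),
)
app_package = ApplicationPackage(name="colbertlong", schema=[colbert_schema])
```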
{
@@ -287,7 +283,9 @@
"id": "5ea4ff0d",
"metadata": {},
"source": [
"Note that we just use max sim in the first phase ranking over all the hits that are retrieved by the query\n"
"Note that we use max sim in the first phase ranking over all \n",
"the hits that are retrieved by the query logic. Also note that asymmetric MaxSim where we \n",
"use `unpack_bits` to obtain a 128-d float vector representation from the binary vector representation. \n"
]
},
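Continuing the schema sketch above, a rank profile along these lines would express that asymmetric MaxSim; the exact expression in the notebook may differ, and the input name `qt` and profile name `default` are assumptions.

```python
from vespa.package import RankProfile

colbert_schema.add_rank_profile(
    RankProfile(
        name="default",
        inputs=[("query(qt)", "tensor<float>(querytoken{}, v[128])")],
        first_phase=(
            # Asymmetric MaxSim: float query token vectors against document
            # token vectors unpacked from int8 bits to 128-d floats,
            # max over (context, token) per query token, summed over query tokens.
            "sum(reduce(sum(query(qt) * unpack_bits(attribute(colbert)), v), "
            "max, context, token), querytoken)"
        ),
    )
)
```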
{
@@ -392,7 +390,7 @@
},
{
"cell_type": "code",
"execution_count": 64,
"execution_count": null,
"id": "fe954dc4",
"metadata": {
"colab": {
@@ -401,45 +399,7 @@
"id": "fe954dc4",
"outputId": "a0764bd3-98c2-492a-b8d9-b99ecacf4bdb"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Deployment started in run 3 of dev-aws-us-east-1c for samples.colbertlong. This may take a few minutes the first time.\n",
"INFO [19:49:37] Deploying platform version 8.324.16 and application dev build 3 for dev-aws-us-east-1c of default ...\n",
"INFO [19:49:37] Using CA signed certificate version 0\n",
"INFO [19:49:46] Using 1 nodes in container cluster 'colbertlong_container'\n",
"INFO [19:49:51] Session 2737 for tenant 'samples' prepared and activated.\n",
"INFO [19:49:52] ######## Details for all nodes ########\n",
"INFO [19:49:52] h88976a.dev.aws-us-east-1c.vespa-external.aws.oath.cloud: expected to be UP\n",
"INFO [19:49:52] --- platform vespa/cloud-tenant-rhel8:8.324.16\n",
"INFO [19:49:52] --- logserver-container on port 4080 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- metricsproxy-container on port 19092 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] h88976b.dev.aws-us-east-1c.vespa-external.aws.oath.cloud: expected to be UP\n",
"INFO [19:49:52] --- platform vespa/cloud-tenant-rhel8:8.324.16\n",
"INFO [19:49:52] --- container-clustercontroller on port 19050 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- metricsproxy-container on port 19092 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] h90246b.dev.aws-us-east-1c.vespa-external.aws.oath.cloud: expected to be UP\n",
"INFO [19:49:52] --- platform vespa/cloud-tenant-rhel8:8.324.16\n",
"INFO [19:49:52] --- storagenode on port 19102 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- searchnode on port 19107 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- distributor on port 19111 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- metricsproxy-container on port 19092 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] h91714a.dev.aws-us-east-1c.vespa-external.aws.oath.cloud: expected to be UP\n",
"INFO [19:49:52] --- platform vespa/cloud-tenant-rhel8:8.324.16\n",
"INFO [19:49:52] --- container on port 4080 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] --- metricsproxy-container on port 19092 has config generation 2737, wanted is 2737\n",
"INFO [19:49:52] Found endpoints:\n",
"INFO [19:49:52] - dev.aws-us-east-1c\n",
"INFO [19:49:52] |-- https://a40c1bad.ab8eabb6.z.vespa-app.cloud/ (cluster 'colbertlong_container')\n",
"INFO [19:49:52] Installation succeeded!\n",
"Using mTLS (key,cert) Authentication against endpoint https://a40c1bad.ab8eabb6.z.vespa-app.cloud//ApplicationStatus\n",
"Application is up!\n",
"Finished deployment.\n"
]
}
],
"outputs": [],
"source": [
"from vespa.application import Vespa\n",
"\n",
@@ -468,6 +428,7 @@
" \"passages\": document_passages,\n",
" \"colbert\": {\"blocks\": binarize_token_vectors_hex(document_token_vectors)},\n",
"}\n",
"# synchrounous feed (this is blocking and slow, but few docs..)\n",
"with app.syncio() as sync:\n",
" response: VespaResponse = sync.feed_data_point(\n",
" data_id=1, fields=vespa_feed_format, schema=\"doc\"\n",
@@ -479,7 +440,7 @@
"id": "cebada8d",
"metadata": {},
"source": [
"## Querying\n"
"### Querying Vespa with colbert tensors \n"
]
},
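Query-side, the query text is encoded with the same ColBERT checkpoint and passed to Vespa as the `query(qt)` tensor. A rough sketch follows; the YQL, rank profile name, and exact tensor wire format are assumptions (see Vespa's tensor JSON format documentation and the rest of this notebook for the precise form).

```python
query = "what is colbert?"

# Encode the query into a (num_query_tokens, 128) float matrix
query_token_vectors = ckpt.queryFromText([query])[0]

# Mapped dimension label -> 128-d float vector (short "blocks" form)
query_tensor = {str(i): vec.tolist() for i, vec in enumerate(query_token_vectors)}

response = app.query(
    body={
        "yql": "select * from doc where userQuery()",
        "query": query,
        "ranking": "default",
        "input.query(qt)": query_tensor,
        "hits": 1,
    }
)
print(response.hits[0]["relevance"])
```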
{
@@ -3061,7 +3022,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.12.4"
},
"vscode": {
"interpreter": {
@@ -3071,4 +3032,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
