From 7603fb11a1b3e2d58b1fa4a98d78539e44fc4eb1 Mon Sep 17 00:00:00 2001
From: Kristian Aune
Date: Thu, 19 Dec 2024 11:36:03 +0100
Subject: [PATCH] fix typos

---
 ...olpali-benchmark-vqa-vlm_Vespa-cloud.ipynb | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/docs/sphinx/source/examples/colpali-benchmark-vqa-vlm_Vespa-cloud.ipynb b/docs/sphinx/source/examples/colpali-benchmark-vqa-vlm_Vespa-cloud.ipynb
index 483229b8..c793823c 100644
--- a/docs/sphinx/source/examples/colpali-benchmark-vqa-vlm_Vespa-cloud.ipynb
+++ b/docs/sphinx/source/examples/colpali-benchmark-vqa-vlm_Vespa-cloud.ipynb
@@ -17,9 +17,9 @@
     "\n",
     "This notebook demonstrates how to reproduce the ColPali results on [DocVQA](https://huggingface.co/datasets/vidore/docvqa_test_subsampled) with Vespa. The dataset consists of PDF documents with questions and answers. \n",
     "\n",
-    "We demonstrate how we can binarize the patch embeddings and replace the float float MaxSim scoring with a `hamming` based MaxSim without much loss in ranking accuracy but with a significant speedup (close to 4x) and reduce the memory (and storage) requirements by 32x.\n",
+    "We demonstrate how we can binarize the patch embeddings and replace the float MaxSim scoring with a `hamming`-based MaxSim without much loss in ranking accuracy, but with a significant speedup (close to 4x), while reducing the memory (and storage) requirements by 32x.\n",
     "\n",
-    "In this notebook we represent one PDF page as one vespa document. See other notebooks for more information about using ColPali with Vespa:\n",
+    "In this notebook, we represent one PDF page as one Vespa document. See other notebooks for more information about using ColPali with Vespa:\n",
     "\n",
     "- [Scaling ColPALI (VLM) Retrieval](simplified-retrieval-with-colpali-vlm_Vespa-cloud.ipynb)\n",
     "- [Vespa 🤝 ColPali: Efficient Document Retrieval with Vision Language Models](colpali-document-retrieval-vision-language-models-cloud.ipynb)\n",
@@ -405,7 +405,7 @@
    "metadata": {},
    "source": [
     "Now we have all the embeddings. We'll define two helper functions to perform binarization (BQ) and also packing float values\n",
-    "to shorter hex representation in JSON. Both saves bandwidth and improves feed performance. "
+    "to a shorter hex representation in JSON. Both save bandwidth and improve feed performance. "
    ]
   },
   {
@@ -456,7 +456,7 @@
     "### Patch Vector pooling\n",
     "\n",
     "This reduces the number of patch embeddings by a factor of 3, meaning that we go from 1030 patch vectors to 343 patch vectors. This reduces\n",
-    "both the memory and the number of dotproducts that we need to calculate. This function is not in use in this notebook, but it is included for reference."
+    "both the memory and the number of dotproducts we need to calculate. This function is not used in this notebook, but it is included for reference."
    ]
   },
   {
@@ -515,7 +515,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Create the Vespa feed format, we use hex formats for mixed tensors [doc](https://docs.vespa.ai/en/reference/document-json-format.html#tensor).\n"
+    "Create the Vespa feed format. We use hex formats for mixed tensors [doc](https://docs.vespa.ai/en/reference/document-json-format.html#tensor).\n"
    ]
   },
   {
@@ -551,7 +551,7 @@
     "A Vespa application package consists of configuration files, schemas, models, and code (plugins).\n",
     "\n",
     "First, we define a [Vespa schema](https://docs.vespa.ai/en/schemas.html) with the fields we want to store and their type. This is a simple\n",
-    "schema which is all we need to evaluate effectiveness of the model."
+    "schema, which is all we need to evaluate the effectiveness of the model."
    ]
   },
   {
@@ -619,7 +619,7 @@
     "\n",
     "colpali_profile = RankProfile(\n",
     "    name=\"float-float\",\n",
-    "    # We define both the float and binary query inputs here, the rest of the profiles inherits these inputs\n",
+    "    # We define both the float and binary query inputs here; the rest of the profiles inherit these inputs\n",
     "    inputs=[\n",
     "        (\"query(qtb)\", \"tensor(querytoken{}, v[16])\"),\n",
     "        (\"query(qt)\", \"tensor(querytoken{}, v[128])\"),\n",
@@ -863,7 +863,7 @@
    "metadata": {},
    "source": [
     "A simple routine for querying Vespa. Note that we send both vector representations in the query independently\n",
-    "of the ranking method used, this for simplicity. Not all the ranking models we evaluate needs both representations. "
+    "of the ranking method used, for simplicity. Not all the ranking models we evaluate need both representations. "
    ]
   },
   {
@@ -1009,7 +1009,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "This is encouraging as the binary-binary representation is 4x faster than the float-float representation and saves 32x space. We can also largely retain the effectiveness of the float-binary representation by using the phased approach where we re-rank the top 20 pages from the hamming (binary-binary) version using the float-binary representation. Now we can explore the ranking depth and see how the phased approach performs with different ranking depths."
+    "This is encouraging, as the binary-binary representation is 4x faster than the float-float representation and saves 32x space. We can also largely retain the effectiveness of the float-binary representation by using the phased approach, where we re-rank the top 20 pages from the hamming (binary-binary) version using the float-binary representation. Now we can explore the ranking depth and see how the phased approach performs with different ranking depths."
    ]
   },
   {
@@ -1072,7 +1072,7 @@
    "metadata": {},
    "source": [
     "### Conclusion\n",
-    "The binary representation of the patch embeddings reduces the storage by 32x, and using hamming distance instead of dotproduc saves us about 4x in computation compared to the float-float model or the float-binary model (which only saves storage). Using a re-ranking step with only depth 10, we can improve the effectiveness of the binary-binary model to almost match the float-float MaxSim model. The additional re-ranking step only requires that we pass also the float query embedding version without any additional storage overhead. \n",
+    "The binary representation of the patch embeddings reduces the storage by 32x, and using hamming distance instead of dotproduct saves us about 4x in computation compared to the float-float model or the float-binary model (which only saves storage). Using a re-ranking step with a depth of only 10, we can improve the effectiveness of the binary-binary model to almost match the float-float MaxSim model. The additional re-ranking step only requires that we also pass the float query embedding version, without any additional storage overhead. \n",
     "    "
    ]
  },