Merge pull request #909 from vespa-engine/jobergum/refactor-complex-colpali

Jobergum/refactor complex colpali
vespa-engine · Sep 17, 2024 · f3e5049
2 parents 86a3d4b + bac8832
commit f3e5049

Showing 2 changed files with 51 additions and 119 deletions.
diff --git a/.github/workflows/notebooks-cloud.yml b/.github/workflows/notebooks-cloud.yml
@@ -67,6 +67,7 @@ jobs:
           VESPA_CLOUD_SECRET_TOKEN: ${{ secrets.VESPA_CLOUD_SECRET_TOKEN }}
           CO_API_KEY: ${{ secrets.CO_API_KEY }}
           OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
+          GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
         run: |
           echo "Running ${{ matrix.notebook }}"
 

diff --git a/...nt-retrieval-vision-language-models.ipynb → ...rieval-vision-language-models-cloud.ipynb b/...nt-retrieval-vision-language-models.ipynb → ...rieval-vision-language-models-cloud.ipynb
@@ -43,6 +43,8 @@
                 "where we represent the colbert embeddings per document with the tensor `tensor(page{}, patch{}, v[128])`. This enables \n",
                 "us to use the PDF as the document (retrievable unit), storing the page embeddings in the same document.\n",
                 "\n",
+                "For a simpler example where we use one vespa document = One PDF page, see [this notebook](simplified-retrieval-with-colpali-vlm_Vespa-cloud.ipynb).\n",
+                "\n",
                 "We also store the base64 encoded image, and page meta data like title and url so that we can display it in the result page, but also\n",
                 "use it for RAG with powerful LLMs with vision capabilities. \n",
                 "\n",
@@ -51,10 +53,32 @@
                 "\n",
                 "Let us get started. \n",
                 "\n",
+                "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/vespa-engine/pyvespa/blob/master/docs/sphinx/source/examples/colpali-document-retrieval-vision-language-models.ipynb)\n",
+                "\n",
+                "\n",
+                "Install dependencies: \n",
+                "\n",
+                "Note that the python pdf2image package requires poppler-utils, see other installation options [here](https://pdf2image.readthedocs.io/en/latest/installation.html#installing-poppler).\n",
                 "\n",
                 "Install dependencies: "
             ]
         },
+        {
+            "cell_type": "code",
+            "execution_count": null,
+            "metadata": {},
+            "outputs": [],
+            "source": [
+                "!sudo apt-get install poppler-utils -y"
+            ]
+        },
+        {
+            "cell_type": "markdown",
+            "metadata": {},
+            "source": [
+                "Install python packages "
+            ]
+        },
         {
             "cell_type": "code",
             "execution_count": null,
@@ -63,7 +87,7 @@
             },
             "outputs": [],
             "source": [
-                "!pip3 install git+https://github.com/ManuelFay/colpali pdf2image pypdf pyvespa vespacli"
+                "!pip3 install colpali-engine==0.2.0 pdf2image pypdf pyvespa vespacli requests"
             ]
         },
         {
@@ -305,10 +329,10 @@
             },
             "outputs": [],
             "source": [
-                "model_name = \"vidore/colpali\"\n",
-                "model = ColPali.from_pretrained(\"google/paligemma-3b-mix-448\", torch_dtype=type).eval()\n",
+                "model_name = \"vidore/colpali-v1.2\"\n",
+                "model = ColPali.from_pretrained(\"vidore/colpaligemma-3b-mix-448-base\", torch_dtype=type).eval()\n",
                 "model.load_adapter(model_name)\n",
-                "model.to(device)\n",
+                "model = model.eval()\n",
                 "processor = AutoProcessor.from_pretrained(model_name)"
             ]
         },
@@ -746,119 +770,20 @@
                 "colbert_schema.add_rank_profile(colbert_profile)"
             ]
         },
-        {
-            "cell_type": "markdown",
-            "metadata": {},
-            "source": [
-                "## Deploy the application to Vespa Cloud\n",
-                "\n",
-                "With the configured application, we can deploy it to [Vespa Cloud](https://cloud.vespa.ai/en/).\n",
-                "\n"
-            ]
-        },
-        {
-            "cell_type": "markdown",
-            "metadata": {},
-            "source": [
-                "To deploy the application to Vespa Cloud we need to create a tenant in the Vespa Cloud:\n",
-                "\n",
-                "Create a tenant at [console.vespa-cloud.com](https://console.vespa-cloud.com/) (unless you already have one).\n",
-                "This step requires a Google or GitHub account, and will start your [free trial](https://cloud.vespa.ai/en/free-trial).\n",
-                "Make note of the tenant name, it is used in the next steps.\n"
-            ]
-        },
-        {
-            "cell_type": "code",
-            "execution_count": null,
-            "metadata": {},
-            "outputs": [],
-            "source": [
-                "import os\n",
-                "\n",
-                "os.environ[\"TENANT_NAME\"] = \"samples\"  # Replace with your tenant name\n",
-                "\n",
-                "vespa_cli_command = (\n",
-                "    f'vespa config set application {os.environ[\"TENANT_NAME\"]}.{vespa_app_name}'\n",
-                ")\n",
-                "\n",
-                "!vespa config set target cloud\n",
-                "!{vespa_cli_command}\n",
-                "!vespa auth cert -N"
-            ]
-        },
         {
             "cell_type": "markdown",
             "metadata": {},
             "source": [
                 "Validate that certificates are ok and deploy the application to Vespa Cloud."
             ]
         },
-        {
-            "cell_type": "code",
-            "execution_count": 43,
-            "metadata": {},
-            "outputs": [],
-            "source": [
-                "from os.path import exists\n",
-                "from pathlib import Path\n",
-                "\n",
-                "cert_path = (\n",
-                "    Path.home()\n",
-                "    / \".vespa\"\n",
-                "    / f\"{os.environ['TENANT_NAME']}.{vespa_app_name}.default/data-plane-public-cert.pem\"\n",
-                ")\n",
-                "key_path = (\n",
-                "    Path.home()\n",
-                "    / \".vespa\"\n",
-                "    / f\"{os.environ['TENANT_NAME']}.{vespa_app_name}.default/data-plane-private-key.pem\"\n",
-                ")\n",
-                "\n",
-                "if not exists(cert_path) or not exists(key_path):\n",
-                "    print(\n",
-                "        \"ERROR: set the correct paths to security credentials. Correct paths above and rerun until you do not see this error\"\n",
-                "    )"
-            ]
-        },
-        {
-            "cell_type": "markdown",
-            "metadata": {},
-            "source": [
-                "Note that the subsequent Vespa Cloud deploy call below will add `data-plane-public-cert.pem` to the application before deploying it to Vespa Cloud, so that\n",
-                "you have access to both the private key and the public certificate. At the same time, Vespa Cloud only knows the public certificate.\n",
-                "\n",
-                "### Configure Vespa Cloud control-plane security\n",
-                "\n",
-                "Authenticate to generate a tenant level control plane API key for deploying the applications to Vespa Cloud, and save the path to it.\n",
-                "\n",
-                "The generated tenant api key must be added in the Vespa Console before attempting to deploy the application.\n",
-                "\n",
-                "```\n",
-                "To use this key in Vespa Cloud click 'Add custom key' at\n",
-                "https://console.vespa-cloud.com/tenant/TENANT_NAME/account/keys\n",
-                "and paste the entire public key including the BEGIN and END lines.\n",
-                "```\n"
-            ]
-        },
-        {
-            "cell_type": "code",
-            "execution_count": null,
-            "metadata": {},
-            "outputs": [],
-            "source": [
-                "!vespa auth api-key\n",
-                "\n",
-                "from pathlib import Path\n",
-                "\n",
-                "api_key_path = Path.home() / \".vespa\" / f\"{os.environ['TENANT_NAME']}.api-key.pem\""
-            ]
-        },
         {
             "cell_type": "markdown",
             "metadata": {},
             "source": [
                 "### Deploy to Vespa Cloud\n",
                 "\n",
-                "Now that we have data-plane and control-plane credentials ready, we can deploy our application to Vespa Cloud!\n",
+                "With the configured application, we can deploy it to [Vespa Cloud](https://cloud.vespa.ai/en/).\n",
                 "\n",
                 "`PyVespa` supports deploying apps to the [development zone](https://cloud.vespa.ai/en/reference/environments#dev-and-perf).\n",
                 "\n",
@@ -872,23 +797,19 @@
             "outputs": [],
             "source": [
                 "from vespa.deployment import VespaCloud\n",
+                "import os\n",
                 "\n",
+                "# Replace with your tenant name from the Vespa Cloud Console\n",
+                "tenant_name = \"vespa-team\" \n",
                 "\n",
-                "def read_secret():\n",
-                "    \"\"\"Read the API key from the environment variable. This is\n",
-                "    only used for CI/CD purposes.\"\"\"\n",
-                "    t = os.getenv(\"VESPA_TEAM_API_KEY\")\n",
-                "    if t:\n",
-                "        return t.replace(r\"\\n\", \"\\n\")\n",
-                "    else:\n",
-                "        return t\n",
-                "\n",
+                "key = os.getenv(\"VESPA_TEAM_API_KEY\", None)\n",
+                "if key is not None:\n",
+                "    key = key.replace(r\"\\n\", \"\\n\")  # To parse key correctly\n",
                 "\n",
                 "vespa_cloud = VespaCloud(\n",
-                "    tenant=os.environ[\"TENANT_NAME\"],\n",
+                "    tenant=tenant_name,\n",
                 "    application=vespa_app_name,\n",
-                "    key_content=read_secret() if read_secret() else None,\n",
-                "    key_location=api_key_path,\n",
+                "    key_content=key,  # Key is only used for CI/CD testing of this notebook. Can be removed if logging in interactively\n",
                 "    application_package=vespa_application_package,\n",
                 ")"
             ]
@@ -1128,7 +1049,8 @@
                 "        yql=\"select title,url,images from doc where userInput(@userQuery)\",\n",
                 "        ranking=\"default\",\n",
                 "        userQuery=query,\n",
-                "        timeout=1,\n",
+                "        timeout=2,\n",
+                "        hits=3,\n",
                 "        body={\n",
                 "            \"presentation.format.tensors\": \"short-value\",\n",
                 "            \"input.query(qt)\": float_query_token_vectors(qs[idx]),\n",
@@ -1151,7 +1073,16 @@
                 "\n",
                 "We will use the [Gemini Flash](https://deepmind.google/technologies/gemini/flash/) model for reading and answering. \n",
                 "\n",
-                "In the following, we input the best matching PDF _page_ image and the question.\n"
+                "In the following, we input the best matching PDF _page_ image and the question. \n"
+            ]
+        },
+        {
+            "cell_type": "code",
+            "execution_count": null,
+            "metadata": {},
+            "outputs": [],
+            "source": [
+                "!pip3 install google-generativeai"
             ]
         },
         {
@@ -6440,4 +6371,4 @@
     },
     "nbformat": 4,
     "nbformat_minor": 4
-}
+}
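
The `VESPA_TEAM_API_KEY` handling in the deploy cell can be tried in isolation. What follows is a minimal sketch, assuming the CI secret stores the PEM key on a single line with literal `\n` sequences. The helper name `normalize_pem_key` and the sample key text are hypothetical and not part of the notebook or of pyvespa.

```python
import os


def normalize_pem_key(raw):
    """Turn literal backslash-n sequences into real newlines.

    Hypothetical helper for illustration only. None is passed through
    unchanged, mirroring the `if key is not None` check in the notebook cell.
    """
    if raw is None:
        return None
    return raw.replace(r"\n", "\n")


# Made-up placeholder key, written on one line the way a CI secret might store it.
sample = r"-----BEGIN PRIVATE KEY-----\nabc123\n-----END PRIVATE KEY-----"
key = normalize_pem_key(sample)
print(key.splitlines())

# Same lookup the notebook performs; the result is None when the variable is unset.
ci_key = normalize_pem_key(os.getenv("VESPA_TEAM_API_KEY"))
```

When the variable is unset, the value stays `None`, which matches the notebook's comment that the key is only needed for CI/CD runs and can be dropped when logging in interactively.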