Merge pull request #479 from vespa-engine/kkraune/app-packages
Kkraune/app packages
kkraune authored Feb 28, 2023
2 parents a016793 + 28c9297 commit b091374
Showing 2 changed files with 320 additions and 1 deletion.
319 changes: 319 additions & 0 deletions docs/sphinx/source/application-packages.ipynb
@@ -0,0 +1,319 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e05d0811",
"metadata": {},
"source": [
"![Vespa logo](https://vespa.ai/assets/vespa-logo-color.png)\n",
"\n",
"# Application packages\n",
"\n",
"Vespa is configured using an [application package](https://docs.vespa.ai/en/application-packages.html).\n",
"Pyvespa provides an API to generate a deployable application package.\n",
"\n",
"An application package has at a minimum a [schema](https://docs.vespa.ai/en/schemas.html)\n",
"and [services.xml](https://docs.vespa.ai/en/reference/services.html).\n",
"\n",
"Example - create an empty application package:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7e3477a6",
"metadata": {},
"outputs": [],
"source": [
"from vespa.package import ApplicationPackage\n",
"\n",
"app_package = ApplicationPackage(name=\"myschema\")"
]
},
{
"cell_type": "markdown",
"id": "e3f1e7d5",
"metadata": {},
"source": [
"To inspect an application package, dump it to disk using\n",
"[to_files](https://pyvespa.readthedocs.io/en/latest/reference-api.html#vespa.package.ApplicationPackage.to_files):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d05523a8",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import tempfile\n",
"\n",
"temp_dir = tempfile.TemporaryDirectory()\n",
"os.environ[\"TMP_APP_DIR\"] = temp_dir.name\n",
"app_package.to_files(temp_dir.name)\n",
"print(temp_dir.name)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e3a4dc05",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"./services.xml\r\n",
"./schemas/myschema.sd\r\n",
"./search/query-profiles/types/root.xml\r\n",
"./search/query-profiles/default.xml\r\n"
]
}
],
"source": [
"!cd $TMP_APP_DIR && find . -type f"
]
},
{
"cell_type": "markdown",
"id": "c038d33a",
"metadata": {},
"source": [
"Ignore these files for now:\n",
"\n",
" ./search/query-profiles/types/root.xml\n",
" ./search/query-profiles/default.xml"
]
},
{
"cell_type": "markdown",
"id": "7b01cd09",
"metadata": {},
"source": [
"## Schema\n",
"\n",
"A schema defines fields, fieldsets and a ranking function.\n",
"An empty schema is created with the same name as the application package - dump it:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "923edec8",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"schema myschema {\r\n",
" document myschema {\r\n",
" }\r\n",
"}"
]
}
],
"source": [
"!cat $TMP_APP_DIR/schemas/myschema.sd"
]
},
{
"cell_type": "markdown",
"id": "5a1cbaf2",
"metadata": {},
"source": [
"Add fields, a fieldset and a ranking function:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c83c1945",
"metadata": {},
"outputs": [],
"source": [
"from vespa.package import Field, FieldSet, RankProfile\n",
"\n",
"app_package.schema.add_fields(\n",
" Field(name = \"id\", type = \"string\", indexing = [\"attribute\", \"summary\"]),\n",
" Field(name = \"title\", type = \"string\", indexing = [\"index\", \"summary\"], index = \"enable-bm25\"),\n",
" Field(name = \"body\", type = \"string\", indexing = [\"index\", \"summary\"], index = \"enable-bm25\")\n",
")\n",
"\n",
"app_package.schema.add_field_set(\n",
" FieldSet(name = \"default\", fields = [\"title\", \"body\"])\n",
")\n",
"\n",
"app_package.schema.add_rank_profile(\n",
" RankProfile(name = \"default\", first_phase = \"bm25(title) + bm25(body)\")\n",
")"
]
},
{
"cell_type": "markdown",
"id": "f721bdfd",
"metadata": {},
"source": [
"Dump the application package again and show the schema:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4fcd3de2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"schema myschema {\r\n",
" document myschema {\r\n",
" field id type string {\r\n",
" indexing: attribute | summary\r\n",
" }\r\n",
" field title type string {\r\n",
" indexing: index | summary\r\n",
" index: enable-bm25\r\n",
" }\r\n",
" field body type string {\r\n",
" indexing: index | summary\r\n",
" index: enable-bm25\r\n",
" }\r\n",
" }\r\n",
" fieldset default {\r\n",
" fields: title, body\r\n",
" }\r\n",
" rank-profile default {\r\n",
" first-phase {\r\n",
" expression: bm25(title) + bm25(body)\r\n",
" }\r\n",
" }\r\n",
"}"
]
}
],
"source": [
"app_package.to_files(temp_dir.name)\n",
"!cat $TMP_APP_DIR/schemas/myschema.sd"
]
},
{
"cell_type": "markdown",
"id": "7cc78157",
"metadata": {},
"source": [
"Note how the indexing settings are written to the schema.\n",
"\n",
"> **_Pyvespa does not support all indexing options in Vespa - it is made for easy experimentation.\n",
" To use an unsupported indexing option (or any other unsupported option),\n",
" dump the application package, modify the schema file,\n",
" and deploy the application package from the directory or as a zipped file.\n",
" [Read more](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html)_**"
]
},
{
"cell_type": "markdown",
"id": "cfd73872",
"metadata": {},
"source": [
"At this point, review the Vespa documentation:\n",
"* [field](https://docs.vespa.ai/en/schemas.html#field)\n",
"* [fieldset](https://docs.vespa.ai/en/schemas.html#fieldset)\n",
"* [rank-profile](https://docs.vespa.ai/en/ranking.html#rank-profiles)"
]
},
{
"cell_type": "markdown",
"id": "a51353a4",
"metadata": {},
"source": [
"## Services\n",
"\n",
"`services.xml` defines a container cluster and a content cluster -\n",
"see the [Vespa Overview](https://docs.vespa.ai/en/overview.html).\n",
"You will normally not need to change this file or know much about it - dump the default file:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4abae84e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n",
"<services version=\"1.0\">\r\n",
" <container id=\"myschema_container\" version=\"1.0\">\r\n",
" <search></search>\r\n",
" <document-api></document-api>\r\n",
" </container>\r\n",
" <content id=\"myschema_content\" version=\"1.0\">\r\n",
" <redundancy reply-after=\"1\">1</redundancy>\r\n",
" <documents>\r\n",
" <document type=\"myschema\" mode=\"index\"></document>\r\n",
" </documents>\r\n",
" <nodes>\r\n",
" <node distribution-key=\"0\" hostalias=\"node1\"></node>\r\n",
" </nodes>\r\n",
" </content>\r\n",
"</services>"
]
}
],
"source": [
"!cat $TMP_APP_DIR/services.xml"
]
},
{
"cell_type": "markdown",
"id": "d6477c44",
"metadata": {},
"source": [
"Observe:\n",
"* A content cluster (this is where the index is stored) called `myschema_content` is created.\n",
" The cluster name is normally not needed, except when using\n",
" [delete_all_docs](https://pyvespa.readthedocs.io/en/latest/reference-api.html#vespa.application.Vespa.delete_all_docs)\n",
" to quickly remove all documents from a schema.\n",
"\n",
"Remove the temporary application package file dump:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "84ce16e8",
"metadata": {},
"outputs": [],
"source": [
"temp_dir.cleanup()"
]
},
{
"cell_type": "markdown",
"id": "e242ac80",
"metadata": {},
"source": [
"## Next step: Deploy, feed and query\n",
"\n",
"Once the schema is ready for deployment, choose a deployment option and deploy the application package:\n",
"* [Deploy to local container](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html)\n",
"* [Deploy to Vespa Cloud](https://pyvespa.readthedocs.io/en/latest/deploy-vespa-cloud.html)\n",
"\n",
"Use the guides on the pyvespa site to feed and query data."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "python3",
"language": "python",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -923,4 +923,4 @@
},
"nbformat": 4,
"nbformat_minor": 5
}
}
