-
Notifications
You must be signed in to change notification settings - Fork 34
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #479 from vespa-engine/kkraune/app-packages
Kkraune/app packages
- Loading branch information
Showing
2 changed files
with
320 additions
and
1 deletion.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,319 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e05d0811", | ||
"metadata": {}, | ||
"source": [ | ||
"![Vespa logo](https://vespa.ai/assets/vespa-logo-color.png)\n", | ||
"\n", | ||
"# Application packages\n", | ||
"\n", | ||
"Vespa is configured using an [application package](https://docs.vespa.ai/en/application-packages.html).\n", | ||
"Pyvespa provides an API to generate a deployable application package.\n", | ||
"\n", | ||
"An application package has at a minimum a [schema](https://docs.vespa.ai/en/schemas.html)\n", | ||
"and [services.xml](https://docs.vespa.ai/en/reference/services.html).\n", | ||
"\n", | ||
"Example - create an empty application package:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "7e3477a6", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from vespa.package import ApplicationPackage\n", | ||
"\n", | ||
"app_package = ApplicationPackage(name=\"myschema\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e3f1e7d5", | ||
"metadata": {}, | ||
"source": [ | ||
"To inspect an application package, dump it to disk using\n", | ||
"[to_files](https://pyvespa.readthedocs.io/en/latest/reference-api.html#vespa.package.ApplicationPackage.to_files):" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "d05523a8", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import tempfile, os\n", | ||
"\n", | ||
"temp_dir = tempfile.TemporaryDirectory()\n", | ||
"os.environ[\"TMP_APP_DIR\"] = temp_dir.name\n", | ||
"app_package.to_files(temp_dir.name)\n", | ||
"print(temp_dir.name)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "e3a4dc05", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"./services.xml\r\n", | ||
"./schemas/myschema.sd\r\n", | ||
"./search/query-profiles/types/root.xml\r\n", | ||
"./search/query-profiles/default.xml\r\n" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"!cd $TMP_APP_DIR && find . -type f" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "c038d33a", | ||
"metadata": {}, | ||
"source": [ | ||
"Ignore these files for now:\n", | ||
"\n", | ||
" ./search/query-profiles/types/root.xml\n", | ||
" ./search/query-profiles/default.xml" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "7b01cd09", | ||
"metadata": {}, | ||
"source": [ | ||
"## Schema\n", | ||
"\n", | ||
"Use a schema to Create fields, fieldsets and a ranking function - dump the empty schema (An empty schema is created, with the same name as the application package):" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "923edec8", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"schema myschema {\r\n", | ||
" document myschema {\r\n", | ||
" }\r\n", | ||
"}" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"!cat $TMP_APP_DIR/schemas/myschema.sd" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "5a1cbaf2", | ||
"metadata": {}, | ||
"source": [ | ||
"Add fields, a fieldset and a ranking function:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "c83c1945", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from vespa.package import Field, FieldSet, RankProfile\n", | ||
"\n", | ||
"app_package.schema.add_fields(\n", | ||
" Field(name = \"id\", type = \"string\", indexing = [\"attribute\", \"summary\"]),\n", | ||
" Field(name = \"title\", type = \"string\", indexing = [\"index\", \"summary\"], index = \"enable-bm25\"),\n", | ||
" Field(name = \"body\", type = \"string\", indexing = [\"index\", \"summary\"], index = \"enable-bm25\")\n", | ||
")\n", | ||
"\n", | ||
"app_package.schema.add_field_set(\n", | ||
" FieldSet(name = \"default\", fields = [\"title\", \"body\"])\n", | ||
")\n", | ||
"\n", | ||
"app_package.schema.add_rank_profile(\n", | ||
" RankProfile(name = \"default\", first_phase = \"bm25(title) + bm25(body)\")\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "f721bdfd", | ||
"metadata": {}, | ||
"source": [ | ||
"Dump application package again, show schema:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "4fcd3de2", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"schema myschema {\r\n", | ||
" document myschema {\r\n", | ||
" field id type string {\r\n", | ||
" indexing: attribute | summary\r\n", | ||
" }\r\n", | ||
" field title type string {\r\n", | ||
" indexing: index | summary\r\n", | ||
" index: enable-bm25\r\n", | ||
" }\r\n", | ||
" field body type string {\r\n", | ||
" indexing: index | summary\r\n", | ||
" index: enable-bm25\r\n", | ||
" }\r\n", | ||
" }\r\n", | ||
" fieldset default {\r\n", | ||
" fields: title, body\r\n", | ||
" }\r\n", | ||
" rank-profile default {\r\n", | ||
" first-phase {\r\n", | ||
" expression: bm25(title) + bm25(body)\r\n", | ||
" }\r\n", | ||
" }\r\n", | ||
"}" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"app_package.to_files(temp_dir.name)\n", | ||
"!cat $TMP_APP_DIR/schemas/myschema.sd" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "7cc78157", | ||
"metadata": {}, | ||
"source": [ | ||
"Note how the indexing settings are written to the schema.\n", | ||
"\n", | ||
"> **_Pyvespa generally does not support all indexing options in Vespa - it is made for easy experimentation.\n", | ||
" To configure setting an unsupported indexing option (or any other unsupported option),\n", | ||
" dump the application package, modify the schema file\n", | ||
" and deploy the application package from the directory, or as a zipped file.\n", | ||
" [Read more](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html)_**" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "cfd73872", | ||
"metadata": {}, | ||
"source": [ | ||
"At this point, review the Vespa documentation:\n", | ||
"* [field](https://docs.vespa.ai/en/schemas.html#field)\n", | ||
"* [fieldset](https://docs.vespa.ai/en/schemas.html#fieldset)\n", | ||
"* [rank-profile](https://docs.vespa.ai/en/ranking.html#rank-profiles)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "a51353a4", | ||
"metadata": {}, | ||
"source": [ | ||
"## Services\n", | ||
"\n", | ||
"In `services.xml` you will find a container and content cluster -\n", | ||
"see the [Vespa Overview](https://docs.vespa.ai/en/overview.html).\n", | ||
"This is a file you will normally not change or need to know much about - dump the default file:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "4abae84e", | ||
"metadata": {}, | ||
"outputs": [ | ||
{ | ||
"name": "stdout", | ||
"output_type": "stream", | ||
"text": [ | ||
"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\r\n", | ||
"<services version=\"1.0\">\r\n", | ||
" <container id=\"myschema_container\" version=\"1.0\">\r\n", | ||
" <search></search>\r\n", | ||
" <document-api></document-api>\r\n", | ||
" </container>\r\n", | ||
" <content id=\"myschema_content\" version=\"1.0\">\r\n", | ||
" <redundancy reply-after=\"1\">1</redundancy>\r\n", | ||
" <documents>\r\n", | ||
" <document type=\"myschema\" mode=\"index\"></document>\r\n", | ||
" </documents>\r\n", | ||
" <nodes>\r\n", | ||
" <node distribution-key=\"0\" hostalias=\"node1\"></node>\r\n", | ||
" </nodes>\r\n", | ||
" </content>\r\n", | ||
"</services>" | ||
] | ||
} | ||
], | ||
"source": [ | ||
"!cat $TMP_APP_DIR/services.xml" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "d6477c44", | ||
"metadata": {}, | ||
"source": [ | ||
"Observe:\n", | ||
"* A content cluster (this is where the index is stored) called `myschema_content` is created.\n", | ||
" This is information not normally needed, unless using\n", | ||
" [delete_all_docs](https://pyvespa.readthedocs.io/en/latest/reference-api.html#vespa.application.Vespa.delete_all_docs)\n", | ||
" to quickly remove all documents from a schema\n", | ||
"\n", | ||
"Remove the temporary application package file dump:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "84ce16e8", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"temp_dir.cleanup()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e242ac80", | ||
"metadata": {}, | ||
"source": [ | ||
"## Next step: Deploy, feed and query\n", | ||
"\n", | ||
"Once the schema is ready for deployment, decide deployment option and deploy the application package:\n", | ||
"* [Deploy to local container](https://pyvespa.readthedocs.io/en/latest/deploy-docker.html)\n", | ||
"* [Deploy to Vespa Cloud](https://pyvespa.readthedocs.io/en/latest/deploy-vespa-cloud.html)\n", | ||
"\n", | ||
"Use the guides on the pyvespa site to feed and query data." | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "python3", | ||
"language": "python", | ||
"name": "python3" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -923,4 +923,4 @@ | |
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} | ||
} |