Merge pull request #229 from vespa-engine/tgm/add-text-image-use-case
Update documentation
Showing 9 changed files with 372 additions and 31 deletions.
Binary file added (+26.3 KB): docs/sphinx/source/use_cases/image_search/clip-evaluation-boxplot.png
365 additions, 0 deletions: docs/sphinx/source/use_cases/image_search/image-search-scratch.ipynb
@@ -0,0 +1,365 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "unauthorized-sentence",
   "metadata": {},
   "source": [
    "# Image search\n",
    "> Define a text-to-image search application"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "initial-height",
   "metadata": {},
   "source": [
    "This page walks through the pyvespa code used to create the [text-to-image search sample application](https://github.com/vespa-engine/sample-apps/tree/master/text-image-search/src/python)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "trying-jacksonville",
   "metadata": {},
   "source": [
    "![SegmentLocal](demo.gif \"segment\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "median-brown",
   "metadata": {},
   "source": [
    "## Create the application package"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "interracial-scientist",
   "metadata": {},
   "source": [
    "The first step is to create an application package instance named `image_search`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "inner-minimum",
   "metadata": {},
   "outputs": [],
   "source": [
    "from vespa.package import ApplicationPackage\n",
    "\n",
    "app_package = ApplicationPackage(name=\"image_search\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "gentle-mountain",
   "metadata": {},
   "source": [
    "Add a field to hold the name of the image file. The sample app uses this field to load the final images displayed to the end user.\n",
    "\n",
    "The `summary` indexing ensures this field is returned as part of the query response. The `attribute` indexing stores the field in memory as an attribute for sorting, querying, and grouping."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "micro-oxygen",
   "metadata": {},
   "outputs": [],
   "source": [
    "from vespa.package import Field\n",
    "\n",
    "app_package.schema.add_fields(\n",
    "    Field(name=\"image_file_name\", type=\"string\", indexing=[\"summary\", \"attribute\"]),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "historic-explorer",
   "metadata": {},
   "source": [
    "Add a field to hold an image embedding. The embeddings are usually generated by an ML model. We can add multiple embedding fields to our application, which is useful when running experiments. For example, the sample app adds six image embedding fields, one for each of the six pre-trained CLIP models available at the time.\n",
    "\n",
    "In the example below, the embedding vector has size `512` and is of type `float`. The `index` is required to enable [approximate matching](https://docs.vespa.ai/en/approximate-nn-hnsw.html), and the `HNSW` instance configures the HNSW index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "twenty-montgomery",
   "metadata": {},
   "outputs": [],
   "source": [
    "from vespa.package import HNSW\n",
    "\n",
    "app_package.schema.add_fields(\n",
    "    Field(\n",
    "        name=\"embedding_image\",\n",
    "        type=\"tensor<float>(x[512])\",\n",
    "        indexing=[\"attribute\", \"index\"],\n",
    "        ann=HNSW(\n",
    "            distance_metric=\"angular\",\n",
    "            max_links_per_node=16,\n",
    "            neighbors_to_explore_at_insert=500,\n",
    "        ),\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "instant-fluid",
   "metadata": {},
   "source": [
    "Add a rank profile that ranks images by how close their image embedding vector is to the query embedding vector."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "systematic-manitoba",
   "metadata": {},
   "outputs": [],
   "source": [
    "from vespa.package import RankProfile\n",
    "\n",
    "app_package.schema.add_rank_profile(\n",
    "    RankProfile(\n",
    "        name=\"embedding_similarity\",\n",
    "        inherits=\"default\",\n",
    "        first_phase=\"closeness(embedding_image)\",\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "engaging-return",
   "metadata": {},
   "source": [
    "The tensors used in queries must have their type declared in a query profile in the application package. The code below declares the text embedding that will be sent through the Vespa query. It has the same size and type as the image embedding."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "divine-legislature",
   "metadata": {},
   "outputs": [],
   "source": [
    "from vespa.package import QueryTypeField\n",
    "\n",
    "app_package.query_profile_type.add_fields(\n",
    "    QueryTypeField(\n",
    "        name=\"ranking.features.query(embedding_text)\",\n",
    "        type=\"tensor<float>(x[512])\",\n",
    "    )\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "controlled-playing",
   "metadata": {},
   "source": [
    "## Deploy the application"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "liked-market",
   "metadata": {},
   "source": [
    "The application package created above can be deployed using [Docker](https://pyvespa.readthedocs.io/en/latest/deploy-vespa-docker.html) or [Vespa Cloud](https://pyvespa.readthedocs.io/en/latest/deploy-vespa-cloud.html). Follow the instructions for the desired deployment mode. Either option creates a Vespa connection instance, which we store in a variable denoted here as `app`.\n",
    "\n",
    "We can then use `app` to interact with the deployed application. A minimal Docker sketch follows."
   ]
  },
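  {
   "cell_type": "markdown",
   "id": "docker-deploy-note",
   "metadata": {},
   "source": [
    "As a minimal sketch, a Docker deployment could look like the cell below. This assumes a local Docker daemon with enough memory for Vespa; the `VespaDocker` import path and arguments have changed across pyvespa versions, so check the deployment guide linked above for your version."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "docker-deploy-sketch",
   "metadata": {},
   "outputs": [],
   "source": [
    "from vespa.deployment import VespaDocker\n",
    "\n",
    "# Start a local Vespa container and deploy the application package to it.\n",
    "vespa_docker = VespaDocker()\n",
    "app = vespa_docker.deploy(application_package=app_package)"
   ]
  },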
  {
   "cell_type": "markdown",
   "id": "overhead-record",
   "metadata": {},
   "source": [
    "## Feed the image data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "recovered-championship",
   "metadata": {},
   "source": [
    "To feed the image data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "under-paste",
   "metadata": {},
   "outputs": [],
   "source": [
    "responses = app.feed_batch(batch)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "adequate-edinburgh",
   "metadata": {},
   "source": [
    "where `batch` is a list of dictionaries like the one below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "experimental-heritage",
   "metadata": {},
   "outputs": [],
   "source": [
    "{\n",
    "    \"id\": \"dog1\",\n",
    "    \"fields\": {\n",
    "        \"image_file_name\": \"dog1.jpg\",\n",
    "        \"embedding_image\": {\"values\": [0.884, -0.345, ..., 0.326]},\n",
    "    }\n",
    "}"
   ]
  },
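  {
   "cell_type": "markdown",
   "id": "batch-build-note",
   "metadata": {},
   "source": [
    "As a hypothetical sketch of assembling such a batch: `compute_embedding` below is a placeholder for any function that maps an image file to its 512-dimensional embedding; it is not part of pyvespa."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "batch-build-sketch",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Build one feed dictionary per image file (compute_embedding is hypothetical).\n",
    "batch = [\n",
    "    {\n",
    "        \"id\": file_name.rsplit(\".\", 1)[0],\n",
    "        \"fields\": {\n",
    "            \"image_file_name\": file_name,\n",
    "            \"embedding_image\": {\"values\": compute_embedding(file_name)},\n",
    "        },\n",
    "    }\n",
    "    for file_name in [\"dog1.jpg\", \"dog2.jpg\"]\n",
    "]\n",
    "responses = app.feed_batch(batch)"
   ]
  },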
  {
   "cell_type": "markdown",
   "id": "sharing-health",
   "metadata": {},
   "source": [
    "One of the advantages of having a Python API is that it can integrate with commonly used ML frameworks. The sample application [shows how to create a PyTorch DataLoader](https://github.com/vespa-engine/sample-apps/blob/master/text-image-search/src/python/embedding.py#L85-L113) that generates batches of image data, using CLIP models to compute the image embeddings. The next cells sketch the idea."
   ]
  },
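  {
   "cell_type": "markdown",
   "id": "dataloader-note",
   "metadata": {},
   "source": [
    "The sketch below is a simplified stand-in for the sample app's code, not a copy of it. It assumes a CLIP model and its matching preprocessing transform are already loaded (for example via `clip.load(\"ViT-B/32\")` from the [CLIP package](https://github.com/openai/CLIP)); the class and variable names are made up for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dataloader-sketch",
   "metadata": {},
   "outputs": [],
   "source": [
    "import glob\n",
    "import os\n",
    "\n",
    "import torch\n",
    "from PIL import Image\n",
    "from torch.utils.data import DataLoader, Dataset\n",
    "\n",
    "\n",
    "class ImageFeedDataset(Dataset):\n",
    "    # Pairs each image file with its CLIP embedding, as a Vespa feed dict.\n",
    "    def __init__(self, image_dir, model, preprocess):\n",
    "        self.image_files = glob.glob(os.path.join(image_dir, \"*.jpg\"))\n",
    "        self.model = model  # a CLIP model exposing encode_image\n",
    "        self.preprocess = preprocess  # the matching image transform\n",
    "\n",
    "    def __len__(self):\n",
    "        return len(self.image_files)\n",
    "\n",
    "    def __getitem__(self, idx):\n",
    "        file_path = self.image_files[idx]\n",
    "        image = self.preprocess(Image.open(file_path)).unsqueeze(0)\n",
    "        with torch.no_grad():\n",
    "            embedding = self.model.encode_image(image).squeeze(0)\n",
    "        file_name = os.path.basename(file_path)\n",
    "        return {\n",
    "            \"id\": os.path.splitext(file_name)[0],\n",
    "            \"fields\": {\n",
    "                \"image_file_name\": file_name,\n",
    "                \"embedding_image\": {\"values\": embedding.tolist()},\n",
    "            },\n",
    "        }\n",
    "\n",
    "\n",
    "# collate_fn=list keeps each batch as a plain list of feed dicts.\n",
    "loader = DataLoader(\n",
    "    ImageFeedDataset(\"images\", model, preprocess), batch_size=128, collate_fn=list\n",
    ")\n",
    "for batch in loader:\n",
    "    app.feed_batch(batch)"
   ]
  },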
  {
   "cell_type": "markdown",
   "id": "educational-danish",
   "metadata": {},
   "source": [
    "## Query the application"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "continental-student",
   "metadata": {},
   "source": [
    "The following query uses approximate nearest neighbor search to retrieve the images closest to the query text and ranks them by that distance. The sample application uses CLIP models to generate both the image and the query embeddings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "large-switch",
   "metadata": {},
   "outputs": [],
   "source": [
    "response = app.query(body={\n",
    "    \"yql\": 'select * from sources * where ([{\"targetNumHits\":100}]nearestNeighbor(embedding_image,embedding_text));',\n",
    "    \"hits\": 100,\n",
    "    \"ranking.features.query(embedding_text)\": [0.632, -0.987, ..., 0.534],\n",
    "    \"ranking.profile\": \"embedding_similarity\"\n",
    "})"
   ]
  },
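  {
   "cell_type": "markdown",
   "id": "response-hits-note",
   "metadata": {},
   "source": [
    "The returned `response` wraps the Vespa HTTP response. As a minimal sketch, assuming the schema defined above, the file names of the matched images can be collected from the hits:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "response-hits-sketch",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Each hit carries the document fields marked with summary indexing.\n",
    "image_file_names = [hit[\"fields\"][\"image_file_name\"] for hit in response.hits]"
   ]
  },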
  {
   "cell_type": "markdown",
   "id": "incoming-hollywood",
   "metadata": {},
   "source": [
    "## Evaluate different query models"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "governing-morgan",
   "metadata": {},
   "source": [
    "Define the metrics to evaluate:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "elder-tower",
   "metadata": {},
   "outputs": [],
   "source": [
    "from vespa.evaluation import MatchRatio, Recall, ReciprocalRank\n",
    "\n",
    "eval_metrics = [\n",
    "    MatchRatio(),\n",
    "    Recall(at=100),\n",
    "    ReciprocalRank(at=100)\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "caroline-devices",
   "metadata": {},
   "source": [
    "The sample application illustrates how to evaluate different CLIP models through the `evaluate` method:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "functional-stand",
   "metadata": {},
   "outputs": [],
   "source": [
    "result = app.evaluate(\n",
    "    labeled_data=labeled_data,  # Labeled data defining which images should be returned for a given query\n",
    "    eval_metrics=eval_metrics,  # The metrics defined above\n",
    "    query_model=query_models,  # Each query model uses a different CLIP model version\n",
    "    id_field=\"image_file_name\",  # The field the labeled data uses to identify an image\n",
    "    per_query=True  # Return results per query rather than aggregated\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "level-colors",
   "metadata": {},
   "source": [
    "The figure below shows the reciprocal rank at 100, computed from the output of the `evaluate` method."
   ]
  },
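  {
   "cell_type": "markdown",
   "id": "boxplot-note",
   "metadata": {},
   "source": [
    "A boxplot like the one below could be produced from the per-query results. This is a hypothetical sketch: the column names `model` and `reciprocal_rank_100` are assumptions about the output layout, which varies across pyvespa versions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "boxplot-sketch",
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "\n",
    "# One box per query model, over the per-query reciprocal rank values\n",
    "# (column names are assumed, not the exact pyvespa output schema).\n",
    "result.boxplot(column=\"reciprocal_rank_100\", by=\"model\")\n",
    "plt.show()"
   ]
  },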
  {
   "cell_type": "markdown",
   "id": "canadian-gambling",
   "metadata": {},
   "source": [
    "![evaluation](clip-evaluation-boxplot.png)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}