diff --git a/retrieval-augmented-generation/README.md b/retrieval-augmented-generation/README.md
index 45cf0c876..89f6c31d5 100644
--- a/retrieval-augmented-generation/README.md
+++ b/retrieval-augmented-generation/README.md
@@ -127,11 +127,12 @@ as a secret. The `services.xml` file must refer to the newly added secret in the
 secret store. Replace `<my-vault-name>` and `<my-secret-name>` below with your
 own values:
-
-    <secrets>
-      <openai-api-key vault="<my-vault-name>" name="<my-secret-name>"/>
-    </secrets>
-
+
+```xml
+<secrets>
+  <openai-api-key vault="<my-vault-name>" name="<my-secret-name>"/>
+</secrets>
+```
 Configure the vespa client. Replace `tenant-name` below with your tenant name.
 We use the application name `rag-app` here, but you are free to choose your own
@@ -141,19 +142,38 @@ $ vespa config set target cloud
 $ vespa config set application tenant-name.rag-app
-Authorize Vespa Cloud access and add your public certificates to the application:
+Log in and add your public certificates to the application for Dataplane access:
 $ vespa auth login
 $ vespa auth cert
 
-Deploy the application. This can take some time for all nodes to be provisioned:
+Assign application access to the secret.
+Applications must be created first, so one can use the Vespa Cloud Console to grant access.
+The easiest way to do this is to do a deployment, which will auto-create the application.
+The first deployment will fail:
+
 $ vespa deploy --wait 900
 
-Now the application should be deployed! You can continue to the
-[querying](#querying) section below for testing this application.
+```
+[09:47:43] warning Deployment failed: Invalid application: Vault 'my_vault' does not exist,
+or application does not have access to it
+```
+
+At this point, open the console
+(the link is like https://console.vespa-cloud.com/tenant/mytenant/account/secrets)
+and assign access:
+
+![edit application access dialog](/ext/edit-app-access.png)
+
+Deploy the application again. This can take some time for all nodes to be provisioned:
+
+$ vespa deploy --wait 900
+
+
+Now the application should be deployed!
 ## Querying
@@ -184,14 +204,17 @@ $ vespa query \
 traceLevel=1
-On Vespa cloud, just skip the `--header` parameter, as the API key is already
-set up in the services.xml file, and will be retrieved from the Vespa secret
-store.
-
-Here, we specifically set the search chain to `openai`. This calls the
-`RAGSearcher` which is set up to use the `OpenAI` client. Note that this
-requires an OpenAI API key, which is sent in the header. We also add a timeout
-as token generation can take some time.
+On Vespa cloud, just skip the `--header` parameter,
+as the API key is already set up in [services.xml](services.xml),
+and will be retrieved from the Vespa secret store.
+
+Here, we specifically set the search chain to `openai`.
+This calls the
+[RAGSearcher](https://github.com/vespa-engine/vespa/blob/master/container-search/src/main/java/ai/vespa/search/llm/RAGSearcher.java)
+which is set up to use the
+[OpenAI](https://github.com/vespa-engine/vespa/blob/master/model-integration/src/main/java/ai/vespa/llm/clients/OpenAI.java) client.
+Note that this requires an OpenAI API key.
+We also add a timeout as token generation can take some time.
 #### Local
diff --git a/retrieval-augmented-generation/ext/edit-app-access.png b/retrieval-augmented-generation/ext/edit-app-access.png
new file mode 100644
index 000000000..bc41193b8
Binary files /dev/null and b/retrieval-augmented-generation/ext/edit-app-access.png differ