Modify embedding-usvc to support multimodal embedding
- Change embedding-usvc chart to adapt to latest changes
- Support multimodal embedding

Signed-off-by: Lianhao Lu <[email protected]>
lianhao committed Jan 15, 2025
1 parent 7b35326 commit 07b02af
Showing 25 changed files with 813 additions and 49 deletions.
2 changes: 2 additions & 0 deletions helm-charts/common/embedding-usvc/.helmignore
Original file line number Diff line number Diff line change
@@ -21,3 +21,5 @@
.idea/
*.tmproj
.vscode/
# CI values
ci*-values.yaml
4 changes: 4 additions & 0 deletions helm-charts/common/embedding-usvc/Chart.yaml
@@ -13,3 +13,7 @@ dependencies:
version: 0-latest
repository: file://../tei
condition: tei.enabled
- name: mm-embedding
version: 0-latest
repository: file://../mm-embedding
condition: mm-embedding.enabled
61 changes: 40 additions & 21 deletions helm-charts/common/embedding-usvc/README.md
@@ -1,30 +1,42 @@
# embedding-usvc

Helm chart for deploying embedding microservice.
Helm chart for deploying OPEA embedding microservice.

embedding-usvc depends on TEI, set TEI_EMBEDDING_ENDPOINT.
## Installing the chart

## (Option1): Installing the chart separately
The OPEA embedding microservice depends on one of the following backend services:

First, you need to install the tei chart, please refer to the [tei](../tei) chart for more information.
- TEI: please refer to [tei](../tei) chart for more information

After you've deployed the tei chart successfully, please run `kubectl get svc` to get the tei service endpoint, i.e. `http://tei`.
- multimodal embedding BridgeTower: please refer to [mm-embedding](../mm-embedding) chart for more information.

- prediction guard: please refer to the external [Prediction Guard](https://predictionguard.com) for more information.

First, deploy the dependent service: install the tei or mm-embedding helm chart, or contact Prediction Guard for access information.

After you've deployed the dependent service successfully, please run `kubectl get svc` to get the backend service URL, e.g. `http://tei`, `http://mm-embedding`.

To install the embedding-usvc chart, run the following:

```console
cd GenAIInfra/helm-charts/common/embedding-usvc
export TEI_EMBEDDING_ENDPOINT="http://tei"
helm dependency update
helm install embedding-usvc . --set TEI_EMBEDDING_ENDPOINT=${TEI_EMBEDDING_ENDPOINT}
```

## (Option2): Installing the chart with dependencies automatically
# Use TEI as the backend (default)
export EMBEDDING_BACKEND="TEI"
export EMBEDDING_ENDPOINT="http://tei"
helm install embedding-usvc . --set EMBEDDING_BACKEND=${EMBEDDING_BACKEND} --set EMBEDDING_ENDPOINT=${EMBEDDING_ENDPOINT}

# Use multimodal embedding BridgeTower as the backend
# export EMBEDDING_BACKEND="BridgeTower"
# export EMBEDDING_ENDPOINT="http://mm-embedding"
# helm install embedding-usvc . --set EMBEDDING_BACKEND=${EMBEDDING_BACKEND} --set EMBEDDING_ENDPOINT=${EMBEDDING_ENDPOINT}

# Use PredictionGuard as the backend
# export EMBEDDING_BACKEND="PredictionGuard"
# export API_KEY=<your PredictionGuard api key>
# helm install embedding-usvc . --set EMBEDDING_BACKEND=${EMBEDDING_BACKEND} --set PREDICTIONGUARD_API_KEY=${API_KEY}

```console
cd GenAIInfra/helm-charts/common/embedding-usvc
helm dependency update
helm install embedding-usvc . --set tei.enabled=true
```

## Verify
@@ -36,17 +48,24 @@ Then run the command `kubectl port-forward svc/embedding-usvc 6000:6000` to expo
Open another terminal and run the following command to verify the service is working:

```console
# Verify with TEI or prediction guard backend:
curl http://localhost:6000/v1/embeddings \
-X POST \
-H 'Content-Type: application/json' \
-d '{"input":"What is Deep Learning?"}'

# Verify with multimodal embedding BridgeTower backend:
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"text":"hello"}' \
-H 'Content-Type: application/json'
-H 'Content-Type: application/json' \
-d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}'
```
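The two request bodies above differ by backend: the TEI and Prediction Guard paths take an OpenAI-style `input` string, while the BridgeTower path takes separate text and image fields. A minimal sketch of building those payloads programmatically (shapes taken from the curl examples in this README, not from an API specification):

```python
import json

# Payload for the TEI / PredictionGuard backends (OpenAI-style input).
text_payload = {"input": "What is Deep Learning?"}

# Payload for the multimodal BridgeTower backend: separate text and image parts.
mm_payload = {
    "text": {"text": "This is some sample text."},
    "image": {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"},
}

# Serialize exactly as the -d arguments above do.
body = json.dumps(mm_payload)
print(body)
```

These bodies could then be POSTed to `http://localhost:6000/v1/embeddings` with any HTTP client, as the curl examples show.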

## Values

| Key | Type | Default | Description |
| ---------------------- | ------ | ---------------------- | ----------- |
| image.repository | string | `"opea/embedding-tei"` | |
| service.port | string | `"6000"` | |
| TEI_EMBEDDING_ENDPOINT | string | `""` | |
| global.monitoring | bool | `false` | |
| Key | Type | Default | Description |
| ------------------ | ------ | -------- | --------------------------------------------------------------------- |
| service.port | string | `"6000"` | |
| EMBEDDING_BACKEND | string | `"TEI"` | backend engine to use, one of "TEI", "BridgeTower", "PredictionGuard" |
| EMBEDDING_ENDPOINT | string | `""` | |
| global.monitoring | bool | `false` | |
13 changes: 13 additions & 0 deletions helm-charts/common/embedding-usvc/ci-multimodal-values.yaml
@@ -0,0 +1,13 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Default values for embedding-usvc.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

tei:
enabled: false
mm-embedding:
enabled: true

EMBEDDING_BACKEND: "BridgeTower"
4 changes: 4 additions & 0 deletions helm-charts/common/embedding-usvc/ci-values.yaml
@@ -7,3 +7,7 @@

tei:
enabled: true
mm-embedding:
enabled: false

EMBEDDING_BACKEND: "TEI"
25 changes: 22 additions & 3 deletions helm-charts/common/embedding-usvc/templates/configmap.yaml
@@ -8,15 +8,34 @@ metadata:
labels:
{{- include "embedding-usvc.labels" . | nindent 4 }}
data:
{{- if .Values.TEI_EMBEDDING_ENDPOINT }}
TEI_EMBEDDING_ENDPOINT: {{ .Values.TEI_EMBEDDING_ENDPOINT | quote }}
{{- if eq "TEI" .Values.EMBEDDING_BACKEND }}
EMBEDDING_COMPONENT_NAME: "OPEA_TEI_EMBEDDING"
MULTIMODAL_EMBEDDING: "false"
{{- if .Values.EMBEDDING_ENDPOINT }}
TEI_EMBEDDING_ENDPOINT: {{ tpl .Values.EMBEDDING_ENDPOINT . | quote }}
{{- else }}
TEI_EMBEDDING_ENDPOINT: "http://{{ .Release.Name }}-tei"
{{- end }}
{{- else if eq "PredictionGuard" .Values.EMBEDDING_BACKEND }}
MULTIMODAL_EMBEDDING: "false"
EMBEDDING_COMPONENT_NAME: "OPEA_PREDICTIONGUARD_EMBEDDING"
PG_EMBEDDING_MODEL_NAME: {{ .Values.PG_EMBEDDING_MODEL_NAME | quote }}
PREDICTIONGUARD_API_KEY: {{ .Values.PREDICTIONGUARD_API_KEY | quote }}
{{- else if eq "BridgeTower" .Values.EMBEDDING_BACKEND }}
MULTIMODAL_EMBEDDING: "true"
EMBEDDING_COMPONENT_NAME: "OPEA_MULTIMODAL_EMBEDDING_BRIDGETOWER"
{{- if .Values.EMBEDDING_ENDPOINT }}
MMEI_EMBEDDING_ENDPOINT: {{ tpl .Values.EMBEDDING_ENDPOINT . | quote }}
{{- else }}
MMEI_EMBEDDING_ENDPOINT: "http://{{ .Release.Name }}-mm-embedding"
{{- end }}
{{- else }}
{{- cat "Invalid EMBEDDING_BACKEND:" .Values.EMBEDDING_BACKEND | fail }}
{{- end }}
http_proxy: {{ .Values.global.http_proxy | quote }}
https_proxy: {{ .Values.global.https_proxy | quote }}
{{- if and (not .Values.TEI_EMBEDDING_ENDPOINT) (or .Values.global.http_proxy .Values.global.https_proxy) }}
no_proxy: "{{ .Release.Name }}-tei,{{ .Values.global.no_proxy }}"
no_proxy: "{{ .Release.Name }}-tei,{{ .Release.Name }}-mm-embedding,{{ .Values.global.no_proxy }}"
{{- else }}
no_proxy: {{ .Values.global.no_proxy | quote }}
{{- end }}
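For reference, with `EMBEDDING_BACKEND=BridgeTower` and no `EMBEDDING_ENDPOINT` set, the template above would render roughly the following `data` section (a sketch only; release name assumed to be `embedding-usvc`, proxy values left empty):

```yaml
# Hypothetical rendered output of the configmap template for the
# BridgeTower backend; the endpoint falls back to the in-release service.
data:
  EMBEDDING_COMPONENT_NAME: "OPEA_MULTIMODAL_EMBEDDING_BRIDGETOWER"
  MULTIMODAL_EMBEDDING: "true"
  MMEI_EMBEDDING_ENDPOINT: "http://embedding-usvc-mm-embedding"
  http_proxy: ""
  https_proxy: ""
  no_proxy: ""
```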
39 changes: 38 additions & 1 deletion helm-charts/common/embedding-usvc/templates/deployment.yaml
@@ -28,8 +28,45 @@ spec:
serviceAccountName: {{ include "embedding-usvc.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
{{- if or (eq "TEI" .Values.EMBEDDING_BACKEND) (eq "BridgeTower" .Values.EMBEDDING_BACKEND) }}
initContainers:
- name: wait-for-embedding
envFrom:
- configMapRef:
name: {{ include "embedding-usvc.fullname" . }}-config
{{- if .Values.global.extraEnvConfig }}
- configMapRef:
name: {{ .Values.global.extraEnvConfig }}
optional: true
{{- end }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: busybox:1.36
command: ["sh", "-c"]
args:
- |
{{- if eq "TEI" .Values.EMBEDDING_BACKEND }}
endpoint=${TEI_EMBEDDING_ENDPOINT};
{{- else }}
endpoint=${MMEI_EMBEDDING_ENDPOINT};
{{- end }}
proto=$(echo $endpoint | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\1/p');
host=$(echo $endpoint | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\2/p');
port=$(echo $endpoint | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\3/p');
if [ -z "$port" ]; then
port=80;
[[ "$proto" = "https" ]] && port=443;
fi;
retry_count={{ .Values.retryCount | default 60 }};
j=1;
while ! nc -z ${host} ${port}; do
[[ $j -ge ${retry_count} ]] && echo "ERROR: ${host}:${port} is NOT reachable in $j seconds!" && exit 1;
j=$((j+1)); sleep 1;
done;
echo "${host}:${port} is reachable within $j seconds.";
{{- end }}
containers:
- name: {{ .Release.Name }}
- name: {{ .Chart.Name }}
envFrom:
- configMapRef:
name: {{ include "embedding-usvc.fullname" . }}-config
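The init container's endpoint parsing can be tried outside Kubernetes; a minimal sketch mirroring the sed commands above (assuming a shell with GNU/busybox sed; the endpoint value here is hypothetical):

```shell
# Split a backend endpoint URL into proto/host/port, defaulting the port
# from the scheme when the URL does not carry one, as the init container does.
endpoint="https://mm-embedding"
proto=$(echo "$endpoint" | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\1/p')
host=$(echo "$endpoint" | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\2/p')
port=$(echo "$endpoint" | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\3/p')
if [ -z "$port" ]; then
    port=80
    [ "$proto" = "https" ] && port=443
fi
echo "$proto $host $port"
# prints: https mm-embedding 443
```

In the deployment itself this result feeds the `nc -z ${host} ${port}` readiness loop, which retries once per second up to `retryCount` times.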
@@ -21,7 +21,11 @@ spec:
for ((i=1; i<=max_retry; i++)); do
curl http://{{ include "embedding-usvc.fullname" . }}:{{ .Values.service.port }}/v1/embeddings -sS --fail-with-body \
-X POST \
-d '{"text":"hello"}' \
{{- if eq "BridgeTower" .Values.EMBEDDING_BACKEND }}
-d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}' \
{{- else }}
-d '{"input":"What is Deep Learning?"}' \
{{- end }}
-H 'Content-Type: application/json' && break;
curlcode=$?
if [[ $curlcode -eq 7 ]]; then sleep 10; else echo "curl failed with code $curlcode"; exit 1; fi;
52 changes: 29 additions & 23 deletions helm-charts/common/embedding-usvc/values.yaml
@@ -5,18 +5,29 @@
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

tei:
enabled: false
# Configurations for OPEA microservice embedding-usvc
# Set it as a non-null string, such as true, if you want to enable logging facility.
LOGFLAG: ""

replicaCount: 1
# The embedding service needs one of the backend embedding engines: TEI, multimodal BridgeTower, or PredictionGuard.
# Default is to use TEI (text-embeddings-inference) as the backend
EMBEDDING_BACKEND: "TEI"

# Set it as a non-null string, such as true, if you want to enable logging facility,
# otherwise, keep it as "" to disable it.
LOGFLAG: ""
# Uncomment and set the following settings to use PredictionGuard as the backend
# EMBEDDING_BACKEND: "PredictionGuard"
PG_EMBEDDING_MODEL_NAME: "bridgetower-large-itm-mlm-itc"
PREDICTIONGUARD_API_KEY: ""

# Uncomment and set the following settings to use embedding-multimodal-bridgetower as the backend
# EMBEDDING_BACKEND: "BridgeTower"

# common backend embedding service endpoint URL, e.g. "http://tei:80", "http://mm-embedding:80"
EMBEDDING_ENDPOINT: ""

replicaCount: 1

TEI_EMBEDDING_ENDPOINT: ""
image:
repository: opea/embedding-tei
repository: opea/embedding
# Uncomment the following line to set desired image pull policy if needed, as one of Always, IfNotPresent, Never.
# pullPolicy: ""
# Overrides the image tag whose default is the chart appVersion.
@@ -58,25 +69,14 @@ service:
# The default port for embedding service is 6000
port: 6000

resources: {}
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
resources:
# limits:
# cpu: 100m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
requests:
cpu: 100m
memory: 128Mi

livenessProbe:
httpGet:
path: v1/health_check
port: embedding-usvc
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 24
readinessProbe:
httpGet:
path: v1/health_check
@@ -111,3 +111,9 @@ global:

# Prometheus Helm install release name for serviceMonitor
prometheusRelease: prometheus-stack

# The following is for CI tests only
tei:
enabled: false
mm-embedding:
enabled: false
25 changes: 25 additions & 0 deletions helm-charts/common/mm-embedding/.helmignore
@@ -0,0 +1,25 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
# CI values
ci*-values.yaml
9 changes: 9 additions & 0 deletions helm-charts/common/mm-embedding/Chart.yaml
@@ -0,0 +1,9 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
appVersion: "1.1"
description: A Helm chart for deploying OPEA multimodal embedding microservices
name: mm-embedding
type: application
version: 0-latest
58 changes: 58 additions & 0 deletions helm-charts/common/mm-embedding/README.md
@@ -0,0 +1,58 @@
# OPEA mm-embedding microservice

Helm chart for deploying OPEA multimodal embedding service.

## Installing the Chart

To install the chart, run the following:

```console
cd GenAIInfra/helm-charts/common
export MODELDIR=/mnt/opea-models
export HFTOKEN="insert-your-huggingface-token-here"
# To deploy embedding-multimodal-bridgetower microservice on CPU
helm install mm-embedding mm-embedding --set global.modelUseHostPath=${MODELDIR} --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
# To deploy embedding-multimodal-bridgetower microservice on Gaudi
# helm install mm-embedding mm-embedding --set global.modelUseHostPath=${MODELDIR} --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --values mm-embedding/gaudi-values.yaml
# To deploy embedding-multimodal-clip microservice on CPU
# helm install mm-embedding mm-embedding --set global.modelUseHostPath=${MODELDIR} --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --values mm-embedding/variant_clip-values.yaml
```

By default, the embedding-multimodal-bridgetower service will download the "BridgeTower/bridgetower-large-itm-mlm-itc" model which is about 3.5GB, and the embedding-multimodal-clip service will download the "openai/clip-vit-base-patch32" model which is about 1.7GB.

If you already cached the model locally, you can pass it to the container as in this example:

```console
MODELDIR=/mnt/opea-models
MODELNAME="/data/models--BridgeTower--bridgetower-large-itm-mlm-itc"
```

## Verify

To verify the installation, run the command `kubectl get pod` to make sure all pods are running and in ready state.

Then run the command `kubectl port-forward svc/mm-embedding 6990:6990` to expose the mm-embedding service for access.

Open another terminal and run the following command to verify the service is working:

```console
# Verify with embedding-multimodal-bridgetower
curl http://localhost:6990/v1/encode \
-XPOST \
-d '{"text":"This is example"}' \
-H 'Content-Type: application/json'

# Verify with embedding-multimodal-clip
curl http://localhost:6990/v1/embeddings \
-XPOST \
-d '{"text":"This is example"}' \
-H 'Content-Type: application/json'
```

## Values

| Key | Type | Default | Description |
| ------------------------------- | ------ | ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| global.HUGGINGFACEHUB_API_TOKEN | string | `insert-your-huggingface-token-here` | Hugging Face API token |
| global.modelUseHostPath | string | `""` | Cached models directory; the service will not download the model if it is cached here. The host path "modelUseHostPath" will be mounted into the container as the /data directory. Setting this to null/empty will force the model to be downloaded. |
| autoscaling.enabled | bool | `false` | Enable HPA autoscaling for the service deployment based on metrics it provides. See [HPA instructions](../../HPA.md) before enabling! |
| global.monitoring | bool | `false` | Enable usage metrics for the service. Required for HPA. See [monitoring instructions](../../monitoring.md) before enabling! |
1 change: 1 addition & 0 deletions helm-charts/common/mm-embedding/ci-clip-values.yaml