Modify embedding-usvc to support multimodal embedding
- Change embedding-usvc chart to adapt to latest changes
- Support multimodal embedding

Signed-off-by: Lianhao Lu <[email protected]>
lianhao committed Jan 15, 2025
1 parent 7b35326 commit 07b02af
Showing 25 changed files with 813 additions and 49 deletions.
2 changes: 2 additions & 0 deletions helm-charts/common/embedding-usvc/.helmignore
Original file line number Diff line number Diff line change
@@ -21,3 +21,5 @@
.idea/
*.tmproj
.vscode/
# CI values
ci*-values.yaml
4 changes: 4 additions & 0 deletions helm-charts/common/embedding-usvc/Chart.yaml
@@ -13,3 +13,7 @@ dependencies:
version: 0-latest
repository: file://../tei
condition: tei.enabled
- name: mm-embedding
version: 0-latest
repository: file://../mm-embedding
condition: mm-embedding.enabled
61 changes: 40 additions & 21 deletions helm-charts/common/embedding-usvc/README.md
@@ -1,30 +1,42 @@
# embedding-usvc

Helm chart for deploying embedding microservice.
Helm chart for deploying OPEA embedding microservice.

embedding-usvc depends on TEI, set TEI_EMBEDDING_ENDPOINT.
## Installing the chart

## (Option1): Installing the chart separately
The OPEA embedding microservice depends on one of the following backend services:

First, you need to install the tei chart, please refer to the [tei](../tei) chart for more information.
- TEI: please refer to [tei](../tei) chart for more information

After you've deployed the tei chart successfully, please run `kubectl get svc` to get the tei service endpoint, i.e. `http://tei`.
- multimodal embedding BridgeTower: please refer to [mm-embedding](../mm-embedding) chart for more information.

- prediction guard: please refer to the external [Prediction Guard](https://predictionguard.com) for more information.

First, deploy the dependent service: install the tei or mm-embedding helm chart, or contact Prediction Guard for access information.

After you've deployed the dependent service successfully, please run `kubectl get svc` to get the backend service URL, e.g. `http://tei`, `http://mm-embedding`.

To install the embedding-usvc chart, run the following:

```console
cd GenAIInfra/helm-charts/common/embedding-usvc
export TEI_EMBEDDING_ENDPOINT="http://tei"
helm dependency update
helm install embedding-usvc . --set TEI_EMBEDDING_ENDPOINT=${TEI_EMBEDDING_ENDPOINT}
```

## (Option2): Installing the chart with dependencies automatically
# Use TEI as the backend (default)
export EMBEDDING_BACKEND="TEI"
export EMBEDDING_ENDPOINT="http://tei"
helm install embedding-usvc . --set EMBEDDING_BACKEND=${EMBEDDING_BACKEND} --set EMBEDDING_ENDPOINT=${EMBEDDING_ENDPOINT}

# Use multimodal embedding BridgeTower as the backend
# export EMBEDDING_BACKEND="BridgeTower"
# export EMBEDDING_ENDPOINT="http://mm-embedding"
# helm install embedding-usvc . --set EMBEDDING_BACKEND=${EMBEDDING_BACKEND} --set EMBEDDING_ENDPOINT=${EMBEDDING_ENDPOINT}

# Use PredictionGuard as the backend
# export EMBEDDING_BACKEND="PredictionGuard"
# export API_KEY=<your PredictionGuard api key>
# helm install embedding-usvc . --set EMBEDDING_BACKEND=${EMBEDDING_BACKEND} --set PREDICTIONGUARD_API_KEY=${API_KEY}

```console
cd GenAIInfra/helm-charts/common/embedding-usvc
helm dependency update
helm install embedding-usvc . --set tei.enabled=true
```

## Verify
@@ -36,17 +48,24 @@ Then run the command `kubectl port-forward svc/embedding-usvc 6000:6000` to expo
Open another terminal and run the following command to verify the service is working:

```console
# Verify with TEI or prediction guard backend:
curl http://localhost:6000/v1/embeddings \
-X POST \
-H 'Content-Type: application/json' \
-d '{"input":"What is Deep Learning?"}'

# Verify with multimodal embedding BridgeTower backend:
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"text":"hello"}' \
-H 'Content-Type: application/json'
-H 'Content-Type: application/json' \
-d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}'
```
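The two request bodies above differ by backend: the TEI and Prediction Guard paths take an OpenAI-style `input` string, while the BridgeTower path takes separate text and image fields. A minimal sketch of building those payloads programmatically (shapes taken from the curl examples in this README, not from an API specification):

```python
import json

# Payload for the TEI / PredictionGuard backends (OpenAI-style input).
text_payload = {"input": "What is Deep Learning?"}

# Payload for the multimodal BridgeTower backend: separate text and image parts.
mm_payload = {
    "text": {"text": "This is some sample text."},
    "image": {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"},
}

# Serialize exactly as the -d arguments above do.
body = json.dumps(mm_payload)
print(body)
```

These bodies could then be POSTed to `http://localhost:6000/v1/embeddings` with any HTTP client, as the curl examples show.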

## Values

| Key | Type | Default | Description |
| ---------------------- | ------ | ---------------------- | ----------- |
| image.repository | string | `"opea/embedding-tei"` | |
| service.port | string | `"6000"` | |
| TEI_EMBEDDING_ENDPOINT | string | `""` | |
| global.monitoring | bool | `false` | |
| Key | Type | Default | Description |
| ------------------ | ------ | -------- | --------------------------------------------------------------------- |
| service.port | string | `"6000"` | |
| EMBEDDING_BACKEND | string | `"TEI"` | backend engine to use, one of "TEI", "BridgeTower", "PredictionGuard" |
| EMBEDDING_ENDPOINT | string | `""` | |
| global.monitoring | bool | `false` | |
13 changes: 13 additions & 0 deletions helm-charts/common/embedding-usvc/ci-multimodal-values.yaml
@@ -0,0 +1,13 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Default values for embedding-usvc.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

tei:
enabled: false
mm-embedding:
enabled: true

EMBEDDING_BACKEND: "BridgeTower"
4 changes: 4 additions & 0 deletions helm-charts/common/embedding-usvc/ci-values.yaml
@@ -7,3 +7,7 @@

tei:
enabled: true
mm-embedding:
enabled: false

EMBEDDING_BACKEND: "TEI"
25 changes: 22 additions & 3 deletions helm-charts/common/embedding-usvc/templates/configmap.yaml
@@ -8,15 +8,34 @@ metadata:
labels:
{{- include "embedding-usvc.labels" . | nindent 4 }}
data:
{{- if .Values.TEI_EMBEDDING_ENDPOINT }}
TEI_EMBEDDING_ENDPOINT: {{ .Values.TEI_EMBEDDING_ENDPOINT | quote }}
{{- if eq "TEI" .Values.EMBEDDING_BACKEND }}
EMBEDDING_COMPONENT_NAME: "OPEA_TEI_EMBEDDING"
MULTIMODAL_EMBEDDING: "false"
{{- if .Values.EMBEDDING_ENDPOINT }}
TEI_EMBEDDING_ENDPOINT: {{ tpl .Values.EMBEDDING_ENDPOINT . | quote }}
{{- else }}
TEI_EMBEDDING_ENDPOINT: "http://{{ .Release.Name }}-tei"
{{- end }}
{{- else if eq "PredictionGuard" .Values.EMBEDDING_BACKEND }}
MULTIMODAL_EMBEDDING: "false"
EMBEDDING_COMPONENT_NAME: "OPEA_PREDICTIONGUARD_EMBEDDING"
PG_EMBEDDING_MODEL_NAME: {{ .Values.PG_EMBEDDING_MODEL_NAME | quote }}
PREDICTIONGUARD_API_KEY: {{ .Values.PREDICTIONGUARD_API_KEY | quote }}
{{- else if eq "BridgeTower" .Values.EMBEDDING_BACKEND }}
MULTIMODAL_EMBEDDING: "true"
EMBEDDING_COMPONENT_NAME: "OPEA_MULTIMODAL_EMBEDDING_BRIDGETOWER"
{{- if .Values.EMBEDDING_ENDPOINT }}
MMEI_EMBEDDING_ENDPOINT: {{ tpl .Values.EMBEDDING_ENDPOINT . | quote }}
{{- else }}
MMEI_EMBEDDING_ENDPOINT: "http://{{ .Release.Name }}-mm-embedding"
{{- end }}
{{- else }}
{{- cat "Invalid EMBEDDING_BACKEND:" .Values.EMBEDDING_BACKEND | fail }}
{{- end }}
http_proxy: {{ .Values.global.http_proxy | quote }}
https_proxy: {{ .Values.global.https_proxy | quote }}
{{- if and (not .Values.TEI_EMBEDDING_ENDPOINT) (or .Values.global.http_proxy .Values.global.https_proxy) }}
no_proxy: "{{ .Release.Name }}-tei,{{ .Values.global.no_proxy }}"
no_proxy: "{{ .Release.Name }}-tei,{{ .Release.Name }}-mm-embedding,{{ .Values.global.no_proxy }}"
{{- else }}
no_proxy: {{ .Values.global.no_proxy | quote }}
{{- end }}
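For reference, with `EMBEDDING_BACKEND=BridgeTower` and no `EMBEDDING_ENDPOINT` set, the template above would render roughly the following `data` section (a sketch only; release name assumed to be `embedding-usvc`, proxy values left empty):

```yaml
# Hypothetical rendered output of the configmap template for the
# BridgeTower backend; the endpoint falls back to the in-release service.
data:
  EMBEDDING_COMPONENT_NAME: "OPEA_MULTIMODAL_EMBEDDING_BRIDGETOWER"
  MULTIMODAL_EMBEDDING: "true"
  MMEI_EMBEDDING_ENDPOINT: "http://embedding-usvc-mm-embedding"
  http_proxy: ""
  https_proxy: ""
  no_proxy: ""
```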
39 changes: 38 additions & 1 deletion helm-charts/common/embedding-usvc/templates/deployment.yaml
@@ -28,8 +28,45 @@ spec:
serviceAccountName: {{ include "embedding-usvc.serviceAccountName" . }}
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
{{- if or (eq "TEI" .Values.EMBEDDING_BACKEND) (eq "BridgeTower" .Values.EMBEDDING_BACKEND) }}
initContainers:
- name: wait-for-embedding
envFrom:
- configMapRef:
name: {{ include "embedding-usvc.fullname" . }}-config
{{- if .Values.global.extraEnvConfig }}
- configMapRef:
name: {{ .Values.global.extraEnvConfig }}
optional: true
{{- end }}
securityContext:
{{- toYaml .Values.securityContext | nindent 12 }}
image: busybox:1.36
command: ["sh", "-c"]
args:
- |
{{- if eq "TEI" .Values.EMBEDDING_BACKEND }}
endpoint=${TEI_EMBEDDING_ENDPOINT};
{{- else }}
endpoint=${MMEI_EMBEDDING_ENDPOINT};
{{- end }}
proto=$(echo $endpoint | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\1/p');
host=$(echo $endpoint | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\2/p');
port=$(echo $endpoint | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\3/p');
if [ -z "$port" ]; then
port=80;
[[ "$proto" = "https" ]] && port=443;
fi;
retry_count={{ .Values.retryCount | default 60 }};
j=1;
while ! nc -z ${host} ${port}; do
[[ $j -ge ${retry_count} ]] && echo "ERROR: ${host}:${port} is NOT reachable in $j seconds!" && exit 1;
j=$((j+1)); sleep 1;
done;
echo "${host}:${port} is reachable within $j seconds.";
{{- end }}
containers:
- name: {{ .Release.Name }}
- name: {{ .Chart.Name }}
envFrom:
- configMapRef:
name: {{ include "embedding-usvc.fullname" . }}-config
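The init container's endpoint parsing can be tried outside Kubernetes; a minimal sketch mirroring the sed commands above (assuming a shell with GNU/busybox sed; the endpoint value here is hypothetical):

```shell
# Split a backend endpoint URL into proto/host/port, defaulting the port
# from the scheme when the URL does not carry one, as the init container does.
endpoint="https://mm-embedding"
proto=$(echo "$endpoint" | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\1/p')
host=$(echo "$endpoint" | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\2/p')
port=$(echo "$endpoint" | sed -n 's/.*\(http[s]\?\):\/\/\([^ :]\+\):\?\([0-9]*\).*/\3/p')
if [ -z "$port" ]; then
    port=80
    [ "$proto" = "https" ] && port=443
fi
echo "$proto $host $port"
# prints: https mm-embedding 443
```

In the deployment itself this result feeds the `nc -z ${host} ${port}` readiness loop, which retries once per second up to `retryCount` times.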
@@ -21,7 +21,11 @@ spec:
for ((i=1; i<=max_retry; i++)); do
curl http://{{ include "embedding-usvc.fullname" . }}:{{ .Values.service.port }}/v1/embeddings -sS --fail-with-body \
-X POST \
-d '{"text":"hello"}' \
{{- if eq "BridgeTower" .Values.EMBEDDING_BACKEND }}
-d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}' \
{{- else }}
-d '{"input":"What is Deep Learning?"}' \
{{- end }}
-H 'Content-Type: application/json' && break;
curlcode=$?
if [[ $curlcode -eq 7 ]]; then sleep 10; else echo "curl failed with code $curlcode"; exit 1; fi;
52 changes: 29 additions & 23 deletions helm-charts/common/embedding-usvc/values.yaml
@@ -5,18 +5,29 @@
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

tei:
enabled: false
# Configurations for OPEA microservice embedding-usvc
# Set it as a non-null string, such as true, if you want to enable logging facility.
LOGFLAG: ""

replicaCount: 1
# The embedding service needs one of the backend embedding engines: TEI, multimodal BridgeTower, or PredictionGuard.
# Default is to use TEI (text-embeddings-inference) as the backend
EMBEDDING_BACKEND: "TEI"

# Set it as a non-null string, such as true, if you want to enable logging facility,
# otherwise, keep it as "" to disable it.
LOGFLAG: ""
# Uncomment and set the following settings to use PredictionGuard as the backend
# EMBEDDING_BACKEND: "PredictionGuard"
PG_EMBEDDING_MODEL_NAME: "bridgetower-large-itm-mlm-itc"
PREDICTIONGUARD_API_KEY: ""

# Uncomment and set the following settings to use embedding-multimodal-bridgetower as the backend
# EMBEDDING_BACKEND: "BridgeTower"

# common backend embedding service endpoint URL, e.g. "http://tei:80", "http://mm-embedding:80"
EMBEDDING_ENDPOINT: ""

replicaCount: 1

TEI_EMBEDDING_ENDPOINT: ""
image:
repository: opea/embedding-tei
repository: opea/embedding
# Uncomment the following line to set desired image pull policy if needed, as one of Always, IfNotPresent, Never.
# pullPolicy: ""
# Overrides the image tag whose default is the chart appVersion.
@@ -58,25 +69,14 @@ service:
# The default port for embedding service is 6000
port: 6000

resources: {}
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
resources:
# limits:
# cpu: 100m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
requests:
cpu: 100m
memory: 128Mi

livenessProbe:
httpGet:
path: v1/health_check
port: embedding-usvc
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 24
readinessProbe:
httpGet:
path: v1/health_check
@@ -111,3 +111,9 @@ global:

# Prometheus Helm install release name for serviceMonitor
prometheusRelease: prometheus-stack

# The following is for CI tests only
tei:
enabled: false
mm-embedding:
enabled: false
25 changes: 25 additions & 0 deletions helm-charts/common/mm-embedding/.helmignore
@@ -0,0 +1,25 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
# CI values
ci*-values.yaml
9 changes: 9 additions & 0 deletions helm-charts/common/mm-embedding/Chart.yaml
@@ -0,0 +1,9 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
appVersion: "1.1"
description: A Helm chart for deploying OPEA multimodal embedding microservices
name: mm-embedding
type: application
version: 0-latest
58 changes: 58 additions & 0 deletions helm-charts/common/mm-embedding/README.md
@@ -0,0 +1,58 @@
# OPEA mm-embedding microservice

Helm chart for deploying OPEA multimodal embedding service.

## Installing the Chart

To install the chart, run the following:

```console
cd GenAIInfra/helm-charts/common
export MODELDIR=/mnt/opea-models
export HFTOKEN="insert-your-huggingface-token-here"
# To deploy embedding-multimodal-bridgetower microservice on CPU
helm install mm-embedding mm-embedding --set global.modelUseHostPath=${MODELDIR} --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
# To deploy embedding-multimodal-bridgetower microservice on Gaudi
# helm install mm-embedding mm-embedding --set global.modelUseHostPath=${MODELDIR} --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --values mm-embedding/gaudi-values.yaml
# To deploy embedding-multimodal-clip microservice on CPU
# helm install mm-embedding mm-embedding --set global.modelUseHostPath=${MODELDIR} --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --values mm-embedding/variant_clip-values.yaml
```

By default, the embedding-multimodal-bridgetower service will download the "BridgeTower/bridgetower-large-itm-mlm-itc" model which is about 3.5GB, and the embedding-multimodal-clip service will download the "openai/clip-vit-base-patch32" model which is about 1.7GB.

If you already cached the model locally, you can pass it to the container as in this example:

```console
MODELDIR=/mnt/opea-models
MODELNAME="/data/models--BridgeTower--bridgetower-large-itm-mlm-itc"
```

## Verify

To verify the installation, run the command `kubectl get pod` to make sure all pods are running and in ready state.

Then run the command `kubectl port-forward svc/mm-embedding 6990:6990` to expose the mm-embedding service for access.

Open another terminal and run the following command to verify the service is working:

```console
# Verify with embedding-multimodal-bridgetower
curl http://localhost:6990/v1/encode \
-XPOST \
-d '{"text":"This is example"}' \
-H 'Content-Type: application/json'

# Verify with embedding-multimodal-clip
curl http://localhost:6990/v1/embeddings \
-XPOST \
-d '{"text":"This is example"}' \
-H 'Content-Type: application/json'
```

## Values

| Key | Type | Default | Description |
| ------------------------------- | ------ | ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| global.HUGGINGFACEHUB_API_TOKEN | string | `insert-your-huggingface-token-here` | Hugging Face API token |
| global.modelUseHostPath | string | `""` | Cached models directory; the service will not download the model if it is cached here. The host path "modelUseHostPath" will be mounted into the container as the /data directory. Setting this to null/empty will force the model to be downloaded. |
| autoscaling.enabled | bool | `false` | Enable HPA autoscaling for the service deployment based on metrics it provides. See [HPA instructions](../../HPA.md) before enabling! |
| global.monitoring | bool | `false` | Enable usage metrics for the service. Required for HPA. See [monitoring instructions](../../monitoring.md) before enabling! |
1 change: 1 addition & 0 deletions helm-charts/common/mm-embedding/ci-clip-values.yaml