Modify embedding-usvc to support multimodal embedding
- Change embedding-usvc chart to adapt to latest changes
- Support multimodal embedding

Signed-off-by: Lianhao Lu <[email protected]>
lianhao committed Jan 10, 2025
1 parent d9e5ed0 commit cdd38a2
Showing 27 changed files with 780 additions and 64 deletions.
2 changes: 2 additions & 0 deletions helm-charts/common/embedding-usvc/.helmignore
Original file line number Diff line number Diff line change
@@ -21,3 +21,5 @@
.idea/
*.tmproj
.vscode/
# CI values
ci*-values.yaml
4 changes: 4 additions & 0 deletions helm-charts/common/embedding-usvc/Chart.yaml
@@ -13,3 +13,7 @@ dependencies:
version: 0-latest
repository: file://../tei
condition: tei.enabled
- name: mm-embedding
version: 0-latest
repository: file://../mm-embedding
condition: mm-embedding.enabled
61 changes: 40 additions & 21 deletions helm-charts/common/embedding-usvc/README.md
@@ -1,30 +1,42 @@
# embedding-usvc

Helm chart for deploying embedding microservice.
Helm chart for deploying OPEA embedding microservice.

embedding-usvc depends on TEI, set TEI_EMBEDDING_ENDPOINT.
## Installing the chart

## (Option1): Installing the chart separately
The OPEA embedding microservice depends on one of the following backend services:

First, you need to install the tei chart, please refer to the [tei](../tei) chart for more information.
- text embedding inference: please refer to [tei](../tei) chart for more information

After you've deployed the tei chart successfully, please run `kubectl get svc` to get the tei service endpoint, i.e. `http://tei`.
- multimodal embedding bridgetower: please refer to [mm-embedding](../mm-embedding) chart for more information.

- prediction guard: please refer to the external [Prediction Guard](https://predictionguard.com) for more information.

First, you need to get the dependent service deployed, e.g. deploy the tei helm chart or the mm-embedding helm chart, or contact Prediction Guard to get access info.

After you've deployed the backend service successfully, please run `kubectl get svc` to get the backend service URL, e.g. `http://tei`, `http://mm-embedding`.

To install the embedding-usvc chart, run the following:

```console
cd GenAIInfra/helm-charts/common/embedding-usvc
export TEI_EMBEDDING_ENDPOINT="http://tei"
helm dependency update
helm install embedding-usvc . --set TEI_EMBEDDING_ENDPOINT=${TEI_EMBEDDING_ENDPOINT}
```

## (Option2): Installing the chart with dependencies automatically
# Use tei as the backend(default)
export EMBEDDING_COMPONENT_NAME="OPEA_TEI_EMBEDDING"
export EMBEDDING_ENDPOINT="http://tei"
helm install embedding-usvc . --set EMBEDDING_COMPONENT_NAME=${EMBEDDING_COMPONENT_NAME} --set TEI_EMBEDDING_ENDPOINT=${EMBEDDING_ENDPOINT}

# Use multimodal embedding bridgetower as the backend
# export EMBEDDING_COMPONENT_NAME="OPEA_MULTIMODAL_EMBEDDING_BRIDGETOWER"
# export EMBEDDING_ENDPOINT="http://mm-embedding"
# helm install embedding-usvc . --set EMBEDDING_COMPONENT_NAME=${EMBEDDING_COMPONENT_NAME} --set MMEI_EMBEDDING_ENDPOINT=${EMBEDDING_ENDPOINT} --set MULTIMODAL_EMBEDDING=true

# Use prediction guard as the backend
# export EMBEDDING_COMPONENT_NAME="OPEA_PREDICTIONGUARD_EMBEDDING"
# export API_KEY=<your prediction guard api key>
# helm install embedding-usvc . --set EMBEDDING_COMPONENT_NAME=${EMBEDDING_COMPONENT_NAME} --set PREDICTIONGUARD_API_KEY=${API_KEY}

```console
cd GenAIInfra/helm-charts/common/embedding-usvc
helm dependency update
helm install embedding-usvc . --set tei.enabled=true
```

## Verify
@@ -36,17 +48,24 @@ Then run the command `kubectl port-forward svc/embedding-usvc 6000:6000` to expose the embedding service for access.
Open another terminal and run the following command to verify the service is working:

```console
# Verify with tei or prediction guard backend:
curl http://localhost:6000/v1/embeddings \
-X POST \
-H 'Content-Type: application/json' \
-d '{"input":"What is Deep Learning?"}'

# Verify with multimodal embedding bridgetower backend:
curl http://localhost:6000/v1/embeddings \
-X POST \
-d '{"text":"hello"}' \
-H 'Content-Type: application/json'
-H 'Content-Type: application/json' \
-d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}'
```
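The request bodies for the different backends can also be composed programmatically. A minimal Python sketch that only builds (does not send) the payloads — the endpoint path and field names are taken from the curl examples above; the base URL assumes the port-forward command from this README:

```python
import json

BASE_URL = "http://localhost:6000"  # assumed port-forward target

def text_embedding_request(text):
    """Payload for the tei / prediction guard backends, per the curl example."""
    return "/v1/embeddings", {"input": text}

def multimodal_embedding_request(text, image_url):
    """Payload for the multimodal bridgetower backend, per the curl example."""
    return "/v1/embeddings", {
        "text": {"text": text},
        "image": {"url": image_url},
    }

path, payload = text_embedding_request("What is Deep Learning?")
print(BASE_URL + path, json.dumps(payload))
```

These helpers only illustrate the payload shapes; sending them (e.g. with `requests.post`) is left out since it requires a running deployment.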

## Values

| Key | Type | Default | Description |
| ---------------------- | ------ | ---------------------- | ----------- |
| image.repository | string | `"opea/embedding-tei"` | |
| service.port | string | `"6000"` | |
| TEI_EMBEDDING_ENDPOINT | string | `""` | |
| global.monitoring | bool | `false` | |
| Key | Type | Default | Description |
| ------------------------ | ------ | ---------------------- | -------------------------- |
| image.repository | string | `"opea/embedding-tei"` | |
| service.port | string | `"6000"` | |
| EMBEDDING_COMPONENT_NAME | string | `"OPEA_TEI_EMBEDDING"` | backend service to talk to |
| global.monitoring | bool | `false` | |
14 changes: 14 additions & 0 deletions helm-charts/common/embedding-usvc/ci-multimodal-values.yaml
@@ -0,0 +1,14 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Default values for embedding-usvc.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

tei:
enabled: false
mm-embedding:
enabled: true

MULTIMODAL_EMBEDDING: true
EMBEDDING_COMPONENT_NAME: "OPEA_MULTIMODAL_EMBEDDING_BRIDGETOWER"
5 changes: 5 additions & 0 deletions helm-charts/common/embedding-usvc/ci-values.yaml
@@ -7,3 +7,8 @@

tei:
enabled: true
mm-embedding:
enabled: false

MULTIMODAL_EMBEDDING: false
EMBEDDING_COMPONENT_NAME: "OPEA_TEI_EMBEDDING"
18 changes: 16 additions & 2 deletions helm-charts/common/embedding-usvc/templates/configmap.yaml
@@ -8,15 +8,29 @@ metadata:
labels:
{{- include "embedding-usvc.labels" . | nindent 4 }}
data:
MULTIMODAL_EMBEDDING: {{ .Values.MULTIMODAL_EMBEDDING | quote }}
EMBEDDING_COMPONENT_NAME: {{ .Values.EMBEDDING_COMPONENT_NAME | quote }}
{{- if eq .Values.EMBEDDING_COMPONENT_NAME "OPEA_TEI_EMBEDDING" }}
{{- if .Values.TEI_EMBEDDING_ENDPOINT }}
TEI_EMBEDDING_ENDPOINT: {{ .Values.TEI_EMBEDDING_ENDPOINT | quote }}
TEI_EMBEDDING_ENDPOINT: {{ tpl .Values.TEI_EMBEDDING_ENDPOINT . | quote }}
{{- else }}
TEI_EMBEDDING_ENDPOINT: "http://{{ .Release.Name }}-tei"
{{- end }}
{{- else if eq .Values.EMBEDDING_COMPONENT_NAME "OPEA_PREDICTIONGUARD_EMBEDDING" }}
PG_EMBEDDING_MODEL_NAME: {{ .Values.PG_EMBEDDING_MODEL_NAME | quote }}
PREDICTIONGUARD_API_KEY: {{ .Values.PREDICTIONGUARD_API_KEY | quote }}
{{- else if eq .Values.EMBEDDING_COMPONENT_NAME "OPEA_MULTIMODAL_EMBEDDING_BRIDGETOWER" }}
MULTIMODAL_EMBEDDING: "true"
{{- if .Values.MMEI_EMBEDDING_ENDPOINT }}
MMEI_EMBEDDING_ENDPOINT: {{ tpl .Values.MMEI_EMBEDDING_ENDPOINT . | quote }}
{{- else }}
MMEI_EMBEDDING_ENDPOINT: "http://{{ .Release.Name }}-mm-embedding"
{{- end }}
{{- end }}
http_proxy: {{ .Values.global.http_proxy | quote }}
https_proxy: {{ .Values.global.https_proxy | quote }}
{{- if and (not .Values.TEI_EMBEDDING_ENDPOINT) (or .Values.global.http_proxy .Values.global.https_proxy) }}
no_proxy: "{{ .Release.Name }}-tei,{{ .Values.global.no_proxy }}"
no_proxy: "{{ .Release.Name }}-tei,{{ .Release.Name }}-mm-embedding,{{ .Values.global.no_proxy }}"
{{- else }}
no_proxy: {{ .Values.global.no_proxy | quote }}
{{- end }}
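The backend-specific branching in this configmap template can be summarized in plain Python — a sketch of how the rendered data keys derive from the chart values (key names mirror the template above; this is an illustration, not chart code):

```python
def configmap_data(values, release_name):
    """Mimic the Helm template branches for the backend-specific keys."""
    name = values.get("EMBEDDING_COMPONENT_NAME", "OPEA_TEI_EMBEDDING")
    data = {"EMBEDDING_COMPONENT_NAME": name}
    if name == "OPEA_TEI_EMBEDDING":
        # Default to the in-release tei service when no endpoint is given.
        data["TEI_EMBEDDING_ENDPOINT"] = (
            values.get("TEI_EMBEDDING_ENDPOINT") or f"http://{release_name}-tei"
        )
    elif name == "OPEA_PREDICTIONGUARD_EMBEDDING":
        data["PG_EMBEDDING_MODEL_NAME"] = values.get("PG_EMBEDDING_MODEL_NAME", "")
        data["PREDICTIONGUARD_API_KEY"] = values.get("PREDICTIONGUARD_API_KEY", "")
    elif name == "OPEA_MULTIMODAL_EMBEDDING_BRIDGETOWER":
        data["MULTIMODAL_EMBEDDING"] = "true"
        # Default to the in-release mm-embedding service when no endpoint is given.
        data["MMEI_EMBEDDING_ENDPOINT"] = (
            values.get("MMEI_EMBEDDING_ENDPOINT")
            or f"http://{release_name}-mm-embedding"
        )
    return data

print(configmap_data({}, "demo"))
```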
@@ -29,7 +29,7 @@ spec:
securityContext:
{{- toYaml .Values.podSecurityContext | nindent 8 }}
containers:
- name: {{ .Release.Name }}
- name: {{ .Chart.Name }}
envFrom:
- configMapRef:
name: {{ include "embedding-usvc.fullname" . }}-config
@@ -21,7 +21,11 @@ spec:
for ((i=1; i<=max_retry; i++)); do
curl http://{{ include "embedding-usvc.fullname" . }}:{{ .Values.service.port }}/v1/embeddings -sS --fail-with-body \
-X POST \
-d '{"text":"hello"}' \
{{- if eq .Values.EMBEDDING_COMPONENT_NAME "OPEA_MULTIMODAL_EMBEDDING_BRIDGETOWER" }}
-d '{"text": {"text" : "This is some sample text."}, "image" : {"url": "https://github.com/docarray/docarray/blob/main/tests/toydata/image-data/apple.png?raw=true"}}' \
{{- else }}
-d '{"input":"What is Deep Learning?"}' \
{{- end }}
-H 'Content-Type: application/json' && break;
curlcode=$?
if [[ $curlcode -eq 7 ]]; then sleep 10; else echo "curl failed with code $curlcode"; exit 1; fi;
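The test job's retry loop can be sketched in plain Python — probe until the service answers, treating curl exit code 7 (connection refused) as "not ready yet" and anything else as a hard failure (the probe function here is a stand-in):

```python
import time

def wait_until_ready(probe, max_retry=20, delay=10):
    """Retry `probe` until it succeeds, mirroring the curl loop above.

    `probe` returns 0 on success, 7 for connection-refused (service not
    up yet, so pause and retry); any other code aborts immediately.
    """
    for _ in range(max_retry):
        code = probe()
        if code == 0:
            return True
        if code == 7:          # connection refused: back off and retry
            time.sleep(delay)
        else:                  # any other curl error is a real failure
            raise RuntimeError(f"probe failed with code {code}")
    return False               # service never became ready
</antml>```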
53 changes: 30 additions & 23 deletions helm-charts/common/embedding-usvc/values.yaml
@@ -5,18 +5,30 @@
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

tei:
enabled: false

replicaCount: 1

# Set it as a non-null string, such as true, if you want to enable logging facility,
# otherwise, keep it as "" to disable it.
# Configurations for OPEA microservice embedding-usvc
# Set it as a non-null string, such as true, if you want to enable logging facility.
LOGFLAG: ""

# embedding needs to talk to one of several backend services: tei, multimodal-bridgetower, predictionGuard
# Default is to use tei (text-embedding-inference) as the backend
MULTIMODAL_EMBEDDING: false
EMBEDDING_COMPONENT_NAME: "OPEA_TEI_EMBEDDING"
TEI_EMBEDDING_ENDPOINT: ""

# Uncomment and set the following settings to use predictionGuard as the backend
# EMBEDDING_COMPONENT_NAME: "OPEA_PREDICTIONGUARD_EMBEDDING"
# PG_EMBEDDING_MODEL_NAME: "bridgetower-large-itm-mlm-itc"
# PREDICTIONGUARD_API_KEY: ""

# Uncomment and set the following settings to use embedding-multimodal-bridgetower as the backend
# MULTIMODAL_EMBEDDING: true
# EMBEDDING_COMPONENT_NAME: "OPEA_MULTIMODAL_EMBEDDING_BRIDGETOWER"
# MMEI_EMBEDDING_ENDPOINT: ""

replicaCount: 1

image:
repository: opea/embedding-tei
repository: opea/embedding
# Uncomment the following line to set desired image pull policy if needed, as one of Always, IfNotPresent, Never.
# pullPolicy: ""
# Overrides the image tag whose default is the chart appVersion.
@@ -58,25 +70,14 @@ service:
# The default port for embedding service is 9000
port: 6000

resources: {}
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
resources:
# limits:
# cpu: 100m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
requests:
cpu: 100m
memory: 128Mi

livenessProbe:
httpGet:
path: v1/health_check
port: embedding-usvc
initialDelaySeconds: 5
periodSeconds: 5
failureThreshold: 24
readinessProbe:
httpGet:
path: v1/health_check
@@ -111,3 +112,9 @@ global:

# Prometheus Helm install release name for serviceMonitor
prometheusRelease: prometheus-stack

# The following is for CI tests only
tei:
enabled: false
mm-embedding:
enabled: false
25 changes: 25 additions & 0 deletions helm-charts/common/mm-embedding/.helmignore
@@ -0,0 +1,25 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
# CI values
ci*-values.yaml
9 changes: 9 additions & 0 deletions helm-charts/common/mm-embedding/Chart.yaml
@@ -0,0 +1,9 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
appVersion: "1.1"
description: A Helm chart for deploying OPEA multimodal embedding microservices
name: mm-embedding
type: application
version: 0-latest
58 changes: 58 additions & 0 deletions helm-charts/common/mm-embedding/README.md
@@ -0,0 +1,58 @@
# OPEA mm-embedding microservice

Helm chart for deploying OPEA multimodal embedding service.

## Installing the Chart

To install the chart, run the following:

```console
cd GenAIInfra/helm-charts/common
export MODELDIR=/mnt/opea-models
export HFTOKEN="insert-your-huggingface-token-here"
# To deploy embedding-multimodal-bridgetower microservice on CPU
helm install mm-embedding mm-embedding --set global.modelUseHostPath=${MODELDIR} --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN}
# To deploy embedding-multimodal-bridgetower microservice on Gaudi
# helm install mm-embedding mm-embedding --set global.modelUseHostPath=${MODELDIR} --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --values mm-embedding/gaudi-values.yaml
# To deploy embedding-multimodal-clip microservice on CPU
helm install mm-embedding mm-embedding --set global.modelUseHostPath=${MODELDIR} --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} --values mm-embedding/variant_clip-values.yaml
```

By default, the embedding-multimodal-bridgetower service will download the "BridgeTower/bridgetower-large-itm-mlm-itc" model, which is about 3.5GB, and the embedding-multimodal-clip service will download the "openai/clip-vit-base-patch32" model, which is about 1.7GB.

If you have already cached the model locally, you can pass it to the container, for example:

MODELDIR=/mnt/opea-models

MODELNAME="/data/models--BridgeTower--bridgetower-large-itm-mlm-itc"

## Verify

To verify the installation, run the command `kubectl get pod` to make sure all pods are running and in ready state.

Then run the command `kubectl port-forward svc/mm-embedding 6990:6990` to expose the mm-embedding service for access.

Open another terminal and run the following command to verify the service is working:

```console
# Verify with embedding-multimodal-bridgetower
curl http://localhost:6990/v1/encode \
-XPOST \
-d '{"text":"This is example"}' \
-H 'Content-Type: application/json'

# Verify with embedding-multimodal-clip
curl http://localhost:6990/v1/embeddings \
-XPOST \
-d '{"text":"This is example"}' \
-H 'Content-Type: application/json'
```
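As the two curl commands show, the bridgetower and clip variants expose different paths but accept the same minimal payload. A small Python sketch building the request URL and body — the paths come from the curl examples above, and the host/port assume the `kubectl port-forward` command:

```python
import json

# Paths taken from the curl examples; host/port assume the port-forward above.
ENDPOINTS = {
    "bridgetower": "/v1/encode",
    "clip": "/v1/embeddings",
}

def encode_request(variant, text):
    """Return (url, json_body) for the given mm-embedding variant."""
    return "http://localhost:6990" + ENDPOINTS[variant], json.dumps({"text": text})

url, body = encode_request("bridgetower", "This is example")
print(url, body)
```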

## Values

| Key | Type | Default | Description |
| ------------------------------- | ------ | ------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| global.HUGGINGFACEHUB_API_TOKEN | string | `insert-your-huggingface-token-here` | Hugging Face API token |
| global.modelUseHostPath | string | `""` | Cached models directory; the service will not download the model if it is already cached here. The host path "modelUseHostPath" will be mounted into the container as the /data directory. Setting this to null/empty will force the model to be downloaded. |
| autoscaling.enabled | bool | `false` | Enable HPA autoscaling for the service deployment based on metrics it provides. See [HPA instructions](../../HPA.md) before enabling! |
| global.monitoring | bool | `false` | Enable usage metrics for the service. Required for HPA. See [monitoring instructions](../../monitoring.md) before enabling! |
1 change: 1 addition & 0 deletions helm-charts/common/mm-embedding/ci-clip-values.yaml
1 change: 1 addition & 0 deletions helm-charts/common/mm-embedding/ci-values.yaml
22 changes: 22 additions & 0 deletions helm-charts/common/mm-embedding/gaudi-values.yaml
@@ -0,0 +1,22 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

accelDevice: "gaudi"

image:
repository: opea/embedding-multimodal-bridgetower-gaudi
tag: "latest"

resources:
limits:
habana.ai/gaudi: 1

readinessProbe:
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 1
startupProbe:
initialDelaySeconds: 5
periodSeconds: 5
timeoutSeconds: 1
failureThreshold: 120
