Add kubernetes deployment for GenAIComps #1104

Merged: 2 commits, Jan 13, 2025
11 changes: 11 additions & 0 deletions comps/agent/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy Agent microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install agent oci://ghcr.io/opea-project/charts/agent --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-values.yaml
```
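The command above passes `HFTOKEN` straight to `--set`, so running it without editing the export would install the chart with the placeholder string as the token. A minimal guard could look like the sketch below; `check_token` is a hypothetical helper and is not part of the chart or of this PR.

```shell
# check_token: hypothetical helper (not provided by the chart or this PR).
# Succeeds only if the given token is non-empty and is not the
# placeholder string used in the README.
check_token() {
  if [ -z "$1" ] || [ "$1" = "insert-your-huggingface-token-here" ]; then
    echo "HFTOKEN is unset or still the placeholder" >&2
    return 1
  fi
  return 0
}

# Intended use (left as a comment so the sketch has no side effects):
#   check_token "${HFTOKEN:-}" && helm install agent \
#     oci://ghcr.io/opea-project/charts/agent \
#     --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" -f gaudi-values.yaml
```

The same guard would apply unchanged to the other `helm install` commands in this PR, since they all read the token from `HFTOKEN`.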
38 changes: 38 additions & 0 deletions comps/agent/deployment/kubernetes/gaudi-values.yaml
@@ -0,0 +1,38 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Accelerate inferencing in heaviest components to improve performance
# by overriding their subchart values

tgi:
  enabled: true
  accelDevice: "gaudi"
  image:
    repository: ghcr.io/huggingface/tgi-gaudi
    tag: "2.0.6"
  resources:
    limits:
      habana.ai/gaudi: 4
  MAX_INPUT_LENGTH: "4096"
  MAX_TOTAL_TOKENS: "8192"
  CUDA_GRAPHS: ""
  OMPI_MCA_btl_vader_single_copy_mechanism: "none"
  PT_HPU_ENABLE_LAZY_COLLECTIVES: "true"
  ENABLE_HPU_GRAPH: "true"
  LIMIT_HPU_GRAPH: "true"
  USE_FLASH_ATTENTION: "true"
  FLASH_ATTENTION_RECOMPUTE: "true"
  extraCmdArgs: ["--sharded","true","--num-shard","4"]
  livenessProbe:
    initialDelaySeconds: 5
    periodSeconds: 5
    timeoutSeconds: 1
  readinessProbe:
    initialDelaySeconds: 5
    periodSeconds: 5
    timeoutSeconds: 1
  startupProbe:
    initialDelaySeconds: 5
    periodSeconds: 5
    timeoutSeconds: 1
    failureThreshold: 120
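The `startupProbe` values above bound how long the sharded TGI container may take to load the model before Kubernetes restarts it. As a rough sketch, assuming the usual probe semantics (wait `initialDelaySeconds`, then tolerate `failureThreshold` failed checks spaced `periodSeconds` apart), the budget can be worked out as follows. The variable names are illustrative only, and the result is an approximation that ignores probe timeouts.

```shell
# Values copied from the startupProbe section of gaudi-values.yaml.
initial_delay=5
period=5
failure_threshold=120

# Approximate time the container has to become ready before it is restarted.
startup_budget=$(( initial_delay + period * failure_threshold ))
echo "approximate startup budget: ${startup_budget}s"
```

With these values the budget is about ten minutes. If model download or sharding takes longer on a given cluster, raising `failureThreshold` is one way to extend it.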
11 changes: 11 additions & 0 deletions comps/asr/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy ASR microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install asr oci://ghcr.io/opea-project/charts/asr --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
5 changes: 5 additions & 0 deletions comps/asr/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

whisper:
  enabled: true
11 changes: 11 additions & 0 deletions comps/chathistory/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy chathistory microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install chathistory-usvc oci://ghcr.io/opea-project/charts/chathistory-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
5 changes: 5 additions & 0 deletions comps/chathistory/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

mongodb:
  enabled: true
18 changes: 18 additions & 0 deletions comps/dataprep/deployment/kubernetes/README.md
@@ -0,0 +1,18 @@
# Deploy dataprep microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes with redis VectorDB

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install data-prep oci://ghcr.io/opea-project/charts/data-prep --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f redis-values.yaml
```

## Deploy on Kubernetes with milvus VectorDB

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install data-prep oci://ghcr.io/opea-project/charts/data-prep --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f milvus-values.yaml
```
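The two sections above differ only in the values file passed with `-f`. A small sketch of selecting that file from a backend name follows; `values_file_for` is a hypothetical helper introduced here for illustration and is not part of the chart.

```shell
# values_file_for: hypothetical helper mapping a vector DB name to the
# values file shipped in this directory.
values_file_for() {
  case "$1" in
    redis)  echo "redis-values.yaml" ;;
    milvus) echo "milvus-values.yaml" ;;
    *)      echo "unknown vector DB: $1" >&2; return 1 ;;
  esac
}

# Intended use (left as a comment so the sketch has no side effects):
#   helm install data-prep oci://ghcr.io/opea-project/charts/data-prep \
#     --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" \
#     -f "$(values_file_for milvus)"
```

Both variants use the release name `data-prep`, and Helm release names are unique within a namespace, so switching backends in the same namespace means uninstalling the existing release first or choosing a different release name.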
30 changes: 30 additions & 0 deletions comps/dataprep/deployment/kubernetes/milvus-values.yaml
@@ -0,0 +1,30 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

milvus:
  enabled: true
  cluster:
    enabled: false
  etcd:
    replicaCount: 1
  pulsar:
    enabled: false
  minio:
    mode: standalone
redis-vector-db:
  enabled: false
tei:
  enabled: true

image:
  repository: opea/dataprep-milvus

port: 6010
# text embedding inference service URL, e.g. http://<service-name>:<port>
#TEI_EMBEDDING_ENDPOINT: "http://embedding-tei:80"
# milvus DB configurations
#MILVUS_HOST: "milvustest"
MILVUS_PORT: "19530"
COLLECTION_NAME: "rag_milvus"
MOSEC_EMBEDDING_ENDPOINT: ""
MOSEC_EMBEDDING_MODEL: ""
9 changes: 9 additions & 0 deletions comps/dataprep/deployment/kubernetes/redis-values.yaml
@@ -0,0 +1,9 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

tei:
  enabled: true
redis-vector-db:
  enabled: true
milvus:
  enabled: false
11 changes: 11 additions & 0 deletions comps/embeddings/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy Embedding microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install embedding-usvc oci://ghcr.io/opea-project/charts/embedding-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
5 changes: 5 additions & 0 deletions comps/embeddings/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

tei:
  enabled: true
11 changes: 11 additions & 0 deletions comps/guardrails/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy guardrails microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install guardrails oci://ghcr.io/opea-project/charts/guardrails --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
5 changes: 5 additions & 0 deletions comps/guardrails/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

tgi-guardrails:
  enabled: true
11 changes: 11 additions & 0 deletions comps/llms/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy LLM microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install llm oci://ghcr.io/opea-project/charts/llm-uservice --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
9 changes: 9 additions & 0 deletions comps/llms/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,9 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

tgi:
  enabled: true
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
11 changes: 11 additions & 0 deletions comps/lvms/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy LVM microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install lvm oci://ghcr.io/opea-project/charts/lvm-uservice --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
5 changes: 5 additions & 0 deletions comps/lvms/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

tgi:
  enabled: true
11 changes: 11 additions & 0 deletions comps/prompt_registry/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy prompt microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install prompt-usvc oci://ghcr.io/opea-project/charts/prompt-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
5 changes: 5 additions & 0 deletions comps/prompt_registry/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

mongodb:
  enabled: true
11 changes: 11 additions & 0 deletions comps/rerankings/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy reranking microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install reranking-usvc oci://ghcr.io/opea-project/charts/reranking-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
5 changes: 5 additions & 0 deletions comps/rerankings/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

teirerank:
  enabled: true
18 changes: 18 additions & 0 deletions comps/retrievers/deployment/kubernetes/README.md
@@ -0,0 +1,18 @@
# Deploy retriever microservice on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Kubernetes with redis vector DB

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install retriever-usvc oci://ghcr.io/opea-project/charts/retriever-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f redis-values.yaml
```

## Deploy on Kubernetes with milvus vector DB

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install retriever-usvc oci://ghcr.io/opea-project/charts/retriever-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f milvus-values.yaml
```
33 changes: 33 additions & 0 deletions comps/retrievers/deployment/kubernetes/milvus-values.yaml
@@ -0,0 +1,33 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Default values for retriever-usvc.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

milvus:
  enabled: true
  cluster:
    enabled: false
  etcd:
    replicaCount: 1
  pulsar:
    enabled: false
  minio:
    mode: standalone
redis-vector-db:
  enabled: false
tei:
  enabled: true

image:
  repository: opea/retriever-milvus
port: 7000
# text embedding inference service URL, e.g. http://<service-name>:<port>
#TEI_EMBEDDING_ENDPOINT: "http://dataprep-tei:80"
# milvus DB configurations
#MILVUS_HOST: "dataprep-milvus"
MILVUS_PORT: "19530"
COLLECTION_NAME: "rag_milvus"
MOSEC_EMBEDDING_ENDPOINT: ""
MOSEC_EMBEDDING_MODEL: ""
13 changes: 13 additions & 0 deletions comps/retrievers/deployment/kubernetes/redis-values.yaml
@@ -0,0 +1,13 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# Default values for retriever-usvc.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

tei:
  enabled: true
redis-vector-db:
  enabled: true
milvus:
  enabled: false
11 changes: 11 additions & 0 deletions comps/third_parties/gpt-sovits/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy gpt-sovits on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Xeon

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install gpt-sovits oci://ghcr.io/opea-project/charts/gpt-sovits --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

image:
  repository: opea/gpt-sovits
11 changes: 11 additions & 0 deletions comps/third_parties/mongodb/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy MongoDB on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Xeon

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install mongodb oci://ghcr.io/opea-project/charts/mongodb --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
@@ -0,0 +1,4 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
image:
  repository: mongo
11 changes: 11 additions & 0 deletions comps/third_parties/nginx/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy nginx on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Xeon

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install nginx oci://ghcr.io/opea-project/charts/nginx --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

image:
  repository: opea/nginx
11 changes: 11 additions & 0 deletions comps/third_parties/redis/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
# Deploy RedisDB on Kubernetes cluster

- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).

## Deploy on Xeon

```
export HFTOKEN="insert-your-huggingface-token-here"
helm install redis-vector-db oci://ghcr.io/opea-project/charts/redis-vector-db --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
```
@@ -0,0 +1,5 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

image:
  repository: redis/redis-stack