diff --git a/comps/agent/deployment/kubernetes/README.md b/comps/agent/deployment/kubernetes/README.md
index e69de29bb2..158ee40818 100644
--- a/comps/agent/deployment/kubernetes/README.md
+++ b/comps/agent/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy Agent microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install agent oci://ghcr.io/opea-project/charts/agent --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-values.yaml
+```
diff --git a/comps/agent/deployment/kubernetes/gaudi-values.yaml b/comps/agent/deployment/kubernetes/gaudi-values.yaml
new file mode 100644
index 0000000000..91ef5d1026
--- /dev/null
+++ b/comps/agent/deployment/kubernetes/gaudi-values.yaml
@@ -0,0 +1,38 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+# Accelerate inferencing in heaviest components to improve performance
+# by overriding their subchart values
+
+tgi:
+  enabled: true
+  accelDevice: "gaudi"
+  image:
+    repository: ghcr.io/huggingface/tgi-gaudi
+    tag: "2.0.6"
+  resources:
+    limits:
+      habana.ai/gaudi: 4
+  MAX_INPUT_LENGTH: "4096"
+  MAX_TOTAL_TOKENS: "8192"
+  CUDA_GRAPHS: ""
+  OMPI_MCA_btl_vader_single_copy_mechanism: "none"
+  PT_HPU_ENABLE_LAZY_COLLECTIVES: "true"
+  ENABLE_HPU_GRAPH: "true"
+  LIMIT_HPU_GRAPH: "true"
+  USE_FLASH_ATTENTION: "true"
+  FLASH_ATTENTION_RECOMPUTE: "true"
+  extraCmdArgs: ["--sharded","true","--num-shard","4"]
+  livenessProbe:
+    initialDelaySeconds: 5
+    periodSeconds: 5
+    timeoutSeconds: 1
+  readinessProbe:
+    initialDelaySeconds: 5
+    periodSeconds: 5
+    timeoutSeconds: 1
+  startupProbe:
+    initialDelaySeconds: 5
+    periodSeconds: 5
+    timeoutSeconds: 1
+    failureThreshold: 120
diff --git a/comps/asr/deployment/kubernetes/README.md b/comps/asr/deployment/kubernetes/README.md
index e69de29bb2..54f5676832 100644
--- a/comps/asr/deployment/kubernetes/README.md
+++ b/comps/asr/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy ASR microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install asr oci://ghcr.io/opea-project/charts/asr --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/asr/deployment/kubernetes/cpu-values.yaml b/comps/asr/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..221ea994d5
--- /dev/null
+++ b/comps/asr/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+whisper:
+  enabled: true
diff --git a/comps/chathistory/deployment/kubernetes/README.md b/comps/chathistory/deployment/kubernetes/README.md
index e69de29bb2..cb105bb7db 100644
--- a/comps/chathistory/deployment/kubernetes/README.md
+++ b/comps/chathistory/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy chathistory microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install chathistory-usvc oci://ghcr.io/opea-project/charts/chathistory-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/chathistory/deployment/kubernetes/cpu-values.yaml b/comps/chathistory/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..7850c0ee9d
--- /dev/null
+++ b/comps/chathistory/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+mongodb:
+  enabled: true
diff --git a/comps/dataprep/deployment/kubernetes/README.md b/comps/dataprep/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..fc9d9ab0bf
--- /dev/null
+++ b/comps/dataprep/deployment/kubernetes/README.md
@@ -0,0 +1,18 @@
+# Deploy dataprep microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes with redis VectorDB
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install data-prep oci://ghcr.io/opea-project/charts/data-prep --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f redis-values.yaml
+```
+
+## Deploy on Kubernetes with milvus VectorDB
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install data-prep oci://ghcr.io/opea-project/charts/data-prep --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f milvus-values.yaml
+```
diff --git a/comps/dataprep/deployment/kubernetes/milvus-values.yaml b/comps/dataprep/deployment/kubernetes/milvus-values.yaml
new file mode 100644
index 0000000000..e2bc6c243f
--- /dev/null
+++ b/comps/dataprep/deployment/kubernetes/milvus-values.yaml
@@ -0,0 +1,30 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+milvus:
+  enabled: true
+  cluster:
+    enabled: false
+  etcd:
+    replicaCount: 1
+  pulsar:
+    enabled: false
+  minio:
+    mode: standalone
+redis-vector-db:
+  enabled: false
+tei:
+  enabled: true
+
+image:
+  repository: opea/dataprep-milvus
+
+port: 6010
+# text embedding inference service URL, e.g. http://:
+#TEI_EMBEDDING_ENDPOINT: "http://embedding-tei:80"
+# milvus DB configurations
+#MILVUS_HOST: "milvustest"
+MILVUS_PORT: "19530"
+COLLECTION_NAME: "rag_milvus"
+MOSEC_EMBEDDING_ENDPOINT: ""
+MOSEC_EMBEDDING_MODEL: ""
diff --git a/comps/dataprep/deployment/kubernetes/redis-values.yaml b/comps/dataprep/deployment/kubernetes/redis-values.yaml
new file mode 100644
index 0000000000..54853db043
--- /dev/null
+++ b/comps/dataprep/deployment/kubernetes/redis-values.yaml
@@ -0,0 +1,9 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+tei:
+  enabled: true
+redis-vector-db:
+  enabled: true
+milvus:
+  enabled: false
diff --git a/comps/embeddings/deployment/kubernetes/README.md b/comps/embeddings/deployment/kubernetes/README.md
index e69de29bb2..567987a983 100644
--- a/comps/embeddings/deployment/kubernetes/README.md
+++ b/comps/embeddings/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy Embedding microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install embedding-usvc oci://ghcr.io/opea-project/charts/embedding-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/embeddings/deployment/kubernetes/cpu-values.yaml b/comps/embeddings/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..e2d62ff26f
--- /dev/null
+++ b/comps/embeddings/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+tei:
+  enabled: true
diff --git a/comps/guardrails/deployment/kubernetes/README.md b/comps/guardrails/deployment/kubernetes/README.md
index e69de29bb2..b309900a07 100644
--- a/comps/guardrails/deployment/kubernetes/README.md
+++ b/comps/guardrails/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy guardrails microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install guardrails oci://ghcr.io/opea-project/charts/guardrails --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/guardrails/deployment/kubernetes/cpu-values.yaml b/comps/guardrails/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..346a39496e
--- /dev/null
+++ b/comps/guardrails/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+tgi-guardrails:
+  enabled: true
diff --git a/comps/llms/deployment/kubernetes/README.md b/comps/llms/deployment/kubernetes/README.md
index e69de29bb2..3c2ee474ba 100644
--- a/comps/llms/deployment/kubernetes/README.md
+++ b/comps/llms/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy LLM microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install llm oci://ghcr.io/opea-project/charts/llm-uservice --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/llms/deployment/kubernetes/cpu-values.yaml b/comps/llms/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..a879a49505
--- /dev/null
+++ b/comps/llms/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,9 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+tgi:
+  enabled: true
+resources:
+  requests:
+    cpu: 100m
+    memory: 128Mi
diff --git a/comps/lvms/deployment/kubernetes/README.md b/comps/lvms/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..f8c26af8d5
--- /dev/null
+++ b/comps/lvms/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy LVM microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install lvm oci://ghcr.io/opea-project/charts/lvm-uservice --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/lvms/deployment/kubernetes/cpu-values.yaml b/comps/lvms/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..3de5b26fce
--- /dev/null
+++ b/comps/lvms/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+tgi:
+  enabled: true
diff --git a/comps/prompt_registry/deployment/kubernetes/README.md b/comps/prompt_registry/deployment/kubernetes/README.md
index e69de29bb2..387197ea76 100644
--- a/comps/prompt_registry/deployment/kubernetes/README.md
+++ b/comps/prompt_registry/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy prompt microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install prompt-usvc oci://ghcr.io/opea-project/charts/prompt-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/prompt_registry/deployment/kubernetes/cpu-values.yaml b/comps/prompt_registry/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..7850c0ee9d
--- /dev/null
+++ b/comps/prompt_registry/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+mongodb:
+  enabled: true
diff --git a/comps/rerankings/deployment/kubernetes/README.md b/comps/rerankings/deployment/kubernetes/README.md
index e69de29bb2..23bf0ef425 100644
--- a/comps/rerankings/deployment/kubernetes/README.md
+++ b/comps/rerankings/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy reranking microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install reranking-usvc oci://ghcr.io/opea-project/charts/reranking-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/rerankings/deployment/kubernetes/cpu-values.yaml b/comps/rerankings/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..f16bb56416
--- /dev/null
+++ b/comps/rerankings/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+teirerank:
+  enabled: true
diff --git a/comps/retrievers/deployment/kubernetes/README.md b/comps/retrievers/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..141d49f05a
--- /dev/null
+++ b/comps/retrievers/deployment/kubernetes/README.md
@@ -0,0 +1,18 @@
+# Deploy retriever microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes with redis vector DB
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install retriever-usvc oci://ghcr.io/opea-project/charts/retriever-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f redis-values.yaml
+```
+
+## Deploy on Kubernetes with milvus vector DB
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install retriever-usvc oci://ghcr.io/opea-project/charts/retriever-usvc --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f milvus-values.yaml
+```
diff --git a/comps/retrievers/deployment/kubernetes/milvus-values.yaml b/comps/retrievers/deployment/kubernetes/milvus-values.yaml
new file mode 100644
index 0000000000..c186b4be2c
--- /dev/null
+++ b/comps/retrievers/deployment/kubernetes/milvus-values.yaml
@@ -0,0 +1,33 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+# Default values for retriever-usvc.
+# This is a YAML-formatted file.
+# Declare variables to be passed into your templates.
+
+milvus:
+  enabled: true
+  cluster:
+    enabled: false
+  etcd:
+    replicaCount: 1
+  pulsar:
+    enabled: false
+  minio:
+    mode: standalone
+redis-vector-db:
+  enabled: false
+tei:
+  enabled: true
+
+image:
+  repository: opea/retriever-milvus
+port: 7000
+# text embedding inference service URL, e.g. http://:
+#TEI_EMBEDDING_ENDPOINT: "http://dataprep-tei:80"
+# milvus DB configurations
+#MILVUS_HOST: "dataprep-milvus"
+MILVUS_PORT: "19530"
+COLLECTION_NAME: "rag_milvus"
+MOSEC_EMBEDDING_ENDPOINT: ""
+MOSEC_EMBEDDING_MODEL: ""
diff --git a/comps/retrievers/deployment/kubernetes/redis-values.yaml b/comps/retrievers/deployment/kubernetes/redis-values.yaml
new file mode 100644
index 0000000000..cbc29c7eeb
--- /dev/null
+++ b/comps/retrievers/deployment/kubernetes/redis-values.yaml
@@ -0,0 +1,13 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+# Default values for retriever-usvc.
+# This is a YAML-formatted file.
+# Declare variables to be passed into your templates.
+
+tei:
+  enabled: true
+redis-vector-db:
+  enabled: true
+milvus:
+  enabled: false
diff --git a/comps/third_parties/gpt-sovits/deployment/kubernetes/README.md b/comps/third_parties/gpt-sovits/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..3a9f77f86e
--- /dev/null
+++ b/comps/third_parties/gpt-sovits/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy gpt-sovits on kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Xeon
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install gpt-sovits oci://ghcr.io/opea-project/charts/gpt-sovits --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/third_parties/gpt-sovits/deployment/kubernetes/cpu-values.yaml b/comps/third_parties/gpt-sovits/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..087e8b3346
--- /dev/null
+++ b/comps/third_parties/gpt-sovits/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  repository: opea/gpt-sovits
diff --git a/comps/third_parties/mongodb/deployment/kubernetes/README.md b/comps/third_parties/mongodb/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..a9c5db7d1e
--- /dev/null
+++ b/comps/third_parties/mongodb/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy MongoDB on kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Xeon
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install mongodb oci://ghcr.io/opea-project/charts/mongodb --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/third_parties/mongodb/deployment/kubernetes/cpu-values.yaml b/comps/third_parties/mongodb/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..4d81053189
--- /dev/null
+++ b/comps/third_parties/mongodb/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,4 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+image:
+  repository: mongo
diff --git a/comps/third_parties/nginx/deployment/kubernetes/README.md b/comps/third_parties/nginx/deployment/kubernetes/README.md
index e69de29bb2..a96d744db8 100644
--- a/comps/third_parties/nginx/deployment/kubernetes/README.md
+++ b/comps/third_parties/nginx/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy nginx on kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Xeon
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install nginx oci://ghcr.io/opea-project/charts/nginx --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/third_parties/nginx/deployment/kubernetes/cpu-values.yaml b/comps/third_parties/nginx/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..98e8182d2c
--- /dev/null
+++ b/comps/third_parties/nginx/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  repository: opea/nginx
diff --git a/comps/third_parties/redis/deployment/kubernetes/README.md b/comps/third_parties/redis/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..ab8cdc06c4
--- /dev/null
+++ b/comps/third_parties/redis/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy RedisDB on kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Xeon
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install redis-vector-db oci://ghcr.io/opea-project/charts/redis-vector-db --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/third_parties/redis/deployment/kubernetes/cpu-values.yaml b/comps/third_parties/redis/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..415b0aee8b
--- /dev/null
+++ b/comps/third_parties/redis/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  repository: redis/redis-stack
diff --git a/comps/third_parties/speecht5/deployment/kubernetes/README.md b/comps/third_parties/speecht5/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..e0f18a3f7d
--- /dev/null
+++ b/comps/third_parties/speecht5/deployment/kubernetes/README.md
@@ -0,0 +1,18 @@
+# Deploy speecht5 on kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Xeon
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install speecht5 oci://ghcr.io/opea-project/charts/speecht5 --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
+
+## Deploy on Gaudi
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install speecht5 oci://ghcr.io/opea-project/charts/speecht5 --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-values.yaml
+```
diff --git a/comps/third_parties/speecht5/deployment/kubernetes/cpu-values.yaml b/comps/third_parties/speecht5/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..56e0cd0cdc
--- /dev/null
+++ b/comps/third_parties/speecht5/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  repository: opea/speecht5
diff --git a/comps/third_parties/speecht5/deployment/kubernetes/gaudi-values.yaml b/comps/third_parties/speecht5/deployment/kubernetes/gaudi-values.yaml
new file mode 100644
index 0000000000..c7e5295bd9
--- /dev/null
+++ b/comps/third_parties/speecht5/deployment/kubernetes/gaudi-values.yaml
@@ -0,0 +1,8 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  repository: opea/speecht5-gaudi
+resources:
+  limits:
+    habana.ai/gaudi: 1
diff --git a/comps/third_parties/tei/deployment/kubernetes/README.md b/comps/third_parties/tei/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..1650330214
--- /dev/null
+++ b/comps/third_parties/tei/deployment/kubernetes/README.md
@@ -0,0 +1,18 @@
+# Deploy TEI on kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Xeon
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install tei oci://ghcr.io/opea-project/charts/tei --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
+
+## Deploy on Gaudi
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install tei oci://ghcr.io/opea-project/charts/tei --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-values.yaml
+```
diff --git a/comps/third_parties/tei/deployment/kubernetes/cpu-values.yaml b/comps/third_parties/tei/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..5eaa0d2744
--- /dev/null
+++ b/comps/third_parties/tei/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  repository: ghcr.io/huggingface/text-embeddings-inference
diff --git a/comps/third_parties/tei/deployment/kubernetes/gaudi-values.yaml b/comps/third_parties/tei/deployment/kubernetes/gaudi-values.yaml
new file mode 100644
index 0000000000..aa8c36da48
--- /dev/null
+++ b/comps/third_parties/tei/deployment/kubernetes/gaudi-values.yaml
@@ -0,0 +1,22 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+accelDevice: "gaudi"
+
+OMPI_MCA_btl_vader_single_copy_mechanism: "none"
+MAX_WARMUP_SEQUENCE_LENGTH: "512"
+image:
+  repository: ghcr.io/huggingface/tei-gaudi
+  tag: 1.5.0
+
+securityContext:
+  readOnlyRootFilesystem: false
+
+resources:
+  limits:
+    habana.ai/gaudi: 1
+
+livenessProbe:
+  timeoutSeconds: 1
+readinessProbe:
+  timeoutSeconds: 1
diff --git a/comps/third_parties/teirerank/deployment/kubernetes/README.md b/comps/third_parties/teirerank/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..b67de89cb0
--- /dev/null
+++ b/comps/third_parties/teirerank/deployment/kubernetes/README.md
@@ -0,0 +1,18 @@
+# Deploy TEIRERANK on kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Xeon
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install teirerank oci://ghcr.io/opea-project/charts/teirerank --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
+
+## Deploy on Gaudi
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install teirerank oci://ghcr.io/opea-project/charts/teirerank --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-values.yaml
+```
diff --git a/comps/third_parties/teirerank/deployment/kubernetes/cpu-values.yaml b/comps/third_parties/teirerank/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..5eaa0d2744
--- /dev/null
+++ b/comps/third_parties/teirerank/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  repository: ghcr.io/huggingface/text-embeddings-inference
diff --git a/comps/third_parties/teirerank/deployment/kubernetes/gaudi-values.yaml b/comps/third_parties/teirerank/deployment/kubernetes/gaudi-values.yaml
new file mode 100644
index 0000000000..aa8c36da48
--- /dev/null
+++ b/comps/third_parties/teirerank/deployment/kubernetes/gaudi-values.yaml
@@ -0,0 +1,22 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+accelDevice: "gaudi"
+
+OMPI_MCA_btl_vader_single_copy_mechanism: "none"
+MAX_WARMUP_SEQUENCE_LENGTH: "512"
+image:
+  repository: ghcr.io/huggingface/tei-gaudi
+  tag: 1.5.0
+
+securityContext:
+  readOnlyRootFilesystem: false
+
+resources:
+  limits:
+    habana.ai/gaudi: 1
+
+livenessProbe:
+  timeoutSeconds: 1
+readinessProbe:
+  timeoutSeconds: 1
diff --git a/comps/third_parties/tgi/deployment/kubernetes/README.md b/comps/third_parties/tgi/deployment/kubernetes/README.md
index e69de29bb2..ff37f88ecf 100644
--- a/comps/third_parties/tgi/deployment/kubernetes/README.md
+++ b/comps/third_parties/tgi/deployment/kubernetes/README.md
@@ -0,0 +1,18 @@
+# Deploy TGI on kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Xeon
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install tgi oci://ghcr.io/opea-project/charts/tgi --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
+
+## Deploy on Gaudi
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install tgi oci://ghcr.io/opea-project/charts/tgi --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-values.yaml
+```
diff --git a/comps/third_parties/tgi/deployment/kubernetes/cpu-values.yaml b/comps/third_parties/tgi/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..38297ab3d3
--- /dev/null
+++ b/comps/third_parties/tgi/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,26 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+# Resource requirements for Intel/neural-chat-7b-v3-3 @ 32-bit:
+resources:
+  limits:
+    cpu: 8
+    memory: 70Gi
+  requests:
+    cpu: 6
+    memory: 65Gi
+
+livenessProbe:
+  initialDelaySeconds: 8
+  periodSeconds: 8
+  failureThreshold: 24
+  timeoutSeconds: 4
+readinessProbe:
+  initialDelaySeconds: 16
+  periodSeconds: 8
+  timeoutSeconds: 4
+startupProbe:
+  initialDelaySeconds: 10
+  periodSeconds: 5
+  failureThreshold: 180
+  timeoutSeconds: 2
diff --git a/comps/third_parties/tgi/deployment/kubernetes/gaudi-values.yaml b/comps/third_parties/tgi/deployment/kubernetes/gaudi-values.yaml
new file mode 100644
index 0000000000..8e04769aec
--- /dev/null
+++ b/comps/third_parties/tgi/deployment/kubernetes/gaudi-values.yaml
@@ -0,0 +1,38 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+accelDevice: "gaudi"
+
+image:
+  repository: ghcr.io/huggingface/tgi-gaudi
+  tag: "2.0.6"
+
+MAX_INPUT_LENGTH: "1024"
+MAX_TOTAL_TOKENS: "2048"
+CUDA_GRAPHS: ""
+OMPI_MCA_btl_vader_single_copy_mechanism: "none"
+ENABLE_HPU_GRAPH: "true"
+LIMIT_HPU_GRAPH: "true"
+USE_FLASH_ATTENTION: "true"
+FLASH_ATTENTION_RECOMPUTE: "true"
+
+resources:
+  limits:
+    habana.ai/gaudi: 1
+  requests:
+    cpu: 1
+    memory: 16Gi
+
+livenessProbe:
+  initialDelaySeconds: 5
+  periodSeconds: 5
+  timeoutSeconds: 1
+readinessProbe:
+  initialDelaySeconds: 5
+  periodSeconds: 5
+  timeoutSeconds: 1
+startupProbe:
+  initialDelaySeconds: 5
+  periodSeconds: 5
+  timeoutSeconds: 1
+  failureThreshold: 120
diff --git a/comps/third_parties/vllm/deployment/kubernetes/README.md b/comps/third_parties/vllm/deployment/kubernetes/README.md
index e69de29bb2..18b17d9096 100644
--- a/comps/third_parties/vllm/deployment/kubernetes/README.md
+++ b/comps/third_parties/vllm/deployment/kubernetes/README.md
@@ -0,0 +1,18 @@
+# Deploy vllm on kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Xeon
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install myvllm oci://ghcr.io/opea-project/charts/vllm --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
+
+## Deploy on Gaudi
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install myvllm oci://ghcr.io/opea-project/charts/vllm --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-values.yaml
+```
diff --git a/comps/third_parties/vllm/deployment/kubernetes/cpu-values.yaml b/comps/third_parties/vllm/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..c2e01e4be7
--- /dev/null
+++ b/comps/third_parties/vllm/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  repository: opea/vllm
diff --git a/comps/third_parties/vllm/deployment/kubernetes/gaudi-values.yaml b/comps/third_parties/vllm/deployment/kubernetes/gaudi-values.yaml
new file mode 100644
index 0000000000..e9ddbed829
--- /dev/null
+++ b/comps/third_parties/vllm/deployment/kubernetes/gaudi-values.yaml
@@ -0,0 +1,14 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+accelDevice: "gaudi"
+
+image:
+  repository: opea/vllm-gaudi
+
+# VLLM_CPU_KVCACHE_SPACE: "40"
+OMPI_MCA_btl_vader_single_copy_mechanism: none
+extraCmdArgs: ["--tensor-parallel-size","1","--block-size","128","--max-num-seqs","256","--max-seq_len-to-capture","2048"]
+resources:
+  limits:
+    habana.ai/gaudi: 1
diff --git a/comps/third_parties/whisper/deployment/kubernetes/README.md b/comps/third_parties/whisper/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..3754916482
--- /dev/null
+++ b/comps/third_parties/whisper/deployment/kubernetes/README.md
@@ -0,0 +1,18 @@
+# Deploy whisper on kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Xeon
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install whisper oci://ghcr.io/opea-project/charts/whisper --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
+
+## Deploy on Gaudi
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install whisper oci://ghcr.io/opea-project/charts/whisper --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f gaudi-values.yaml
+```
diff --git a/comps/third_parties/whisper/deployment/kubernetes/cpu-values.yaml b/comps/third_parties/whisper/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..f32f55f00f
--- /dev/null
+++ b/comps/third_parties/whisper/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  repository: opea/whisper
diff --git a/comps/third_parties/whisper/deployment/kubernetes/gaudi-values.yaml b/comps/third_parties/whisper/deployment/kubernetes/gaudi-values.yaml
new file mode 100644
index 0000000000..3ba40c4b8d
--- /dev/null
+++ b/comps/third_parties/whisper/deployment/kubernetes/gaudi-values.yaml
@@ -0,0 +1,9 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  repository: opea/whisper-gaudi
+
+resources:
+  limits:
+    habana.ai/gaudi: 1
diff --git a/comps/tts/deployment/kubernetes/README.md b/comps/tts/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..af1bcb05a3
--- /dev/null
+++ b/comps/tts/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy tts microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install tts oci://ghcr.io/opea-project/charts/tts --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/tts/deployment/kubernetes/cpu-values.yaml b/comps/tts/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..c735ab48ab
--- /dev/null
+++ b/comps/tts/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+speecht5:
+  enabled: true
diff --git a/comps/web_retrievers/deployment/kubernetes/README.md b/comps/web_retrievers/deployment/kubernetes/README.md
new file mode 100644
index 0000000000..c361509fe8
--- /dev/null
+++ b/comps/web_retrievers/deployment/kubernetes/README.md
@@ -0,0 +1,11 @@
+# Deploy web-retriever microservice on Kubernetes cluster
+
+- You should have Helm (version >= 3.15) installed. Refer to the [Helm Installation Guide](https://helm.sh/docs/intro/install/) for more information.
+- For more deployment options, refer to [helm charts README](https://github.com/opea-project/GenAIInfra/tree/main/helm-charts#readme).
+
+## Deploy on Kubernetes
+
+```
+export HFTOKEN="insert-your-huggingface-token-here"
+helm install web-retriever oci://ghcr.io/opea-project/charts/web-retriever --set global.HUGGINGFACEHUB_API_TOKEN=${HFTOKEN} -f cpu-values.yaml
+```
diff --git a/comps/web_retrievers/deployment/kubernetes/cpu-values.yaml b/comps/web_retrievers/deployment/kubernetes/cpu-values.yaml
new file mode 100644
index 0000000000..e2d62ff26f
--- /dev/null
+++ b/comps/web_retrievers/deployment/kubernetes/cpu-values.yaml
@@ -0,0 +1,5 @@
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+tei:
+  enabled: true