From 1ec71a5a617b540db67747bf8f9f8e527850514a Mon Sep 17 00:00:00 2001
From: alexsin368 <109180236+alexsin368@users.noreply.github.com>
Date: Mon, 18 Nov 2024 18:10:33 -0800
Subject: [PATCH 1/5] Add codegen sample guide for Gaudi deployment (#248)

* adding codegen sample guide for gaudi deployment

Signed-off-by: alexsin368

* update with TAG instead of version number

Signed-off-by: alexsin368

* remove mention of vllm

Signed-off-by: alexsin368

* adding codegen sample guide for gaudi deployment

Signed-off-by: alexsin368

* update with TAG instead of version number

Signed-off-by: alexsin368

* remove mention of vllm

Signed-off-by: alexsin368

* fix typos, add link to ITAC, attempt to fix IO flow diagram link issue

Signed-off-by: alexsin368

* modify and add index.rst files

Signed-off-by: alexsin368

* make text output to not think it's a hyperlink

Signed-off-by: alexsin368

---------

Signed-off-by: alexsin368
Co-authored-by: Ying Hu
---
 examples/CodeGen/deploy/gaudi.md  | 375 ++++++++++++++++++++++++++++++
 examples/CodeGen/deploy/index.rst |  14 ++
 examples/index.rst                |   2 +
 3 files changed, 391 insertions(+)
 create mode 100644 examples/CodeGen/deploy/gaudi.md
 create mode 100644 examples/CodeGen/deploy/index.rst

diff --git a/examples/CodeGen/deploy/gaudi.md b/examples/CodeGen/deploy/gaudi.md
new file mode 100644
index 00000000..d4dfafa9
--- /dev/null
+++ b/examples/CodeGen/deploy/gaudi.md
@@ -0,0 +1,375 @@
# Single node on-prem deployment with TGI on Gaudi AI Accelerator

This section covers the single-node on-prem deployment of the CodeGen
example using OPEA components and the TGI service. It shows how
to build an end-to-end (e2e) CodeGen solution with the CodeLlama-7b-hf model, deployed on Intel®
Tiber™ AI Cloud ([ITAC](https://www.intel.com/content/www/us/en/developer/tools/tiber/ai-cloud.html)).
To learn about OPEA in just 5 minutes and to set up the required hardware and software,
follow the instructions in the [Getting Started](https://opea-project.github.io/latest/getting-started/README.html)
section. If you do not have an ITAC instance, or your hardware is not yet supported on ITAC, you can still run this deployment on-prem.

## Overview

The CodeGen use case uses a single microservice called LLM. In this tutorial, we
will walk through how to enable it from OPEA GenAIComps and deploy it as a
single-node megaservice solution served by TGI.

The solution shows how to use the CodeLlama-7b-hf model on the Intel®
Gaudi® AI Accelerator. We will go through how to set up Docker containers to start
the microservice and megaservice. The solution then takes text input as the
prompt and generates code accordingly. It is deployed with a UI that offers two modes to
choose from:

1. Svelte-Based UI
2. React-Based UI

The React-based UI is optional; it is supported in this example if you
are interested in using it.

Below is the list of content we will be covering in this tutorial:

1. Prerequisites
2. Prepare (Building / Pulling) Docker images
3. Use case setup
4. Deploy the use case
5. Interacting with CodeGen deployment

## Prerequisites

The first step is to clone the GenAIExamples and GenAIComps repositories. GenAIComps
provides the fundamental components used to build the examples you find in
GenAIExamples and to deploy them as microservices. Also set the `TAG`
environment variable to the release version.
```bash
git clone https://github.com/opea-project/GenAIComps.git
git clone https://github.com/opea-project/GenAIExamples.git
export TAG=1.1
```

The examples use model weights from HuggingFace and langchain.

Set up your [HuggingFace](https://huggingface.co/) account and generate a
[user access token](https://huggingface.co/docs/transformers.js/en/guides/private#step-1-generating-a-user-access-token).

Set the HuggingFace token:
```
export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
```

Additionally, if you plan to use the default model CodeLlama-7b-hf, you will
need to [request access](https://huggingface.co/meta-llama/CodeLlama-7b-hf) from HuggingFace.

The example requires the `host_ip` environment variable so that the microservices can be
deployed on an endpoint with exposed ports. Set it as follows:
```
export host_ip=$(hostname -I | awk '{print $1}')
```

Make sure to set up the proxies if you are behind a firewall:
```
export no_proxy=${your_no_proxy},$host_ip
export http_proxy=${your_http_proxy}
export https_proxy=${your_https_proxy}
```

## Prepare (Building / Pulling) Docker images

This step covers building or pulling the relevant Docker images step by step,
with a sanity check at the end. For CodeGen, the following Docker image is
needed: LLM with TGI. Additionally, you will need to build Docker images for the
CodeGen megaservice and the UI (the React UI is optional). In total,
there are **3 required docker images** and one optional docker image.

### Build/Pull Microservice image

::::::{tab-set}

:::::{tab-item} Pull
:sync: Pull

If you decide to pull the Docker images instead of building them locally,
you can proceed to the next step, where all the necessary images will
be pulled from Docker Hub.

:::::
:::::{tab-item} Build
:sync: Build

From within the `GenAIComps` folder, check out the release tag:
```
cd GenAIComps
git checkout tags/v${TAG}
```

#### Build LLM Image

```bash
docker build --no-cache -t opea/llm-tgi:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/text-generation/tgi/Dockerfile .
```

### Build Mega Service images

The megaservice is a pipeline that channels data through different
microservices, each performing a different task. The LLM microservice and the
flow of data are defined in the `codegen.py` file. You can also add or
remove microservices and customize the megaservice to suit your needs.

Build the megaservice image for this use case:

```bash
cd ..
cd GenAIExamples
git checkout tags/v${TAG}
cd CodeGen
```

```bash
docker build --no-cache -t opea/codegen:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
cd ../..
```

### Build the UI Image

You can build either of two UIs.

*Svelte UI*

```bash
cd GenAIExamples/CodeGen/ui/
docker build --no-cache -t opea/codegen-ui:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
cd ../../..
```

*React UI (Optional)*
Build this image if you want a React-based frontend.

```bash
cd GenAIExamples/CodeGen/ui/
docker build --no-cache -t opea/codegen-react-ui:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile.react .
cd ../../..
```

### Sanity Check
Check that you have the following set of Docker images by running the `docker images` command before moving on to the next step.
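If you prefer a scripted check, the one-liner below (a convenience sketch, assuming the default `opea` image prefix and the `TAG` variable exported earlier) filters the `docker images` output down to exactly the images listed next:

```bash
# Show only the CodeGen-related images tagged with the current TAG
docker images --format '{{.Repository}}:{{.Tag}}' \
  | grep -E "^opea/(llm-tgi|codegen|codegen-ui|codegen-react-ui):${TAG}$"
```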
The tags correspond to the value you set for the `TAG` environment variable.

* `opea/llm-tgi:${TAG}`
* `opea/codegen:${TAG}`
* `opea/codegen-ui:${TAG}`
* `opea/codegen-react-ui:${TAG}` (optional)

:::::
::::::

## Use Case Setup

The use case uses the following combination of GenAIComps and tools:

|Use Case Components | Tools | Model | Service Type |
|---------------- |--------------|-----------------------------|-------|
|LLM | TGI | meta-llama/CodeLlama-7b-hf | OPEA Microservice |
|UI | | NA | Gateway Service |

The tools and models mentioned in the table are configurable through either
environment variables or the `compose.yaml` file.

Set the necessary environment variables to set up the use case by running the `set_env.sh` script.
This is where the environment variable `LLM_MODEL_ID` is set; you can change it to another model
by specifying the HuggingFace model card ID.

```bash
cd GenAIExamples/CodeGen/docker_compose/
source ./set_env.sh
cd ../../..
```

## Deploy the Use Case

In this tutorial, we deploy via Docker Compose with the provided
YAML file. Docker Compose starts all the above-mentioned services as
containers.

```bash
cd GenAIExamples/CodeGen/docker_compose/intel/hpu/gaudi
docker compose up -d
```


### Checks to Ensure the Services are Running
#### Check Startup and Env Variables
Check the startup log by running `docker compose logs` to ensure there are no errors.
Warning messages print out the variables that are **NOT** set.

Here are some sample messages if the proxy environment variables are not set:

    WARN[0000] The "no_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "https_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "http_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "no_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "https_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "http_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "no_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "http_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "https_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "no_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "http_proxy" variable is not set. Defaulting to a blank string.
    WARN[0000] The "https_proxy" variable is not set. Defaulting to a blank string.

#### Check the Container Status
Check whether all the containers launched via Docker Compose have started.

The CodeGen example starts 4 docker containers. Check that all of them are
running, i.e., that the `STATUS` of each container is `Up`.
You can check this with the `docker ps -a` command; sample output is shown
after the scripted check below.
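The scripted check is a convenience sketch; it assumes the default container names used by this example's `compose.yaml`, which match the sample listing that follows:

```bash
# Print the name and status of each of the four CodeGen containers
for name in codegen-gaudi-ui-server codegen-gaudi-backend-server \
            llm-tgi-gaudi-server tgi-gaudi-server; do
  docker ps --all --filter "name=${name}" --format '{{.Names}}: {{.Status}}'
done
```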
+ +``` +CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES +bbd235074c3d opea/codegen-ui:${TAG} "docker-entrypoint.s…" About a minute ago Up About a minute 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp codegen-gaudi-ui-server +8d3872ca66fa opea/codegen:${TAG} "python codegen.py" About a minute ago Up About a minute 0.0.0.0:7778->7778/tcp, :::7778->7778/tcp codegen-gaudi-backend-server +b9fc39f51cdb opea/llm-tgi:${TAG} "bash entrypoint.sh" About a minute ago Up About a minute 0.0.0.0:9000->9000/tcp, :::9000->9000/tcp llm-tgi-gaudi-server +39994e007f15 ghcr.io/huggingface/tgi-gaudi:2.0.1 "text-generation-lau…" About a minute ago Up About a minute 0.0.0.0:8028->80/tcp, :::8028->80/tcp tgi-gaudi-server +``` + +## Interacting with CodeGen for Deployment + +This section will walk you through the different ways to interact with +the microservices deployed. After a couple minutes, rerun `docker ps -a` +to ensure all the docker containers are still up and running. Then proceed +to validate each microservice and megaservice. + +### TGI Service + +```bash +curl http://${host_ip}:8028/generate \ + -X POST \ + -d '{"inputs":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","parameters":{"max_new_tokens":256, "do_sample": true}}' \ + -H 'Content-Type: application/json' +``` + +Here is the output: + +``` +{"generated_text":"\n\nIO iflow diagram:\n\n!\[IO flow diagram(s)\]\(TodoList.iflow.svg\)\n\n### TDD Kata walkthrough\n\n1. Start with a user story. We will add story tests later. In this case, we'll choose a story about adding a TODO:\n ```ruby\n as a user,\n i want to add a todo,\n so that i can get a todo list.\n\n conformance:\n - a new todo is added to the list\n - if the todo text is empty, raise an exception\n ```\n\n1. Write the first test:\n ```ruby\n feature Testing the addition of a todo to the list\n\n given a todo list empty list\n when a user adds a todo\n the todo should be added to the list\n\n inputs:\n when_values: [[\"A\"]]\n\n output validations:\n - todo_list contains { text:\"A\" }\n ```\n\n1. Write the first step implementation in any programming language you like. In this case, we will choose Ruby:\n ```ruby\n def add_"} +``` + +### LLM Microservice + +```bash +curl http://${host_ip}:9000/v1/chat/completions\ + -X POST \ + -d '{"query":"Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception.","max_tokens":256,"top_k":10,"top_p":0.95,"typical_p":0.95,"temperature":0.01,"repetition_penalty":1.03,"streaming":true}' \ + -H 'Content-Type: application/json' +``` + +The output is given one character at a time. It is too long to show +here but the last item will be +``` +data: [DONE] +``` + +### MegaService + +```bash +curl http://${host_ip}:7778/v1/codegen -H "Content-Type: application/json" -d '{ + "messages": "Implement a high-level API for a TODO list application. The API takes as input an operation request and updates the TODO list in place. If the request is invalid, raise an exception." + }' +``` + +The output is given one character at a time. It is too long to show +here but the last item will be +``` +data: [DONE] +``` + +## Launch UI +### Svelte UI +To access the frontend, open the following URL in your browser: http://{host_ip}:5173. By default, the UI runs on port 5173 internally. 
If you prefer to use a different host port to access the frontend, modify the port mapping in the `compose.yaml` file as shown below:
```yaml
  codegen-gaudi-ui-server:
    image: ${REGISTRY:-opea}/codegen-ui:${TAG:-latest}
    ...
    ports:
      - "5173:5173"
```

### React-Based UI (Optional)
To access the React-based frontend, modify the UI service in the `compose.yaml` file. Replace the `codegen-gaudi-ui-server` service with the `codegen-gaudi-react-ui-server` service as per the config below:
```yaml
codegen-gaudi-react-ui-server:
  image: ${REGISTRY:-opea}/codegen-react-ui:${TAG:-latest}
  container_name: codegen-gaudi-react-ui-server
  environment:
    - no_proxy=${no_proxy}
    - https_proxy=${https_proxy}
    - http_proxy=${http_proxy}
    - APP_CODE_GEN_URL=${BACKEND_SERVICE_ENDPOINT}
  depends_on:
    - codegen-gaudi-backend-server
  ports:
    - "5174:80"
  ipc: host
  restart: always
```
Once the services are up, open the following URL in your browser: http://{host_ip}:5174. By default, the UI runs on port 80 internally. If you prefer to use a different host port to access the frontend, modify the port mapping in the `compose.yaml` file as shown below:
```yaml
  codegen-gaudi-react-ui-server:
    image: ${REGISTRY:-opea}/codegen-react-ui:${TAG:-latest}
    ...
    ports:
      - "80:80"
```

## Check Docker Container Logs

You can check the log of a container by running this command:

```bash
docker logs -t <CONTAINER_NAME>
```

You can also check the overall logs with the following command, where
`compose.yaml` is the megaservice docker compose configuration file.

Assuming you are still in the `GenAIExamples/CodeGen/docker_compose/intel/hpu/gaudi` directory,
run the following command to check the logs:
```bash
docker compose -f compose.yaml logs
```

View the docker input parameters in `./CodeGen/docker_compose/intel/hpu/gaudi/compose.yaml`:

```yaml
  tgi-service:
    image: ghcr.io/huggingface/tgi-gaudi:2.0.1
    container_name: tgi-gaudi-server
    ports:
      - "8028:80"
    volumes:
      - "./data:/data"
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      HABANA_VISIBLE_DEVICES: all
      OMPI_MCA_btl_vader_single_copy_mechanism: none
      HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
    runtime: habana
    cap_add:
      - SYS_NICE
    ipc: host
    command: --model-id ${LLM_MODEL_ID} --max-input-length 1024 --max-total-tokens 2048
```

The `--model-id` input is `${LLM_MODEL_ID}`. Ensure the environment variable `LLM_MODEL_ID`
is set and spelled correctly. Whenever it is changed, restart the containers to pick up
the newly selected model.


## Stop the services

Once you are done with the entire pipeline and wish to stop and remove all the containers, use the command below:
```
docker compose down
```
diff --git a/examples/CodeGen/deploy/index.rst b/examples/CodeGen/deploy/index.rst
new file mode 100644
index 00000000..ac0a37d0
--- /dev/null
+++ b/examples/CodeGen/deploy/index.rst
@@ -0,0 +1,14 @@
.. _codegen-example-deployment:

CodeGen Example Deployment Options
###################################

Here are some deployment options, depending on your hardware and environment:

Single Node
***********
.. toctree::
   :maxdepth: 1

   Gaudi AI Accelerator <gaudi>
\ No newline at end of file
diff --git a/examples/index.rst b/examples/index.rst
index 6524dc1d..283693f3 100644
--- a/examples/index.rst
+++ b/examples/index.rst
@@ -12,6 +12,8 @@ GenAIExamples are designed to give developers an easy entry into generative AI,
    ChatQnA/deploy/index
    AgentQnA/AgentQnA_Guide
    AgentQnA/deploy/index
+   CodeGen/deploy/gaudi.md
+   CodeGen/deploy/index
 
 ----

From 112db286640b7d08629db6eb5efa7467beddb3db Mon Sep 17 00:00:00 2001
From: Ying Hu
Date: Tue, 19 Nov 2024 11:52:44 +0800
Subject: [PATCH 2/5] Update index.rst for AgentQnA doc refactor (#250)

* Update index.rst for AgentQnA doc refactor

Update index.rst for AgentQnA doc refactor

* Update index.rst for
* Update index.rst for AgentQnA doc refactor
* Update index.rst

fix the rst ref link

* Update index.rst
* Update index.rst
* Update AgentQnA_Guide.rst
* Update AgentQnA_Guide.rst
* Update AgentQnA_Guide.rst
* Update AgentQnA_Guide.rst
* Update AgentQnA_Guide.rst
* Update index.rst
* Update AgentQnA_Guide.rst

fix the format

* Update index.rst
* Update index.rst
* Update AgentQnA_Guide.rst
* Update AgentQnA_Guide.rst
* Update AgentQnA_Guide.rst
* Update examples/AgentQnA/AgentQnA_Guide.rst
* Update conf.py
* Update AgentQnA_Guide.rst
---
 conf.py                              |  2 ++
 examples/AgentQnA/AgentQnA_Guide.rst |  9 ++++++++-
 examples/AgentQnA/deploy/index.rst   |  7 ++-----
 examples/index.rst                   |  1 -
 4 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/conf.py b/conf.py
index 35051272..5ab3c491 100644
--- a/conf.py
+++ b/conf.py
@@ -70,6 +70,8 @@
 # files and directories to ignore when looking for source files.
 exclude_patterns = [
    'scripts/*',
+   'examples/AgentQnA/deploy/index.rst',
+   'examples/AgentQnA/deploy/xeon.md'
 ]
 try:
    import sphinx_rtd_theme
diff --git a/examples/AgentQnA/AgentQnA_Guide.rst b/examples/AgentQnA/AgentQnA_Guide.rst
index c8424071..a4889928 100644
--- a/examples/AgentQnA/AgentQnA_Guide.rst
+++ b/examples/AgentQnA/AgentQnA_Guide.rst
@@ -43,5 +43,12 @@ The worker agent uses the retrieval tool to generate answers to the queries post
 Deployment
 **********
+Here are some deployment options, depending on your hardware and environment:
-See the :ref:`agentqna-example-deployment`.
\ No newline at end of file
+Single Node
++++++++++++++++
+.. toctree::
+   :maxdepth: 1
+
+   Xeon Scalable Processor <deploy/xeon>
+   Gaudi <deploy/gaudi>
diff --git a/examples/AgentQnA/deploy/index.rst b/examples/AgentQnA/deploy/index.rst
index 629d84ae..7ecf527a 100644
--- a/examples/AgentQnA/deploy/index.rst
+++ b/examples/AgentQnA/deploy/index.rst
@@ -6,9 +6,6 @@ AgentQnA Example Deployment Options
 Here are some deployment options, depending on your hardware and environment:
 
 Single Node
-***********
-.. toctree::
-   :maxdepth: 1
-
-   Xeon Scalable Processor
\ No newline at end of file
+- **Xeon Scalable Processor**: `Xeon `_
+- **Gaudi**: `Gaudi `_
diff --git a/examples/index.rst b/examples/index.rst
index 283693f3..d9392887 100644
--- a/examples/index.rst
+++ b/examples/index.rst
@@ -11,7 +11,6 @@ GenAIExamples are designed to give developers an easy entry into generative AI,
    ChatQnA/ChatQnA_Guide
    ChatQnA/deploy/index
    AgentQnA/AgentQnA_Guide
-   AgentQnA/deploy/index
    CodeGen/deploy/gaudi.md
    CodeGen/deploy/index
 
 ----

From b65df98ac6299a697826e7a92cca4916bcee24ff Mon Sep 17 00:00:00 2001
From: Ying Hu
Date: Wed, 20 Nov 2024 09:18:33 +0800
Subject: [PATCH 3/5] Update index.rst for CodeGen Guide structure (#255)

---
 examples/index.rst | 1 -
 1 file changed, 1 deletion(-)

diff --git a/examples/index.rst b/examples/index.rst
index d9392887..5d1e9934 100644
--- a/examples/index.rst
+++ b/examples/index.rst
@@ -11,7 +11,6 @@ GenAIExamples are designed to give developers an easy entry into generative AI,
    ChatQnA/ChatQnA_Guide
    ChatQnA/deploy/index
    AgentQnA/AgentQnA_Guide
-   CodeGen/deploy/gaudi.md
    CodeGen/deploy/index
 
 ----

From 75573a61b4effdda59c53ba43daaaa3f9bd8bbce Mon Sep 17 00:00:00 2001
From: alexsin368 <109180236+alexsin368@users.noreply.github.com>
Date: Wed, 20 Nov 2024 16:07:56 -0800
Subject: [PATCH 4/5] reorganize sample guides, add CodeGen top sample guide (#256)

* reorganize sample guides, add CodeGen top sample guide

Signed-off-by: alexsin368

* include index.rst files in toctrees

Signed-off-by: alexsin368

* correct toctree

Signed-off-by: alexsin368

* fix indent

Signed-off-by: alexsin368

---------

Signed-off-by: alexsin368
---
 examples/ChatQnA/ChatQnA_Guide.rst | 14 +++++-----
 examples/CodeGen/CodeGen_Guide.rst | 45 ++++++++++++++++++++++++++++++
 examples/index.rst                 |  3 +-
 3 files changed, 53 insertions(+), 9 deletions(-)
 create mode 100644 examples/CodeGen/CodeGen_Guide.rst

diff --git a/examples/ChatQnA/ChatQnA_Guide.rst b/examples/ChatQnA/ChatQnA_Guide.rst
index d217f821..7e9b4779 100644
--- a/examples/ChatQnA/ChatQnA_Guide.rst
+++ b/examples/ChatQnA/ChatQnA_Guide.rst
@@ -204,16 +204,16 @@ The gateway serves as the interface for users to access. The gateway routes inco
 Deployment
 **********
-See the :ref:`chatqna-example-deployment` that includes both single-node and
+Here are some deployment options, including both single-node and
 orchestrated multi-node configurations, and choose the one that best fits your
-requirements. Here are quick references to the single-node deployment options:
-
-* :doc:`Xeon Scalable Processor `
-* :doc:`Gaudi AI Accelerator `
-* :doc:`Nvidia GPU `
-* :doc:`AI PC `
+requirements.
+
+.. toctree::
+   :maxdepth: 1
+
+   ChatQnA Deployment Options <deploy/index>
+
+----
 
 Troubleshooting
 ***************
diff --git a/examples/CodeGen/CodeGen_Guide.rst b/examples/CodeGen/CodeGen_Guide.rst
new file mode 100644
index 00000000..dd64bb49
--- /dev/null
+++ b/examples/CodeGen/CodeGen_Guide.rst
@@ -0,0 +1,45 @@
.. _Codegen_Guide:

CodeGen Sample Guide
#####################

.. note:: This guide is in its early development and is a work-in-progress with
   placeholder content.

Overview
********

The CodeGen example uses specialized AI models trained on datasets that
encompass repositories, documentation, programming code, and web data. With an understanding
of various programming languages, coding patterns, and software development concepts, the
CodeGen LLMs assist developers and programmers.
These LLMs can be integrated into developers'
Integrated Development Environments (IDEs), giving the models more contextual awareness
so they can suggest more refined and relevant code.

Purpose
*******
* Code Generation: Streamline coding through Code Generation, enabling non-programmers to describe tasks for code creation.
* Code Completion: Accelerate coding by suggesting contextually relevant snippets as developers type.
* Code Translation and Modernization: Translate and modernize code across multiple programming languages, aiding interoperability and updating legacy projects.
* Code Summarization: Extract key insights from codebases, improving readability and developer productivity.
* Code Refactoring: Offer suggestions for code refactoring, enhancing code performance and efficiency.
* AI-Assisted Testing: Assist in creating test cases, ensuring code robustness and accelerating development cycles.
* Error Detection and Debugging: Detect errors in code and provide detailed descriptions and potential fixes, expediting debugging processes.

How It Works
************

The CodeGen example uses an open-source code generation model with Text Generation Inference (TGI)
for serving deployment. It is presented as a Code Copilot application as shown in the diagram below.

.. figure:: /GenAIExamples/CodeGen/assets/img/codegen_architecture.png
   :alt: CodeGen Architecture Diagram

Deployment
**********
Here are some deployment options, depending on your hardware and environment:

.. toctree::
   :maxdepth: 1

   CodeGen Deployment Options <deploy/index>
diff --git a/examples/index.rst b/examples/index.rst
index 5d1e9934..9273bf8a 100644
--- a/examples/index.rst
+++ b/examples/index.rst
@@ -9,9 +9,8 @@ GenAIExamples are designed to give developers an easy entry into generative AI,
    :maxdepth: 1
 
    ChatQnA/ChatQnA_Guide
-   ChatQnA/deploy/index
    AgentQnA/AgentQnA_Guide
-   CodeGen/deploy/index
+   CodeGen/CodeGen_Guide
 
 ----

From dc92900b7fffcc336ddeddb1142e63992d43b202 Mon Sep 17 00:00:00 2001
From: ZePan110
Date: Thu, 21 Nov 2024 14:55:56 +0800
Subject: [PATCH 5/5] Rename image name XXX-hpu to XXX-gaudi (#254)

* Rename image name XXX-hpu to XXX-gaudi

Signed-off-by: ZePan110

* Rename image name XXX-hpu to XXX-gaudi

Signed-off-by: ZePan110

---------

Signed-off-by: ZePan110
---
 examples/ChatQnA/deploy/gaudi.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/ChatQnA/deploy/gaudi.md b/examples/ChatQnA/deploy/gaudi.md
index 926267fe..a68d0ce4 100644
--- a/examples/ChatQnA/deploy/gaudi.md
+++ b/examples/ChatQnA/deploy/gaudi.md
@@ -416,7 +416,7 @@ CONTAINER ID   IMAGE   COMMAND
 ce4e7802a371   opea/retriever-redis:${TAG}   "python retriever_re…"   About a minute ago   Up About a minute   0.0.0.0:7000->7000/tcp, :::7000->7000/tcp   retriever-redis-server
 be6cd2d0ea38   opea/reranking-tei:${TAG}   "python reranking_te…"   About a minute ago   Up About a minute   0.0.0.0:8000->8000/tcp, :::8000->8000/tcp   reranking-tei-gaudi-server
 cc45ff032e8c   opea/tei-gaudi:${TAG}   "text-embeddings-rou…"   About a minute ago   Up About a minute   0.0.0.0:8090->80/tcp, :::8090->80/tcp   tei-embedding-gaudi-server
-4969ec3aea02   opea/llm-vllm-hpu:${TAG}   "/bin/bash -c 'expor…"   About a minute ago   Up About a minute   0.0.0.0:8007->80/tcp, :::8007->80/tcp   vllm-gaudi-server
+4969ec3aea02   opea/vllm-gaudi:${TAG}   "/bin/bash -c 'expor…"   About a minute ago   Up About a minute   0.0.0.0:8007->80/tcp, :::8007->80/tcp   vllm-gaudi-server
 0657cb66df78   redis/redis-stack:7.2.0-v9   "/entrypoint.sh"   About a minute ago   Up About a minute
0.0.0.0:6379->6379/tcp, :::6379->6379/tcp, 0.0.0.0:8001->8001/tcp, :::8001->8001/tcp redis-vector-db 684d3e9d204a ghcr.io/huggingface/text-embeddings-inference:cpu-1.2 "text-embeddings-rou…" About a minute ago Up About a minute 0.0.0.0:8808->80/tcp, :::8808->80/tcp tei-reranking-gaudi-server ``` @@ -863,7 +863,7 @@ View the docker input parameters in `./ChatQnA/docker_compose/intel/hpu/gaudi/c ```yaml vllm-service: - image: ${REGISTRY:-opea}/llm-vllm-hpu:${TAG:-latest} + image: ${REGISTRY:-opea}/vllm-gaudi:${TAG:-latest} container_name: vllm-gaudi-server ports: - "8007:80"