Here we describe the main caveats of using KubeRay for deploying and running Ray Serve. It is mostly based on this documentation.
Install the KubeRay operator following the documentation:
kubectl create -k "github.com/ray-project/kuberay/ray-operator/config/default?ref=v0.6.0&timeout=90s"
Unfortunately, the usage of Ray Serve requires a bit of specific cluster configuration. An example of such a configuration is here. The most important Serve-specific things there are:
- Line 20 defines dashboard-agent-listen-port, which determines the port for the Serve management APIs
- Lines 51-52 define the ports for the dashboard agent
With this in place, we can use the dashboard-agent-listen-port for accessing the Serve APIs. We can either port-forward or create an additional route for accessing it.
We have here two Serve examples - hello and fruit - borrowed from the Ray documentation.
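For orientation, here is a minimal sketch of what the hello deployment graph can look like, loosely based on the Ray documentation example; the exact code packaged in our image may differ, but the deployment names (Doubler, HelloDeployment, DAGDriver) match the ones that show up in the deployment config later.

# hello.py (sketch)
from ray import serve
from ray.serve.deployment_graph import InputNode
from ray.serve.drivers import DAGDriver


@serve.deployment
class Doubler:
    def double(self, s: str) -> str:
        # Repeat the incoming string twice.
        return s + " " + s


@serve.deployment
class HelloDeployment:
    def say_hello(self, name: str) -> str:
        return f"Hello, {name}!"


# Build the deployment graph: HelloDeployment greets, Doubler repeats the greeting.
with InputNode() as name:
    greeting = HelloDeployment.bind().say_hello.bind(name)
    doubled = Doubler.bind().double.bind(greeting)

# `graph` is the import target referenced by `serve build hello:graph`.
graph = DAGDriver.bind(doubled)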
Once the code is created, we need to:
- Package it into the Docker image
- Create a deployment config file using the serve build command
For our example, the commands look as follows:
serve build hello:graph -o hello.yaml
serve build fruit:deployment_graph -o fruit.yaml
These two commands will produce the YAML files here and here. The base YAML files presented here can be further enhanced based on the documentation. The most common overrides include the number of replicas and deployment parameters.
Once the YAML files are in place, we can use serve deploy to deploy them. The serve deploy command is a thin wrapper over the HTTP APIs, which can also be used directly. Definitions of the REST APIs can be found here.
For our example, we first do a port-forward:
kubectl port-forward svc/raycluster-heterogeneous-head-svc 52365 -n max
And then use the following commands:
serve deploy hello.yaml
serve deploy fruit.yaml
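After serve deploy completes, the same forwarded port can be used to read back the current Serve configuration and application statuses. A minimal sketch, assuming the requests library and the /api/serve/applications/ endpoint that we also use for deployment later in this post:

# serve_status.py (sketch)
import json

import requests

# The dashboard agent port forwarded above; 52365 is the value configured
# via dashboard-agent-listen-port in the cluster config.
SERVE_API = "http://localhost:52365/api/serve/applications/"

resp = requests.get(SERVE_API, timeout=10)
resp.raise_for_status()
# Pretty-print the current Serve config and application statuses.
print(json.dumps(resp.json(), indent=2))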
The newer REST APIs support multiple Serve applications and allow deploying both of our Serve applications at once.
Once the application is installed, you can also see its configuration in the Ray dashboard.
Following this, do a port-forward:
kubectl port-forward svc/raycluster-heterogeneous-head-svc 8000 -n max
And then use this command:
curl -H "Content-Type: application/json" -d '["PEAR", 2]' "http://localhost:8000/"
curl "http://localhost:8000/?name=Ray"
To shut everything down, the only command needed is:
serve shutdown
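Since serve shutdown is also just a wrapper over the REST API, the same effect can be achieved with an HTTP DELETE against the applications endpoint (a sketch, assuming the requests library and an active port-forward to 52365):

# serve_shutdown.py (sketch)
import requests

# DELETE on the applications endpoint shuts Serve down, mirroring `serve shutdown`.
resp = requests.delete("http://localhost:52365/api/serve/applications/", timeout=30)
resp.raise_for_status()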
As described in the documentation, Ray now supports deploying multiple independent Serve applications.
To try this, we first need to modify fruit and hello so that they listen on different URLs; the modified modules are fruit_url and hello_url.
Once this is done, the following command generates deployment yaml:
serve build --multi-app fruit_url:graph hello_url:graph -o multi_app.yaml
The auto-generated application names default to app1 and app2, so I changed them in the generated YAML.
Finally, we need to add the newly created Python files fruit_url and hello_url to the Dockerfile and rebuild our image.
When this is done and the cluster is restarted, we can deploy our applications as follows:
serve deploy multi_app.yaml
Alternatively we can deploy using HTTP:
curl -X PUT http://localhost:52365/api/serve/applications/ -H 'Content-Type: application/json' -d '{"proxy_location": "EveryNode", "http_options": {"host": "0.0.0.0", "port": 8000},
"applications": [{"name": "fruit", "route_prefix": "/fruit", "import_path": "fruit_url:graph",
"runtime_env": {},
"deployments": [{"name": "MangoStand", "user_config": {"price": 3}},
{"name": "OrangeStand", "user_config": {"price": 2}},
{"name": "PearStand", "user_config": {"price": 4}},
{"name": "FruitMarket", "num_replicas": 2},
{"name": "DAGDriver"}]},
{"name": "greet", "route_prefix": "/greet", "import_path": "hello_url:graph",
"runtime_env": {},
"deployments": [{"name": "Doubler"},
{"name": "HelloDeployment"},
{"name": "DAGDriver"}]}]}'
Once the deployment is completed, you can port-forward:
kubectl port-forward svc/raycluster-heterogeneous-head-svc 8000 -n max
and run:
curl "http://localhost:8000/greet/?name=Ray"
curl -H "Content-Type: application/json" -d '["PEAR", 2]' "http://localhost:8000/fruit/"
Alternatively, you can use POST. Also, for curl, note this tip.
In addition to port-forwarding, you can create a route exposing port 8000 and use it for invocation.