[Doc] Develop Ray Serve Python script on KubeRay (#1250)
Develop Ray Serve Python script on KubeRay
kevin85421 committed Jul 18, 2023
1 parent b26f106 commit e9a2698
Showing 2 changed files with 131 additions and 2 deletions.
129 changes: 129 additions & 0 deletions docs/guidance/rayserve-dev-doc.md
@@ -0,0 +1,129 @@
# Developing Ray Serve Python scripts on a RayCluster

In this tutorial, you will learn how to debug your Ray Serve scripts against a RayCluster, which offers better observability and faster iteration than developing the script directly with a RayService.
Many RayService issues are related to the Ray Serve Python scripts, so it is important to ensure the correctness of the scripts before deploying them to a RayService.
This tutorial will show you how to develop a Ray Serve Python script for a MobileNet image classifier on a RayCluster.
You can deploy and serve the classifier on your local Kind cluster without requiring a GPU.
Please refer to [ray-service.mobilenet.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-service.mobilenet.yaml) and [mobilenet-rayservice.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/mobilenet-rayservice.md) for more details.


# Step 1: Install a KubeRay cluster

Follow this [document](../../helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator from the Helm repository.

# Step 2: Create a RayCluster CR

```sh
helm install raycluster kuberay/ray-cluster --version 0.6.0-rc.0
```

# Step 3: Log in to the head Pod

```sh
export HEAD_POD=$(kubectl get pods --selector=ray.io/node-type=head -o custom-columns=POD:metadata.name --no-headers)
kubectl exec -it $HEAD_POD -- bash
```

# Step 4: Prepare your Ray Serve Python scripts and run the Ray Serve application

```sh
# Execute the following command in the head Pod
git clone https://github.com/ray-project/serve_config_examples.git
cd serve_config_examples

# Try to launch the Ray Serve application
serve run mobilenet.mobilenet:app
# [Error message]
# from tensorflow.keras.preprocessing import image
# ModuleNotFoundError: No module named 'tensorflow'
```

* `serve run mobilenet.mobilenet:app`: The first `mobilenet` is the name of the directory inside `serve_config_examples/`,
the second `mobilenet` is the name of the Python file inside the `mobilenet/` directory, and `app` is the name of the variable that represents the Ray Serve application within that Python file. See the section "import_path" in [rayservice-troubleshooting.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayservice-troubleshooting.md) for more details.
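
To make the `module:attribute` convention concrete, here is a minimal sketch of how such an import path can be split and resolved with the Python standard library. The helper names `split_import_path` and `resolve_import_path` are hypothetical, and this is not Ray Serve's actual implementation; it only illustrates the naming rule described above.

```python
import importlib
from typing import Any, Tuple


def split_import_path(import_path: str) -> Tuple[str, str]:
    """Split 'package.module:attribute' into ('package.module', 'attribute')."""
    module_name, sep, attr_name = import_path.partition(":")
    if not sep or not module_name or not attr_name:
        raise ValueError(f"expected 'module:attribute', got {import_path!r}")
    return module_name, attr_name


def resolve_import_path(import_path: str) -> Any:
    """Import the module part and return the named attribute from it."""
    module_name, attr_name = split_import_path(import_path)
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# For the command in this step: directory `mobilenet/`, file `mobilenet.py`,
# variable `app`.
print(split_import_path("mobilenet.mobilenet:app"))
```

Resolving `mobilenet.mobilenet:app` this way only works when the `mobilenet/` directory is importable, which is why the command in this step is run from `serve_config_examples/`.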

# Step 5: Change the Ray image from `rayproject/ray:${RAY_VERSION}` to `rayproject/ray-ml:${RAY_VERSION}`

```sh
# Uninstall RayCluster
helm uninstall raycluster

# Install the RayCluster CR with the Ray image `rayproject/ray-ml:${RAY_VERSION}`
helm install raycluster kuberay/ray-cluster --version 0.6.0-rc.0 --set image.repository=rayproject/ray-ml
```

The error message in Step 4 indicates that the Ray image `rayproject/ray:${RAY_VERSION}` does not include the TensorFlow package.
Because TensorFlow is large, this tutorial uses a base image that already includes TensorFlow instead of installing it through the Ray [runtime environment](https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#runtime-environments).
This step therefore changes the Ray image from `rayproject/ray:${RAY_VERSION}` to `rayproject/ray-ml:${RAY_VERSION}`.
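
Before launching the application, it can help to confirm that an image actually contains the modules the script imports. The following is a small sketch, assuming it is run with the Python interpreter inside the head Pod; `missing_modules` is a hypothetical helper name, and the check covers top-level module names only.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # `tensorflow` is the module that the Step 4 error message reports as missing.
    missing = missing_modules(["tensorflow"])
    print("missing modules:", missing if missing else "none")
```

An empty result for `tensorflow` suggests the image is suitable for this example; a non-empty result reproduces the `ModuleNotFoundError` from Step 4 without starting Ray Serve.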

# Step 6: Repeat Step 3 and Step 4

```sh
# Repeat Step 3 and Step 4 to log in to the new head Pod and run the Ray Serve application.
# You should successfully launch the Ray Serve application this time.
serve run mobilenet.mobilenet:app

# [Example output]
# (ServeReplica:default_ImageClassifier pid=139, ip=10.244.0.8) Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224.h5
# 8192/14536120 [..............................] - ETA: 0s)
# 4202496/14536120 [=======>......................] - ETA: 0s)
# 12902400/14536120 [=========================>....] - ETA: 0s)
# 14536120/14536120 [==============================] - 0s 0us/step
# 2023-07-17 14:04:43,737 SUCC scripts.py:424 -- Deployed Serve app successfully.
```

# Step 7: Submit a request to the Ray Serve application

```sh
# (On your local machine) Forward the serve port of the head Pod
kubectl port-forward --address 0.0.0.0 $HEAD_POD 8000

# Clone the repository on your local machine
git clone https://github.com/ray-project/serve_config_examples.git
cd serve_config_examples/mobilenet

# Prepare a sample image file. `stable_diffusion_example.png` is a cat image generated by the Stable Diffusion model.
curl -O https://raw.githubusercontent.com/ray-project/kuberay/master/docs/images/stable_diffusion_example.png

# Update `image_path` in `mobilenet_req.py` to the path of `stable_diffusion_example.png`
# Send a request to the Ray Serve application.
python3 mobilenet_req.py

# [Error message]
# Unexpected error, traceback: ray::ServeReplica:default_ImageClassifier.handle_request() (pid=139, ip=10.244.0.8)
# File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/_private/utils.py", line 254, in wrap_to_ray_error
# raise exception
# File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/_private/replica.py", line 550, in invoke_single
# result = await method_to_call(*args, **kwargs)
# File "./mobilenet/mobilenet.py", line 24, in __call__
# File "/home/ray/anaconda3/lib/python3.7/site-packages/starlette/requests.py", line 256, in _get_form
# ), "The `python-multipart` library must be installed to use form parsing."
# AssertionError: The `python-multipart` library must be installed to use form parsing..
```

`python-multipart` is required by Starlette's form-parsing method `starlette.requests.Request.form()`, so the error surfaces when a request reaches the Ray Serve application, not when the application is launched.
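
The traceback points at form parsing, which suggests that the client uploads the image as a `multipart/form-data` body. The sketch below shows what such a request looks like when built with the standard library only. It is an illustration under assumptions, not the contents of `mobilenet_req.py`: the helper names are hypothetical, and the form field name `image` is a guess.

```python
import urllib.request
import uuid
from typing import Tuple


def encode_multipart(field: str, filename: str, payload: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_request(url: str, field: str, filename: str, payload: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the file as form data."""
    body, content_type = encode_multipart(field, filename, payload)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


# Assumed field name "image"; the URL matches the port forwarded in this step.
request = build_request("http://localhost:8000/", "image", "cat.png", b"<image bytes>")
print(request.get_method(), request.full_url)
```

A server that receives a body like this has to parse the form fields before it can read the image bytes, which is the step where Starlette requires `python-multipart`.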

# Step 8: Restart the Ray Serve application with a runtime environment

```sh
# In the head Pod, stop the Ray Serve application
serve shutdown

# Check the Ray Serve application status
serve status
# [Example output]
# There are no applications running on this cluster.

# Launch the Ray Serve application with a runtime environment that installs `python-multipart`.
serve run mobilenet.mobilenet:app --runtime-env-json='{"pip": ["python-multipart==0.0.6"]}'

# (On your local machine) Submit a request to the Ray Serve application again, and you should get the correct prediction.
python3 mobilenet_req.py
# [Example output]
# {"prediction": ["n02123159", "tiger_cat", 0.2994779646396637]}
```
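
Hand-writing JSON inside shell quotes is easy to get wrong. As a sketch, the value passed to `--runtime-env-json` can be generated from a Python dictionary and quoted for the shell. The function name `build_serve_run_command` is hypothetical; the flag and package pin are the ones used in this step.

```python
import json
import shlex
from typing import List


def build_serve_run_command(import_path: str, pip_packages: List[str]) -> List[str]:
    """Return the `serve run` argument list with a JSON-encoded runtime environment."""
    runtime_env_json = json.dumps({"pip": pip_packages})
    return ["serve", "run", import_path, f"--runtime-env-json={runtime_env_json}"]


command = build_serve_run_command("mobilenet.mobilenet:app", ["python-multipart==0.0.6"])

# shlex.quote adds the shell quoting needed for the braces, brackets, and double quotes.
print(" ".join(shlex.quote(part) for part in command))
```

The printed line can be pasted into the head Pod's shell. Adding another dependency then only means extending the list, without touching the quoting.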

# Step 9: Create a RayService YAML file

In the previous steps, we found that the Ray Serve application launches successfully with the Ray image `rayproject/ray-ml:${RAY_VERSION}` and a [runtime environment](https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#runtime-environments) that installs `python-multipart==0.0.6`.
Therefore, we can create a RayService YAML file with the same Ray image and runtime environment.
For more details, please refer to [ray-service.mobilenet.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-service.mobilenet.yaml) and [mobilenet-rayservice.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/mobilenet-rayservice.md).
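
As a rough sketch of how the findings carry over, the Serve application entry can be assembled as a dictionary that holds the same import path and `pip` requirement. The application name `mobilenet` and the helper `build_serve_application` are assumptions made for illustration. The sketch omits other settings, such as where the code is fetched from and the Ray image, so treat the sample YAML linked above as the authoritative reference.

```python
import json
from typing import Any, Dict, List


def build_serve_application(name: str, import_path: str, pip_packages: List[str]) -> Dict[str, Any]:
    """Assemble a Serve config fragment with one application and its runtime environment."""
    return {
        "applications": [
            {
                "name": name,  # hypothetical application name
                "import_path": import_path,
                "runtime_env": {"pip": list(pip_packages)},
            }
        ]
    }


config = build_serve_application(
    "mobilenet", "mobilenet.mobilenet:app", ["python-multipart==0.0.6"]
)
# Printed as JSON so the structure is easy to compare with the sample YAML.
print(json.dumps(config, indent=2))
```

The point of the sketch is that the values verified interactively in Steps 4 to 8 are the same ones the RayService configuration needs.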
4 changes: 2 additions & 2 deletions docs/guidance/rayservice-troubleshooting.md
@@ -78,7 +78,7 @@ kubectl exec -it $HEAD_POD -- ray summary actors
### Issue 1: Ray Serve script is incorrect.

We strongly recommend that you test your Ray Serve script locally or in a RayCluster before
deploying it to a RayService. [TODO: https://github.com/ray-project/kuberay/issues/1176]
deploying it to a RayService. Please refer to [rayserve-dev-doc.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayserve-dev-doc.md) for more details.

### Issue 2: `serveConfigV2` is incorrect.

@@ -101,7 +101,7 @@ Therefore, the YAML file includes `python-multipart` in the runtime environment.

### Issue 3-2: Examples for troubleshooting dependency issues.

> Note: We highly recommend testing your Ray Serve script locally or in a RayCluster before deploying it to a RayService. This helps identify any dependency issues in the early stages. [TODO: https://github.com/ray-project/kuberay/issues/1176]
> Note: We highly recommend testing your Ray Serve script locally or in a RayCluster before deploying it to a RayService. This helps identify any dependency issues in the early stages. Please refer to [rayserve-dev-doc.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayserve-dev-doc.md) for more details.
In the [MobileNet example](mobilenet-rayservice.md), the [mobilenet.py](https://github.com/ray-project/serve_config_examples/blob/master/mobilenet/mobilenet.py) consists of two functions: `__init__()` and `__call__()`.
The function `__call__()` will only be called when the Serve application receives a request.