[Doc] Develop Ray Serve Python script on KubeRay (#1250)

Develop Ray Serve Python script on KubeRay
ray-project · Jul 18, 2023 · e9a2698 · e9a2698
1 parent b26f106
commit e9a2698
Show file tree

Hide file tree

Showing 2 changed files with 131 additions and 2 deletions.
diff --git a/docs/guidance/rayserve-dev-doc.md b/docs/guidance/rayserve-dev-doc.md
@@ -0,0 +1,129 @@
+# Developing Ray Serve Python scripts on a RayCluster
+
+In this tutorial, you will learn how to effectively debug your Ray Serve scripts against a RayCluster, enabling enhanced observability and faster iteration speed compared to developing the script directly with a RayService.
+Many RayService issues are related to the Ray Serve Python scripts, so it is important to ensure the correctness of the scripts before deploying them to a RayService.
+This tutorial will show you how to develop a Ray Serve Python script for a MobileNet image classifier on a RayCluster.
+You can deploy and serve the classifier on your local Kind cluster without requiring a GPU.
+Please refer to [ray-service.mobilenet.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-service.mobilenet.yaml) and [mobilenet-rayservice.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/mobilenet-rayservice.md) for more details.
+
+
+# Step 1: Install a KubeRay cluster
+
+Follow this [document](../../helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator via Helm repository.
+
+# Step 2: Create a RayCluster CR
+
+```sh
+helm install raycluster kuberay/ray-cluster --version 0.6.0-rc.0
+```
+
+# Step 3: Log in to the head Pod
+
+```sh
+export HEAD_POD=$(kubectl get pods --selector=ray.io/node-type=head -o custom-columns=POD:metadata.name --no-headers)
+kubectl exec -it $HEAD_POD -- bash
+```
+
+# Step 4: Prepare your Ray Serve Python scripts and run the Ray Serve application
+
+```sh
+# Execute the following command in the head Pod
+git clone https://github.com/ray-project/serve_config_examples.git
+cd serve_config_examples
+
+# Try to launch the Ray Serve application
+serve run mobilenet.mobilenet:app
+# [Error message]
+#     from tensorflow.keras.preprocessing import image
+# ModuleNotFoundError: No module named 'tensorflow'
+```
+
+* `serve run mobilenet.mobilenet:app`: The first `mobilenet` is the name of the directory in the `serve_config_examples/`,
+the second `mobilenet` is the name of the Python file in the directory `mobilenet/`, and `app` is the name of the variable representing Ray Serve application within the Python file. See the section "import_path" in [rayservice-troubleshooting.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayservice-troubleshooting.md) for more details.
+
+# Step 5: Change the Ray image from `rayproject/ray:${RAY_VERSION}` to `rayproject/ray-ml:${RAY_VERSION}`
+
+```sh
+# Uninstall RayCluster
+helm uninstall raycluster
+
+# Install the RayCluster CR with the Ray image `rayproject/ray-ml:${RAY_VERSION}`
+helm install raycluster kuberay/ray-cluster --version 0.6.0-rc.0 --set image.repository=rayproject/ray-ml
+```
+
+The error message in Step 4 indicates that the Ray image `rayproject/ray:${RAY_VERSION}` does not have the TensorFlow package.
+Due to the significant size of TensorFlow, we have opted to use an image with TensorFlow as the base instead of installing it within the Ray [runtime environment](https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#runtime-environments).
+In this Step, we will change the Ray image from `rayproject/ray:${RAY_VERSION}` to `rayproject/ray-ml:${RAY_VERSION}`.
+
+# Step 6: Repeat Step 3 and Step 4
+
+```sh
+# Repeat Step 3 and Step 4 to log in to the new head Pod and run the Ray Serve application.
+# You should successfully launch the Ray Serve application this time.
+serve run mobilenet.mobilenet:app
+
+# [Example output]
+# (ServeReplica:default_ImageClassifier pid=139, ip=10.244.0.8) Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224.h5
+#     8192/14536120 [..............................] - ETA: 0s)
+#  4202496/14536120 [=======>......................] - ETA: 0s)
+# 12902400/14536120 [=========================>....] - ETA: 0s)
+# 14536120/14536120 [==============================] - 0s 0us/step
+# 2023-07-17 14:04:43,737 SUCC scripts.py:424 -- Deployed Serve app successfully.
+```
+
+# Step 7: Submit a request to the Ray Serve application
+
+```sh
+# (On your local machine) Forward the serve port of the head Pod
+kubectl port-forward --address 0.0.0.0 $HEAD_POD 8000
+
+# Clone the repository on your local machine
+git clone https://github.com/ray-project/serve_config_examples.git
+cd serve_config_examples/mobilenet
+
+# Prepare a sample image file. `stable_diffusion_example.png` is a cat image generated by the Stable Diffusion model.
+curl -O https://raw.githubusercontent.com/ray-project/kuberay/master/docs/images/stable_diffusion_example.png
+
+# Update `image_path` in `mobilenet_req.py` to the path of `stable_diffusion_example.png`
+# Send a request to the Ray Serve application.
+python3 mobilenet_req.py
+
+# [Error message]
+# Unexpected error, traceback: ray::ServeReplica:default_ImageClassifier.handle_request() (pid=139, ip=10.244.0.8)
+#   File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/_private/utils.py", line 254, in wrap_to_ray_error
+#     raise exception
+#   File "/home/ray/anaconda3/lib/python3.7/site-packages/ray/serve/_private/replica.py", line 550, in invoke_single
+#     result = await method_to_call(*args, **kwargs)
+#   File "./mobilenet/mobilenet.py", line 24, in __call__
+#   File "/home/ray/anaconda3/lib/python3.7/site-packages/starlette/requests.py", line 256, in _get_form
+#     ), "The `python-multipart` library must be installed to use form parsing."
+# AssertionError: The `python-multipart` library must be installed to use form parsing..
+```
+
+`python-multipart` is required for the request parsing function `starlette.requests.form()`, so the error message is reported when we send a request to the Ray Serve application.
+
+# Step 8: Restart the Ray Serve application with runtime environment.
+
+```sh
+# In the head Pod, stop the Ray Serve application
+serve shutdown
+
+# Check the Ray Serve application status
+serve status
+# [Example output]
+# There are no applications running on this cluster.
+
+# Launch the Ray Serve application with runtime environment.
+serve run mobilenet.mobilenet:app --runtime-env-json='{"pip": ["python-multipart==0.0.6"]}'
+
+# (On your local machine) Submit a request to the Ray Serve application again, and you should get the correct prediction.
+python3 mobilenet_req.py
+# [Example output]
+# {"prediction": ["n02123159", "tiger_cat", 0.2994779646396637]}
+```
+
+# Step 9: Create a RayService YAML file
+
+In the previous steps, we found that the Ray Serve application can be successfully launched using the Ray image `rayproject/ray-ml:${RAY_VERSION}` and the [runtime environment](https://docs.ray.io/en/latest/ray-core/handling-dependencies.html#runtime-environments) `python-multipart==0.0.6`.
+Therefore, we can create a RayService YAML file with the same Ray image and runtime environment.
+For more details, please refer to [ray-service.mobilenet.yaml](https://github.com/ray-project/kuberay/blob/master/ray-operator/config/samples/ray-service.mobilenet.yaml) and [mobilenet-rayservice.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/mobilenet-rayservice.md).
diff --git a/docs/guidance/rayservice-troubleshooting.md b/docs/guidance/rayservice-troubleshooting.md
@@ -78,7 +78,7 @@ kubectl exec -it $HEAD_POD -- ray summary actors
 ### Issue 1: Ray Serve script is incorrect.
 
 We strongly recommend that you test your Ray Serve script locally or in a RayCluster before
-deploying it to a RayService. [TODO: https://github.com/ray-project/kuberay/issues/1176]
+deploying it to a RayService. Please refer to [rayserve-dev-doc.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayserve-dev-doc.md) for more details.
 
 ### Issue 2: `serveConfigV2` is incorrect.
 
@@ -101,7 +101,7 @@ Therefore, the YAML file includes `python-multipart` in the runtime environment.
 
 ### Issue 3-2: Examples for troubleshooting dependency issues.
 
-> Note: We highly recommend testing your Ray Serve script locally or in a RayCluster before deploying it to a RayService. This helps identify any dependency issues in the early stages. [TODO: https://github.com/ray-project/kuberay/issues/1176]
+> Note: We highly recommend testing your Ray Serve script locally or in a RayCluster before deploying it to a RayService. This helps identify any dependency issues in the early stages. Please refer to [rayserve-dev-doc.md](https://github.com/ray-project/kuberay/blob/master/docs/guidance/rayserve-dev-doc.md) for more details.
 
 In the [MobileNet example](mobilenet-rayservice.md), the [mobilenet.py](https://github.com/ray-project/serve_config_examples/blob/master/mobilenet/mobilenet.py) consists of two functions: `__init__()` and `__call__()`.
 The function `__call__()` will only be called when the Serve application receives a request.