Ray on GKE

This repository contains a Terraform template for running Ray on Google Kubernetes Engine. We've also included some example notebooks, including one that serves a GPT-J-6B model with Ray AIR (see here for the original notebook).

The solution is split into platform and user resources.

Platform resources (deployed once):

  • GKE Cluster
  • NVIDIA GPU drivers
  • KubeRay operator and CRDs

User resources (deployed once per user):

  • User namespace
  • Kubernetes service accounts
  • KubeRay cluster
  • Prometheus monitoring
  • Logging container
  • Jupyter notebook

Installation

Platform

  1. cd platform

  2. Edit variables.tf with your GCP settings.

  3. Run terraform init

  4. Run terraform apply

User

  1. cd user

  2. Edit variables.tf with your GCP settings.

  3. Run terraform init

  4. Run terraform apply

Using Ray

  1. Run kubectl get services -n <namespace>

  2. Copy the external IP for the notebook.

  3. Open the external IP in a browser and log in.

  4. The Ray cluster is available at ray://example-cluster-kuberay-head-svc:10001. To try it out, open one of the sample notebooks under example_notebooks: in the Jupyter notebook window, choose File -> Open from URL, paste the raw file URL from GitHub, and run through the example.
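
As a rough illustration of step 4, the sketch below connects to the cluster from a notebook cell through Ray Client and runs a trivial remote task. The service name and port come from the address above; the helper names `ray_client_address` and `square_on_cluster` are hypothetical and not part of this repo. It assumes the Ray version in the notebook environment matches the version running on the cluster.

```python
def ray_client_address(service="example-cluster-kuberay-head-svc", port=10001):
    """Build the Ray Client address for the head service (defaults from this README)."""
    return f"ray://{service}:{port}"


def square_on_cluster(values, address=None):
    """Connect through Ray Client, square each value in a remote task, then disconnect.

    Hypothetical helper for illustration only.
    """
    # Imported lazily so the address helper above works even without Ray installed.
    import ray

    ray.init(address=address or ray_client_address())

    @ray.remote
    def square(x):
        return x * x

    try:
        # Each call is scheduled on the cluster; ray.get collects the results.
        return ray.get([square.remote(v) for v in values])
    finally:
        # Always release the client connection, even if a task fails.
        ray.shutdown()
```

If the cluster is reachable from the notebook, calling `square_on_cluster([1, 2, 3])` in a cell should return `[1, 4, 9]`.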

Securing Your Cluster Endpoints

For demo purposes, this repo creates a public IP for the Ray head node and the Jupyter notebook. To secure your cluster, it is strongly recommended to replace these with your own secure endpoints.

For more information, please take a look at the following links:

Running GPT-J-6B

This example is adapted from Ray AIR's examples here.

  1. Open the gpt-j-online.ipynb notebook.

  2. Open a terminal in the Jupyter session and install Ray AIR:

         pip install ray[air]

  3. Run through the notebook cells. You can change the prompt in the last cell:

         prompt = (
              ## Input your own prompt here
         )

  4. This should output a generated text response.
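
For the prompt cell, a filled-in version might look like the sketch below. The prompt text is purely illustrative and is not taken from the notebook. Note that with only a comment between the parentheses, Python evaluates `prompt` to an empty tuple rather than a string, so the placeholder needs at least one string literal.

```python
# Adjacent string literals inside parentheses are joined into a single string,
# so a long prompt can be split across lines without "+" operators.
prompt = (
    "Once upon a time, "
    "there was a Ray cluster running on GKE."  # illustrative text only
)
```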