Jetstream Autoscaling Guide #703

Merged
merged 43 commits into from
Jun 17, 2024
Commits
8592376
first commit
Bslabe123 May 29, 2024
797cc16
missing files
Bslabe123 May 29, 2024
e6f9af4
Merge branch 'main' into jetstream-terraform
Bslabe123 May 29, 2024
94be180
various improvements
Bslabe123 May 29, 2024
112280f
some autoscaling changes for testing
Bslabe123 Jun 3, 2024
5b36027
add targetlabels to podmonitoring
Bslabe123 Jun 4, 2024
91f5be1
Revert repo pinning
Bslabe123 Jun 13, 2024
904315c
more reversions
Bslabe123 Jun 13, 2024
f79c8b3
more reversions
Bslabe123 Jun 13, 2024
d9b1fa7
cleanup
Bslabe123 Jun 13, 2024
3891577
more cleanup
Bslabe123 Jun 13, 2024
975722e
Added to README
Bslabe123 Jun 13, 2024
85c9b48
revert topology change
Bslabe123 Jun 13, 2024
bda1c5b
tweaks to deployment
Bslabe123 Jun 13, 2024
87fcd71
HPA terraform fixes
Bslabe123 Jun 13, 2024
fcf47d9
remove stray comment
Bslabe123 Jun 13, 2024
db8978a
Add more to README
Bslabe123 Jun 13, 2024
10da143
parameterize metrics scrape port
Bslabe123 Jun 13, 2024
fd7eb10
Cleaned up readme
Bslabe123 Jun 13, 2024
4cfc87a
readme tweak
Bslabe123 Jun 13, 2024
3079c1c
typo
Bslabe123 Jun 13, 2024
d182a7d
remove indentation
Bslabe123 Jun 13, 2024
63e9caf
newline
Bslabe123 Jun 13, 2024
a9ea9cc
Merge branch 'main' into jetstream-terraform
Bslabe123 Jun 13, 2024
4dc9bb0
More updates to readme
Bslabe123 Jun 13, 2024
af472a2
change wording
Bslabe123 Jun 13, 2024
bee7586
Update metrics scrape example
Bslabe123 Jun 13, 2024
0de153c
remove annotation
Bslabe123 Jun 13, 2024
7c08470
terraform format
Bslabe123 Jun 13, 2024
558ded5
missing comma
Bslabe123 Jun 13, 2024
f38595d
maxengine-server in terraform
Bslabe123 Jun 14, 2024
491bcac
wording
Bslabe123 Jun 14, 2024
9d02a8a
terraform fmt
Bslabe123 Jun 14, 2024
4ba7038
parameterize container images
Bslabe123 Jun 14, 2024
6e0edc2
wording
Bslabe123 Jun 14, 2024
07afa05
remove ksa var
Bslabe123 Jun 14, 2024
452b04f
move deployment to kubectl directory
Bslabe123 Jun 14, 2024
66ba238
App -> app
Bslabe123 Jun 14, 2024
31b5677
pipe from maxengine module to main
Bslabe123 Jun 14, 2024
cde9047
Update tutorials-and-examples/inference-servers/jetstream/maxtext/sin…
Bslabe123 Jun 15, 2024
9769e90
remove TODO
Bslabe123 Jun 15, 2024
532f45f
Merge branch 'jetstream-terraform' of https://github.com/GoogleCloudP…
Bslabe123 Jun 15, 2024
0dff592
HPA can now scale with HBM
Bslabe123 Jun 17, 2024
wording
Bslabe123 committed Jun 14, 2024
commit 491bcacc8e62c654710614bec6dd2ef1ff30f842
@@ -121,7 +121,7 @@ Completed unscanning checkpoint to gs://BUCKET_NAME/final/unscanned/gemma_7b-it/

## Deploy Maxengine Server and HTTP Server

-Next, deploy a Maxengine server hosting the Gemma-7b model. You can use the provided Maxengine server and HTTP server images already in `deployment.yaml` or [build your own](#build-and-upload-maxengine-server-image). Depending on your needs and constraints you can select to deploy your Maxengine server either via Terraform or via Kubectl.
+Next, deploy a Maxengine server hosting the Gemma-7b model. You can use the provided Maxengine server and HTTP server images already in `deployment.yaml` or [build your own](#build-and-upload-maxengine-server-image). Depending on your needs and constraints you can elect to deploy either via Terraform or via Kubectl.

### Deploy via Kubectl
