[neo] Fix sharding script bug (#2549)
ethnzhng authored Nov 12, 2024
1 parent 0850a3a commit 62a63be
Showing 2 changed files with 5 additions and 2 deletions.
3 changes: 2 additions & 1 deletion serving/docker/README.md
@@ -10,7 +10,8 @@ Currently, we created docker compose to simplify the building experience. Just r
```shell
cd serving/docker
export DJL_VERSION=$(awk -F '=' '/djl / {gsub(/ ?"/, "", $2); print $2}' ../../gradle/libs.versions.toml)
-docker compose build --build-arg djl_version=${DJL_VERSION} <compose-target>
+export SERVING_VERSION=$(awk -F '=' '/serving / {gsub(/ ?"/, "", $2); print $2}' ../../gradle/libs.versions.toml)
+docker compose build --build-arg djl_version=${DJL_VERSION} --build-arg djl_serving_version=${SERVING_VERSION} <compose-target>
```

You can find different `compose-target` in `docker-compose.yml`, like `cpu`, `lmi`...
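As a rough illustration of what the `awk` one-liners in the README snippet extract, here is a Python sketch of the same text processing. The helper name `extract_version` and the sample TOML text (including its version numbers) are hypothetical and are not the actual contents of `gradle/libs.versions.toml`. Unlike `awk`, which prints a value for every matching line, this simplified sketch returns only the first match.

```python
import re
from typing import Optional


def extract_version(toml_text: str, key: str) -> Optional[str]:
    """Mimic: awk -F '=' '/<key> / {gsub(/ ?"/, "", $2); print $2}'."""
    for line in toml_text.splitlines():
        # awk pattern /<key> /: the line contains the key followed by a space
        if f"{key} " in line:
            fields = line.split("=")  # -F '=' splits the line on '='
            if len(fields) > 1:
                # gsub(/ ?"/, "", $2): drop quotes and a space preceding them
                return re.sub(r' ?"', "", fields[1])
    return None


# Hypothetical sample input; the real file and versions will differ.
SAMPLE = '[versions]\ndjl = "0.0.1"\nserving = "0.0.2"\n'

djl_version = extract_version(SAMPLE, "djl")
serving_version = extract_version(SAMPLE, "serving")
print(djl_version, serving_version)
```

The two extracted values correspond to the `djl_version` and `djl_serving_version` build arguments passed to `docker compose build`.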
4 changes: 3 additions & 1 deletion serving/docker/partition/sm_neo_shard.py
@@ -111,7 +111,9 @@ def shard_lmi_dist_model(self, input_dir: str, output_dir: str,
                                 False)).lower() == "true"
         max_rolling_batch_size = int(
             self.properties.get("option.max_rolling_batch_size", 256))
-        max_model_len = int(self.properties.get("option.max_model_len", None))
+        max_model_len = self.properties.get("option.max_model_len", None)
+        if max_model_len is not None:
+            max_model_len = int(max_model_len)

         engine_args = VllmEngineArgs(
             model=input_dir,
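The change to `sm_neo_shard.py` matters because `int(None)` raises a `TypeError`, so the old line failed whenever `option.max_model_len` was not set. A minimal standalone sketch of the before and after behavior follows. The helper `parse_optional_int` and the plain `dict` standing in for `self.properties` are assumptions for illustration, not code from the repository.

```python
from typing import Optional


def parse_optional_int(properties: dict, key: str) -> Optional[int]:
    """Return the property as an int, or None when the key is absent."""
    value = properties.get(key, None)
    if value is not None:
        value = int(value)
    return value


# Old pattern: converting unconditionally fails when the key is missing.
old_pattern_failed = False
try:
    int({}.get("option.max_model_len", None))
except TypeError:
    old_pattern_failed = True

# New pattern: a missing key stays None, a present value is converted.
unset = parse_optional_int({}, "option.max_model_len")
configured = parse_optional_int({"option.max_model_len": "4096"},
                                "option.max_model_len")
print(old_pattern_failed, unset, configured)
```

Keeping `None` for the unset case lets the downstream engine arguments fall back to their own default rather than forcing a value.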
