The easiest & fastest way to run customized and fine-tuned LLMs locally or on the edge



This project is a RESTful API server that provides image generation and editing services based on Stable Diffusion models. Its endpoints are compatible with the OpenAI API for image generation and editing.


The project is under active development: existing features are still being improved, and more features will be added.

Quick Start


  • Install WasmEdge v0.14.1

    curl -sSf | bash -s -- -v 0.14.1
  • Deploy the wasmedge_stablediffusion plugin

    For demonstration purposes, we use the stable diffusion plugin for Mac Apple Silicon. Plugins for other platforms are available under Releases/0.14.1.

    # Download stable diffusion plugin for Mac Apple Silicon
    curl -LO
    # Unzip the plugin to $HOME/.wasmedge/plugin
    tar -xzf WasmEdge-plugin-wasmedge_stablediffusion-0.14.1-darwin_arm64.tar.gz -C $HOME/.wasmedge/plugin
    rm $HOME/.wasmedge/plugin/libwasmedgePluginWasiNN.dylib

Run sd-api-server

  • Download the stable diffusion model

    curl -LO

    The available stable diffusion models:

  • Download sd-api-server.wasm

    curl -LO
  • Start the server

    wasmedge --dir .:. sd-api-server.wasm --model-name sd-v1.4 --model stable-diffusion-v1-4-Q8_0.gguf

    Tip: sd-api-server listens on port 8080 by default. To use a different port, add --port <port>.

    • Reduce the memory usage

      By default, the server supports two tasks: text2image for image generation and image2image for image editing. To run only one of them, which reduces memory usage, specify the task type with --task <task-type>. For example, to serve image generation only, start the server with the following command:

      wasmedge --dir .:. sd-api-server.wasm --model-name sd-v1.4 --model stable-diffusion-v1-4-Q8_0.gguf --task text2image
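The start-up commands above differ only in their optional flags, so a small launcher can assemble them. The following is a minimal sketch, not part of the project: the helper name `build_server_command` is hypothetical, and it uses only the flags shown in this README (`--model-name`, `--model`, `--port`, `--task`).

```python
import shlex

# Task values listed in the CLI help of sd-api-server.
VALID_TASKS = {"text2image", "image2image", "full"}


def build_server_command(model_name, model_path, port=None, task=None,
                         wasm="sd-api-server.wasm"):
    """Return the wasmedge argument list for starting sd-api-server.

    Hypothetical helper: it only assembles the command shown in this README.
    """
    cmd = ["wasmedge", "--dir", ".:.", wasm,
           "--model-name", model_name, "--model", model_path]
    if port is not None:
        if not 0 < int(port) < 65536:
            raise ValueError(f"invalid port: {port}")
        cmd += ["--port", str(port)]
    if task is not None:
        if task not in VALID_TASKS:
            raise ValueError(f"unknown task: {task}")
        cmd += ["--task", task]
    return cmd


if __name__ == "__main__":
    # Print the command for an image-generation-only server.
    cmd = build_server_command("sd-v1.4", "stable-diffusion-v1-4-Q8_0.gguf",
                               task="text2image")
    print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd)` once WasmEdge, the plugin, and the model file are in place.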


Image Generation

  • Send a request for image generation

    curl -X POST 'http://localhost:8080/v1/images/generations' \
      --header 'Content-Type: application/json' \
      --data '{
          "model": "sd-v1.4",
          "prompt": "A cute baby sea otter"
      }'

    If the request is handled successfully, the server will return a JSON response like the following:

    {
      "created": 1723431133,
      "data": [
          {
              "url": "/archives/file_74f514a2-8d33-4f9d-bcc0-42e8db14ecbc/output.png",
              "prompt": "A cute baby sea otter"
          }
      ]
    }
  • Preview the generated image

A cute baby sea otter
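The same generation request can be issued from a script. The following is a minimal sketch using only the Python standard library; the helper names (`build_generation_request`, `extract_results`, `generate`) are hypothetical, and the base URL assumes a local server on the default port. It returns the `url` values exactly as the server reports them and does not assume how a relative path such as `/archives/...` should be resolved.

```python
import json
import urllib.request

# Assumption: the server runs locally on the default port from this README.
BASE_URL = "http://localhost:8080"


def build_generation_request(prompt, model="sd-v1.4", base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/images/generations."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/images/generations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_results(response):
    """Return (url, prompt) pairs from a parsed response dictionary."""
    return [(item["url"], item.get("prompt")) for item in response["data"]]


def generate(prompt, **kwargs):
    """Send the request and return (url, prompt) pairs. Needs a running server."""
    request = build_generation_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as resp:
        return extract_results(json.load(resp))
```

With the server running, `generate("A cute baby sea otter")` would send the same request as the curl command in this section.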

Image Editing

  • Send a request for image editing

    curl --location 'http://localhost:8080/v1/images/edits' \
      --form 'image=@"otter.png"' \
      --form 'prompt="A cute baby sea otter with blue eyes"'

    If the request is handled successfully, the server returns a JSON response like the following. To preview or download the generated image, copy the URL into your browser.

    {
      "created": 1723432689,
      "data": [
          {
              "url": "http://localhost:8080/v1/files/download/file_554e4d53-6072-4988-83e6-fe684655a734",
              "prompt": "A cute baby sea otter with blue eyes"
          }
      ]
    }
  • Preview the edited image

A cute baby sea otter with blue eyes
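The edit endpoint takes a multipart/form-data upload, which curl builds through `--form`. The following is a minimal sketch of the same request with the Python standard library. The helper names are hypothetical, the `image/png` content type is an assumption based on the `otter.png` file name, and only the two form fields shown in the curl command (`image` and `prompt`) are sent.

```python
import os
import uuid
import urllib.request


def encode_multipart(fields, files, boundary=None):
    """Encode files and text fields as a multipart/form-data body.

    fields: dict of name -> str
    files:  dict of name -> (filename, data_bytes, content_type)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    # Files first, matching the field order of the curl command.
    for name, (filename, data, content_type) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + data + b"\r\n")
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_edit_request(image_path, prompt, base_url="http://localhost:8080",
                       image_type="image/png"):
    """Build (but do not send) a POST request for /v1/images/edits."""
    with open(image_path, "rb") as f:
        image_bytes = f.read()
    body, content_type = encode_multipart(
        {"prompt": prompt},
        {"image": (os.path.basename(image_path), image_bytes, image_type)},
    )
    return urllib.request.Request(
        f"{base_url}/v1/images/edits",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the result of `build_edit_request("otter.png", "A cute baby sea otter with blue eyes")` to `urllib.request.urlopen` sends the request; the JSON reply has the same shape as the response shown in this section.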


Build

  • For Linux users

    cargo build --release
  • For macOS users

    • Download wasi-sdk from the official website and unpack it into a directory of your choice.

    • Build the project

      export WASI_SDK_PATH=/path/to/wasi-sdk
      export CC="${WASI_SDK_PATH}/bin/clang --sysroot=${WASI_SDK_PATH}/share/wasi-sysroot"
      cargo clean
      cargo update
      cargo build --release

If the build succeeds, sd-api-server.wasm is generated in target/wasm32-wasip1/release/.

CLI Options

$ wasmedge target/wasm32-wasip1/release/sd-api-server.wasm -h

LlamaEdge-Stable-Diffusion API Server

Usage: sd-api-server.wasm [OPTIONS] --model-name <MODEL_NAME> <--model <MODEL>|--diffusion-model <DIFFUSION_MODEL>>

  -m, --model-name <MODEL_NAME>
          Sets the model name
      --model <MODEL>
          Path to full model [default: ]
      --diffusion-model <DIFFUSION_MODEL>
          Path to the standalone diffusion model file [default: ]
      --vae <VAE>
          Path to vae [default: ]
      --clip-l <CLIP_L>
          Path to the clip-l text encoder [default: ]
      --t5xxl <T5XXL>
          Path to the the t5xxl text encoder [default: ]
      --lora-model-dir <LORA_MODEL_DIR>
          Path to the lora model directory
      --control-net <CONTROL_NET>
          Path to control net model
          Keep controlnet on cpu (for low vram)
      --threads <THREADS>
          Number of threads to use during computation. Default is -1, which means to use all available threads [default: -1]
          Keep clip on cpu (for low vram)
          Keep vae on cpu (for low vram)
      --task <TASK>
          Task type [default: full] [possible values: text2image, image2image, full]
      --socket-addr <SOCKET_ADDR>
          Socket address of LlamaEdge API Server instance. For example, ``
      --port <PORT>
          Port number [default: 8080]
      --download-url-prefix <DOWNLOAD_URL_PREFIX>
          Download URL prefix, format: `http(s)://{IPv4_address}:{port}` or `http(s)://{domain}:{port}`
  -h, --help
          Print help (see more with '--help')
  -V, --version
          Print version
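The usage line above requires a model name plus either `--model` (a full model) or `--diffusion-model` (a standalone diffusion model, which can be combined with `--vae`, `--clip-l`, and `--t5xxl`). The following is a minimal sketch of that choice; the helper name `model_flags` is hypothetical, and treating the two options as mutually exclusive is an assumption read from the usage line rather than confirmed behavior.

```python
def model_flags(model=None, diffusion_model=None, vae=None, clip_l=None,
                t5xxl=None):
    """Return the model-related flags for sd-api-server.

    Assumption: exactly one of `model` and `diffusion_model` is given,
    as the usage line `<--model <MODEL>|--diffusion-model <DIFFUSION_MODEL>>`
    suggests.
    """
    if bool(model) == bool(diffusion_model):
        raise ValueError("pass exactly one of model or diffusion_model")
    if model:
        flags = ["--model", model]
    else:
        flags = ["--diffusion-model", diffusion_model]
    # Optional component paths are appended only when provided.
    for flag, value in (("--vae", vae), ("--clip-l", clip_l),
                        ("--t5xxl", t5xxl)):
        if value:
            flags += [flag, value]
    return flags


if __name__ == "__main__":
    # File names below are placeholders, not files shipped with the project.
    print(model_flags(diffusion_model="my-diffusion.gguf",
                      vae="my-vae.safetensors"))
```

The result is meant to be appended to a command that already contains `wasmedge --dir .:. sd-api-server.wasm --model-name <MODEL_NAME>`.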

