# Burr Serving with BentoML

This repository shows how to deploy a Burr Application with BentoML.

## Overview

Burr and BentoML help you build the application and serving layers of your system.

### Application layer

Burr lets you create applications that are easy to understand and debug, with a clear path to production. It supports synchronous, asynchronous, and streaming actions. Persistence and durability, hooks, and telemetry are built in.
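
For a sense of Burr's programming model, here is a minimal, hypothetical sketch of a two-action application. The action names, state fields, and placeholder logic are illustrative only and are not taken from the example in this repository.

```python
from burr.core import ApplicationBuilder, State, action

# Hypothetical actions; the real example lives in web_page_qna/.
@action(reads=[], writes=["question"])
def receive_question(state: State, question: str) -> State:
    return state.update(question=question)

@action(reads=["question"], writes=["answer"])
def answer_question(state: State) -> State:
    # Placeholder logic; a real application would call an LLM here.
    return state.update(answer=f"You asked: {state['question']}")

app = (
    ApplicationBuilder()
    .with_actions(receive_question, answer_question)
    .with_transitions(("receive_question", "answer_question"))
    .with_entrypoint("receive_question")
    .build()
)

last_action, result, state = app.run(
    halt_after=["answer_question"],
    inputs={"question": "What does this page say?"},
)
print(state["answer"])
```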

### Serving layer

BentoML is a specialized tool to package, deploy, and manage AI services. Get the most out of your system by configuring resource requirements (CPU, GPU, RAM, workers, concurrency, etc.), autoscaling, and adaptive request batching. It also automatically generates synchronous and asynchronous clients for your service.
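
As a rough sketch of what the serving layer can look like, the snippet below defines a hypothetical BentoML service with a single `answer` endpoint. The service name, endpoint, and resource values are assumptions for illustration and do not mirror the example in this repository.

```python
import bentoml

# Hypothetical service; resource and timeout values are illustrative only.
@bentoml.service(
    resources={"cpu": "2"},
    traffic={"timeout": 60},
)
class QuestionAnswering:
    @bentoml.api
    def answer(self, question: str) -> str:
        # A real service would run a Burr application (or other logic) here.
        return f"Echoing your question: {question}"
```

A service defined this way can be run locally with the `bentoml serve` CLI before deploying it.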

## Directory Content

- `web_page_qna/`: an introductory example that deploys a Burr Application with BentoML, using LLMs to answer questions about a web page.

## Community

Join the BentoML developer community on Slack for more support and discussions!

Join the Burr Discord server for help, questions, and feature requests.