
Reduces the minimum memory requirement for Prefect Server #16644

Merged
merged 4 commits into from
Jan 8, 2025

Conversation

chrisguidry
Collaborator

@chrisguidry chrisguidry commented Jan 8, 2025

This brings the minimum memory requirement for Prefect Server from
~400MB to ~345MB in my testing*.

What I discovered was that when we use FastAPI's `include_router`,
FastAPI will create _new_ routes rather than using a reference to the
previously declared router's routes. This is because additional
dependencies may be injected that the original router wasn't aware of,
so the new routes are built using the original router as a reference.

With Prefect's (likely very commonplace) pattern of declaring
module-level routers and then assembling them into a larger FastAPI
`app`, this means that there is an entire copy of the routes left in
memory that will likely never be touched again. Because the routes may
include dynamically generated Pydantic models representing the bodies of
requests, which themselves may be referring to other models that Prefect
defines, the complexity of the generated validators can balloon quickly.
In our case, something like ~50-55MB is held in references by those
routes.
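The duplication pattern can be illustrated in plain Python (a toy sketch, not FastAPI's actual implementation; the `Route`/`Router` classes here are stand-ins for `fastapi.routing.APIRoute`/`APIRouter`):

```python
class Route:
    def __init__(self, path, endpoint):
        self.path, self.endpoint = path, endpoint

class Router:
    def __init__(self):
        self.routes = []

    def add_route(self, path, endpoint):
        self.routes.append(Route(path, endpoint))

    def include_router(self, other):
        # Like FastAPI's include_router, build *new* Route objects
        # from the other router's routes instead of sharing references,
        # since extra dependencies may be injected at include time.
        for r in other.routes:
            self.routes.append(Route(r.path, r.endpoint))

# Prefect's pattern: a module-level router assembled into a larger app.
module_router = Router()
module_router.add_route("/flows", lambda: "ok")

app = Router()
app.include_router(module_router)

# Two Route objects now exist for one logical endpoint; the module-level
# router's copy stays alive for the life of the process but is never used.
assert app.routes[0] is not module_router.routes[0]
```

Dropping the module-level references after inclusion (or avoiding holding them at module scope) lets the duplicated routes, and the Pydantic validators hanging off them, be garbage-collected.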

*Note about the testing approach: I used a script that runs `prefect
server start` in a `cgroup` to constrain the memory it has access to by
setting `memory.max` and by disabling swap memory with `memory.swap=0`.
It then polls the health check for 5 consecutive successes while
confirming that the process remains running. From there it performs
a binary search over the memory constraint until it finds the minimum
that allows for stable startup and health checks.
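The binary-search portion of that script can be sketched roughly like this (a minimal sketch, not the actual measurement script; `check` stands in for the part that writes the cgroup limits and polls the health endpoint):

```python
def find_minimum_memory(check, lo_mb, hi_mb, tol_mb=1):
    """Binary search for the smallest memory limit (in MB) at which the
    server starts and stays healthy.

    `check(limit_mb)` is a hypothetical callable that would run the
    server in a cgroup with memory.max set to `limit_mb` (and swap
    disabled), poll the health check for 5 consecutive successes, and
    return True iff the process survives.
    """
    while hi_mb - lo_mb > tol_mb:
        mid = (lo_mb + hi_mb) // 2
        if check(mid):
            hi_mb = mid  # it survived; try a tighter limit
        else:
            lo_mb = mid  # OOM-killed or unhealthy; needs more memory
    return hi_mb

# Example with a stand-in check that "needs" 345 MB:
assert find_minimum_memory(lambda mb: mb >= 345, 200, 600) == 345
```

This only requires that server survival is monotonic in the memory limit, which held in practice for these measurements.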

This new minimum of 345MB is probably too low for practical use, as some
additional memory would be required to run an actual flow, receive its
logs and events, and run periodic loop services.

Checklist

  • This pull request references any related issue by including "closes <link to issue>"
    • If no issue exists and your change is not a small fix, please create an issue first.
  • If this pull request adds new functionality, it includes unit tests that cover the changes
  • If this pull request removes docs files, it includes redirect settings in mint.json.
  • If this pull request adds functions or classes, it includes helpful docstrings.

@zzstoatzz
Collaborator

@chrisguidry sweet! i think CI is failing bc of the active docker incident


codspeed-hq bot commented Jan 8, 2025

CodSpeed Performance Report

Merging #16644 will not alter performance

Comparing fastapi-memory-optimization (13312c3) with main (d58ecba)

Summary

✅ 3 untouched benchmarks

Member

@cicdw cicdw left a comment


wow surprisingly simple to understand -- thank you @chrisguidry this is great!

Collaborator

@zzstoatzz zzstoatzz left a comment


🌟

@chrisguidry chrisguidry merged commit 0fc38cf into main Jan 8, 2025
45 checks passed
@chrisguidry chrisguidry deleted the fastapi-memory-optimization branch January 8, 2025 18:31
chrisguidry added a commit that referenced this pull request Jan 8, 2025
In #16644, we were able to get the standalone Prefect server we run in
`prefect server start` from 400MB to 343MB.  By running `uvicorn`
directly in the parent process, we further reduce that to *266MB*.  I
measured this using the same process as before, using a `cgroup` with
no swap to constrain the memory until I found the minimum.

There are two main changes here:

1. Making `prefect server start` a synchronous command in both the
   background and the foreground cases.  This helps to simplify the
   handling of signals and removes the need for `anyio` or piping the
   streams of a child process manually.

2. Splitting the paths between background and foreground servers.  In
   the case of the background server, we run the subprocess as usual and
   then exit leaving it running.  In the case of the foreground server,
   we run it with `uvicorn.run` as the final call, letting `uvicorn`
   handle signals.  `uvicorn` already handles `SIGINT` and `SIGTERM` to
   shut down cleanly, so we don't need to do anything additional here.
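The split in (2) can be sketched roughly like this (a minimal sketch, not Prefect's actual code; the `cmd` argument and the `"app:app"` target are hypothetical placeholders for the real server entrypoint):

```python
import subprocess
import sys

def start_server(background: bool, cmd=None):
    """Split the background and foreground paths for serving the app."""
    cmd = cmd or [sys.executable, "-m", "uvicorn", "app:app"]
    if background:
        # Background: spawn the child and return immediately,
        # leaving the server running after this process exits.
        return subprocess.Popen(cmd)
    # Foreground: run uvicorn directly in *this* process as the final
    # call. uvicorn installs its own SIGINT/SIGTERM handlers and shuts
    # down cleanly, so no extra signal plumbing (or anyio, or manual
    # piping of a child's streams) is needed.
    import uvicorn  # imported lazily so the background path works without it
    uvicorn.run("app:app", host="127.0.0.1", port=4200)
```

Running `uvicorn` in the parent process avoids paying for a second Python interpreter, which is where the further drop to 266MB comes from.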

I'm removing the two tests that were testing what happens when the
signals are sent in escalating succession, as they aren't materially
different now.  In both cases, `uvicorn` will attempt to shut down
cleanly.  I also updated the tests to reflect that `SIGINT` will lead to
a successful exit, while `SIGTERM` will lead to a -15 exit (pretty
standard).