Stateless plugins #8112
this is fantastic. 🚢
if validated_data["is_stateless"] and len(validated_data["config_schema"]) > 0:
    raise ValidationError("Stateless plugins cannot have a config!")
smart
} else {
    pluginConfig.vm = new LazyPluginVM()
    pluginVMLoadPromises.push(loadPlugin(server, pluginConfig))

    if (prevConfig) {
        void teardownPlugins(server, prevConfig)
    }

    if (plugin?.is_stateless) {
        statelessVms[plugin.id] = pluginConfig.vm
so clean. I love it
Changes
This reasonably small PR should bring us massive performance improvements on Cloud.
It introduces stateless plugins, which are plugins for which we can reuse VMs across teams as they do not rely on team-specific context.
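To make the idea concrete, here is a minimal sketch of the reuse pattern. It is not the plugin server's code (which is TypeScript): `Plugin`, `FakeVM`, and `assign_vms` are hypothetical names used only for illustration, and the only detail taken from this PR is that stateless VMs are cached by plugin id.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Plugin:
    # Hypothetical, simplified plugin record.
    id: int
    is_stateless: bool


class FakeVM:
    """Hypothetical stand-in for a plugin VM; counts how many get created."""

    created = 0

    def __init__(self, plugin_id: int) -> None:
        FakeVM.created += 1
        self.plugin_id = plugin_id


def assign_vms(configs: List[Tuple[int, Plugin]]) -> Dict[Tuple[int, int], FakeVM]:
    """Map each (team_id, plugin_id) pair to a VM, sharing VMs for stateless plugins."""
    stateless_vms: Dict[int, FakeVM] = {}  # one shared VM per stateless plugin id
    assigned: Dict[Tuple[int, int], FakeVM] = {}
    for team_id, plugin in configs:
        if plugin.is_stateless:
            vm = stateless_vms.get(plugin.id)
            if vm is None:
                vm = FakeVM(plugin.id)
                stateless_vms[plugin.id] = vm
        else:
            # Stateful plugins still get one VM per team.
            vm = FakeVM(plugin.id)
        assigned[(team_id, plugin.id)] = vm
    return assigned


geoip = Plugin(id=1, is_stateless=True)
custom = Plugin(id=2, is_stateless=False)
configs = [(team, geoip) for team in range(1, 4)] + [(team, custom) for team in range(1, 4)]
vms = assign_vms(configs)
print(FakeVM.created)  # 4: one shared VM for the stateless plugin, three for the stateful one
```

With three teams, the stateless plugin costs one VM in total while the stateful one still costs one per team.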
Currently, we spin up 3091 VMs for GeoIp on each plugin server thread. That number is about to drop to 1.
Following Marius' calculation here, a simple VM takes up around 200 KB of memory, with more complicated plugins taking up more. Using that lower bound, we should save about 620 MB (`3091 * 200 - 200` KB) of memory per thread per plugin server. That gives us savings of at least about 2.5 GB of RAM per plugins task on ECS (each task runs with 4 CPUs, thus 4 threads). Not to mention that this should really speed up our time to ingestion on reboots.
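The estimate can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope check using the figures quoted above; the constant names are made up for the example, and decimal units (1 MB = 1000 KB) are an assumption.

```python
VM_COUNT = 3091        # GeoIP VMs per thread today (figure from this PR)
KB_PER_VM = 200        # lower-bound memory per simple VM (figure from this PR)
THREADS_PER_TASK = 4   # one thread per CPU on each ECS task

# All VMs minus the single shared VM that remains.
saved_kb_per_thread = VM_COUNT * KB_PER_VM - KB_PER_VM
saved_mb_per_thread = saved_kb_per_thread / 1000
saved_gb_per_task = saved_kb_per_thread * THREADS_PER_TASK / 1_000_000

print(f"{saved_mb_per_thread:.0f} MB per thread, {saved_gb_per_task:.2f} GB per task")
# prints: 618 MB per thread, 2.47 GB per task
```

Since 200 KB is a lower bound per VM, the real savings should be somewhat higher than these numbers.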
The harder part of stateless plugins would be enforcing or inferring statelessness automatically, but both @mariusandra and I agreed that is far from a priority right now. As such, we're using a flag defined in the plugin itself to determine this.
This flag can indeed be dangerous, since a stateless plugin has access to context from multiple teams. On Cloud we therefore need to keep a close eye on plugins that set it.
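One guard that is already part of this PR is the serializer check that rejects a stateless plugin with a non-empty config schema. As a standalone sketch, that rule might look like the following; `check_stateless_config` and the local `ValidationError` class are hypothetical stand-ins for illustration, not the serializer code itself.

```python
class ValidationError(Exception):
    """Stand-in for the framework's validation error (assumption for this sketch)."""


def check_stateless_config(validated_data: dict) -> None:
    """Reject plugins that claim to be stateless but declare a config schema."""
    schema = validated_data.get("config_schema") or []
    if validated_data.get("is_stateless") and len(schema) > 0:
        raise ValidationError("Stateless plugins cannot have a config!")


# A stateless plugin with an empty schema passes silently.
check_stateless_config({"is_stateless": True, "config_schema": []})

# A stateless plugin with a config field is rejected.
try:
    check_stateless_config({"is_stateless": True, "config_schema": [{"key": "api_token"}]})
except ValidationError as err:
    print(err)  # Stateless plugins cannot have a config!
```

This only catches per-team configuration; it does not detect a plugin that keeps team-specific state in other ways, which is why manual review of the flag still matters.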
How did you test this code?
Added a test and tried it out manually.