feat: Set redis maxmemory and memory policy for docker compose #9573
Conversation
Looks good overall, but I'd put the limit higher than that.
We cache things like insight results, which take a lot more memory.
Let's maybe do 200mb.
This might be fine, but it's worth thinking about whether we are using Redis for anything other than caching, where we don't need a guarantee that keys will stick around. Looking at e.g. the Snowflake exporter, it seemed like we were using it for things where we cared about durability, but I could be wrong. If that's the case, it would make sense to have something that is explicitly a cache, and another store that has the durability requirement. Is there anywhere we expect durability @tiina303 @yakkomajuri @guidoiaquinti?
🎯 We should expect the data in Redis to be ephemeral. If we want data persistence, we should use another datastore.
And we already have Postgres, so it's not like it would be hard to use another, non-ephemeral data source. That said, we'll look into what we're currently putting into Redis and the assumptions around that.
Yeah, Redis data should be ephemeral. In the Snowflake case, the usability of the Redis APIs makes up for the fact that we might lose a batch here or there if Redis goes down. In any case, the user will have the files in S3 and can manually import them. It's a tradeoff.
I would not trade correct functionality for ease of use and our future selves or others scratching their heads in #community-support; we should be kind to the support hero 🥰 But I don't have context on how important Snowflake is and what the failure cases are, so I will defer to others to make the call.
Looked at the usage:
I couldn't see the other functions being used in code, unless we do some trickery with the way we call these functions; e.g. for redisLLen (https://github.com/search?q=repo%3Aposthog%2Fposthog+redisLLen&type=code) only the definition shows up atm.
Based on plugins currently used on cloud https://metabase.posthog.net/question/310-plugins-currently-enabled-on-cloud
Ah sorry, didn't write the rest on 2 🤦 So from a practical perspective, we have a few options:
1. Easiest: Start with the initial suggestion of […]. Problem: our Cloud cluster currently operates on […].
2. Better: Fix the relevant stuff out of what you've mentioned. I think this is a good exercise for us in infra upkeep. It will pay dividends in the long run to do these refactors. It's also a great opportunity for you to work with plugins (I love that you've been reading plugin code to compile this!).
Thus, let's probably go with 2. Here's my opinion on what needs fixing, in some order of priority:
1 and 2 are very important. Everything else is a lot less important. Happy to discuss why on individual cases if you like. But e.g. the Postgres import is not live anywhere and doesn't work, Hello World doesn't matter, the Redshift import is in Alpha, etc.
As for docs, we can make things clearer, but in general one should expect a cache to be ephemeral, although some of us (myself included!) have cut corners and relied a bit too heavily on the cache.
Actually this won't help us, because most of the usages here do set expiry times, which means those keys can still be evicted early; only keys without an expiry would be safe from eviction. So I'd rather just fix things to work the way a cache should than remove the expiry there and adjust to that assumption. So agreed, let's go with approach 2.
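A minimal sketch of the eviction behaviour in question, assuming the stock redis-server command-line flags; the service layout below is illustrative, and only the 100mb limit and allkeys-lru policy come from this PR:

```yaml
services:
  redis:
    # allkeys-lru: once maxmemory is hit, any key can be evicted, TTL or not.
    command: redis-server --maxmemory 100mb --maxmemory-policy allkeys-lru
    # A volatile-* policy (e.g. volatile-lru) would instead evict only keys
    # that carry an expiry; keys without a TTL would survive, but writes
    # start erroring once memory is exhausted by such keys.
```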
How did we verify this?
This seems sane for local + hobby.
Looked into mmdb again: we actually use Postgres only at the end of the download, so we can't use that for lookups. So there isn't an easy way to delay downloading if the key gets evicted before its 120s expiry. But this should be fine.
Problem
We're starting to add a lot more data to Redis for persons on events.
We need to make sure we don't run out of memory. Discussion here
Picked 100M based on https://redis.io/docs/getting-started/faq/#whats-the-redis-memory-footprint
Changes
Set the Redis max memory in all of our docker compose files to 100M and the eviction policy to allkeys-lru.
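For illustration, the change amounts to something like the following in each compose file. This is a sketch only: the service name, image tag, and port mapping are assumptions rather than copies of the actual diff.

```yaml
services:
  redis:
    image: redis:6-alpine
    restart: on-failure
    # Cap Redis at 100MB and evict least-recently-used keys (with or
    # without a TTL) once the limit is reached.
    command: redis-server --maxmemory 100mb --maxmemory-policy allkeys-lru
    ports:
      - '6379:6379'
```

The applied settings can be checked from inside the container with `redis-cli CONFIG GET maxmemory` and `redis-cli CONFIG GET maxmemory-policy`.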
How did you test this code?
Before:
After: