Hello,

I'm having some uncontrolled memory consumption with Thanos Store.

What happened:
At start-up there is a memory peak of ~40 GiB that later decreases over time to ~26 GiB. However, depending on the load, it climbs back to the peak value and even beyond (which means an OOMKill).
![Capture d’écran 2022-05-16 à 17 16 56](https://user-images.githubusercontent.com/6917738/168626465-63ff3f3f-83b8-4c8d-b333-d699c8d07110.png)
This happens even though I'm using Memcached precisely to avoid this situation.
I currently have more than 145k blocks in my S3 storage and more than 80 Prometheus instances (+ sidecars).

What I expected:
Given the use of a cache, I expected the memory to stay low. At least the cache does seem to be used.

Current configuration:

This is my cache configuration:

Environment:

What have I tested so far?
1 - Sharding by date using flags
I created several stores for 3-month periods using this pattern (see the illustrative sketch after this report), but all of them ended up consuming the same amount of memory.
2 - Changing the compaction level
I basically followed issue #325.

Anything else:
I don't know if this information is useful, but the number of samples varies greatly between my Prometheus instances.
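For context on the sharding test mentioned above (the exact flag pattern wasn't captured here, so this is only an illustration): Thanos Store supports time-based partitioning via `--min-time` and `--max-time`, and a 3-month sharding scheme could be generated along these lines. The start date, shard count, and trailing flags below are made-up placeholders, not the reporter's actual setup.

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Illustrative only: carve one year of data into four 3-month shards.
	// The start date and shard count are arbitrary examples.
	start := time.Date(2021, time.June, 1, 0, 0, 0, 0, time.UTC)
	const shards = 4

	for i := 0; i < shards; i++ {
		minT := start.AddDate(0, 3*i, 0)
		maxT := start.AddDate(0, 3*(i+1), 0)
		// Each shard becomes its own `thanos store` instance, restricted to a
		// 3-month window via the time-based partitioning flags.
		fmt.Printf("thanos store --min-time=%s --max-time=%s ...\n",
			minT.Format(time.RFC3339), maxT.Format(time.RFC3339))
	}
}
```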
Do you have persistent storage on your Thanos Store pods (I assume k8s here)? The RAM usage probably comes from building binary index headers. Maybe there is some opportunity here to use sync.Pool to have a constant RAM usage. With persistent storage, you wouldn't have to rebuild them each time. Could you upload a profile of the memory usage of Thanos Store just after the start of the process?
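To illustrate the `sync.Pool` idea from the comment above: this is only a minimal sketch, not Thanos code. The `buildIndexHeader` helper and the use of a `bytes.Buffer` as scratch space are hypothetical stand-ins for whatever allocations happen while building the binary index headers; the point is simply that a pool bounds the number of live scratch buffers instead of allocating a fresh one per block.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool reuses scratch buffers across index-header builds instead of
// allocating a new one per block, which keeps peak memory roughly constant.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// buildIndexHeader is a hypothetical stand-in for the per-block work done at
// start-up; the point is only that the scratch buffer is borrowed and returned.
func buildIndexHeader(blockID string) {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf) // hand the buffer back for the next block

	// ... decode the block's index into buf here ...
	fmt.Fprintf(buf, "symbols+postings for %s", blockID)
	fmt.Printf("built index header for %s (%d bytes of scratch)\n", blockID, buf.Len())
}

func main() {
	for _, id := range []string{"01ABC", "01DEF", "01GHI"} {
		buildIndexHeader(id)
	}
}
```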
We are actually using Docker Swarm with persistent volumes. I do not think there is a problem at this level, because we have hundreds of services (not related to Prometheus + Thanos) running without any issue.
As for the memory, here is the usage of the store using the bucket with 145k blocks after a reboot,
and here is the similar behaviour of one of the stores sharded to a three-month window (using min-time and max-time) after a reboot.
Note:
This might be unrelated, but Thanos Compact is down because there are some block overlaps.
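Regarding the memory profile requested above: assuming the Store's HTTP port is the default 10902 and its pprof endpoints are reachable (both are assumptions about this particular setup), a profile taken just after start-up could be captured with something like the sketch below and then inspected with `go tool pprof store-heap.pprof`.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"os"
)

func main() {
	// Address of the Thanos Store HTTP endpoint; adjust to the real host/port.
	url := "http://localhost:10902/debug/pprof/heap"

	resp, err := http.Get(url)
	if err != nil {
		fmt.Fprintln(os.Stderr, "fetching heap profile:", err)
		os.Exit(1)
	}
	defer resp.Body.Close()

	out, err := os.Create("store-heap.pprof")
	if err != nil {
		fmt.Fprintln(os.Stderr, "creating output file:", err)
		os.Exit(1)
	}
	defer out.Close()

	// Save the raw profile so it can be analyzed offline with:
	//   go tool pprof store-heap.pprof
	if _, err := io.Copy(out, resp.Body); err != nil {
		fmt.Fprintln(os.Stderr, "writing profile:", err)
		os.Exit(1)
	}
	fmt.Println("heap profile written to store-heap.pprof")
}
```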
Hello 👋 Looks like there was no activity on this issue for the last two months. Do you mind updating us on the status? Is this still reproducible or needed? If yes, just comment on this PR or push a commit. Thanks! 🤗
If there is no activity in the next two weeks, this issue will be closed (we can always reopen an issue if we need!). Alternatively, use the remind command if you wish to be reminded at some point in the future.