[BUG] Major memory leak on 1.2.4/1.2.5 #8307
Comments
Right now I'm going back to 1.2.3 to check whether the problem is with my system or with the chia process. I will see in a few hours.
OK, glad to see the problem is not just mine.
I'm seeing this leak too on 1.2.4 and 1.2.5!
I've been having this memory leak issue on 1.1.7 for the past week too. I'm on Ubuntu 20.04. I tried upgrading some of my full nodes to 1.2.5, but I'm still getting these memory leaks and eventual crashes.
Can confirm this memory leak on Ubuntu on all versions I have used, 1.2.3-1.2.5.
Has it been reported before?
We are actively researching this.
Uhm, what is the ETA?
I would imagine an issue like this is a priority, so whenever they can get a hotfix out.
I would also like to imagine that this would be the case. However, if no ETA is provided (for any issue, not just this one), then it is just BS to get people focused on something else. Sorry, but how these issues are being handled is not how a software company runs.
You're welcome to help fix the issue yourself by submitting a pull request to this repo, but otherwise you'll likely have to wait until the next patch version. They have said that they found what the problem is and are actively working on fixing it. If that's not good enough, I don't know what is. Why would they not solve a problem with their blockchain just to keep people "focused on something else" when that would be detrimental to the network? Have patience...
That this memory leak has existed for a few versions is not reassuring. But yes, let's give them time to find a solution and patch it :)
Works both ways. Should I say that I am waiting for you to join the pull request efforts and will work with you on that? What would be the point? We are both customers of the Chia company. We purchased drives and plotters, and do everything possible to help the ecosystem. There is no need to point fingers at people to do stuff when the company is not doing its part. As @denisprovost stated, "That this memory leak has existed for a few versions is not reassuring." Where was the QA that let 1.2.4 go out the door? Where was the QA on the rushed, broken 1.2.5 release? Why are we treated like alpha testers? So, are you really thinking that 1.2.6 will be "the working one"? Again, if all that you said is true, then what is the problem with saying "we will have it hopefully by Monday (or whatever makes sense)"? That is how an ETA works: you provide some timeline so people can calm down and schedule their time. I had other issues where the guy was just muddying the water to get by with other people's reports. Again, providing an ETA is not a big deal, it doesn't compromise anything, it just lets people better manage their time. Nothing more than that. Otherwise, it just hurts the ecosystem that we all try to support.
I am on Windows, and I don't think the problem exists on Windows. I run a full node 24/7 and don't see any crashes. Although, I am still on 1.2.3 (I saw the problems other people had with 1.2.4, then the rushed 1.2.5 that was as good as 1.2.4, and decided to wait). It looks like the problem is more related to your setup (libs) than to the OS or Chia version. I would guess there would be more people in this thread if the issue affected Ubuntu with a particular version, or some specific Chia version(s). Is it possible that all of you ran some recent OS/library updates (and got, for instance, new Python libs) that are causing these issues on all Chia releases?
That's a fairly classic answer; you can imagine that before posting I checked, ran tests, etc., and I am reporting the results. I sincerely hope that the Chia team will not look at the problem from this angle. I am not the only one with problems like this. It may be my system, but it may not be; a stable system does not drift on its own. Announced as fixed, but no ;)
This memory leak has happened to everyone I know who uses Ubuntu.
I'll add my +1, it's happening to me as well.
Don't take me wrong, I am on your side. I am not trying to say that you didn't check/test, or to dismiss your results. The issue is real, but unless your setup is in full debug mode, it is really hard to enable, read, and interpret the relevant debug logs. Again, you are the main person pushing this issue, so it was really not my intention to imply that I doubt you or want to dismiss it. That is basically the main reason I asked for the ETA (they can get QA to immediately run regression tests on different platforms and get the engineer in charge to focus on the most promising one - basically a few hours, and they should know the offending part and be able to provide projected milestones). One component that is actually not under our control is the UI part, which potentially runs updated scripts/libs every other day or so (a common practice with JS code, which Electron uses). Although, if that were the case, it would potentially affect other platforms as well - maybe not, as those libs may have localized issues. (Although, I didn't reboot my full node for several weeks, so no network-resident scripts got updated for me.)
Well, how many people do you know, since when have they been affected, and can they also chime in? The more info you can provide, the easier it is to narrow the scope. @ALTracer runs Gentoo, so that means it is not strictly Debian/Ubuntu related. @erickoh is still running v1.1.7, but based on what he wrote, he started having issues just a couple of weeks ago. That would imply that the issue is newer than his Chia version, or rather independent of Chia version. This is potentially the strongest evidence pointing to some modified libs (again, either coming through some updates or Electron network scripts). Anyway, the issue is already well stated, and the Chia engineering team will have a busy weekend, so we should let it rest for a while.
We are actively testing what we believe to be a fix.
Until we confirm a fix, this is preliminary information: we believe this affects all versions on all platforms. We are currently duplicating this on testnet7 so we can verify.
We believe PR #8315 fixes this issue, for those that want (and understand how) to try the patch.
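For anyone who wants to try the patch before a release, here is a minimal sketch of pulling the PR into a from-source install. It assumes chia-blockchain was cloned with git into ~/chia-blockchain and that `origin` points at the Chia-Network repository; the local branch name `pr-8315` is just a placeholder.

```sh
# Sketch only: assumes a from-source install of chia-blockchain.
cd ~/chia-blockchain
git fetch origin pull/8315/head:pr-8315   # fetch the PR's head into a local branch
git checkout pr-8315
sh install.sh                             # reinstall into the existing virtualenv
. ./activate
chia stop all -d                          # stop services and the daemon
chia start node                           # start the full node on the patched code
```

If you installed via the GUI installer or a package instead of from source, this does not apply; you would have to wait for the next release.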
Thanks, I have not experienced this memory leak problem at all over the past 24 hours.
After running in testnet for about 10 hours now, we are pretty confident PR #8315 fixes this issue. I don't think we will rush out a release this weekend, though. Since we have stopped generating the majority of compact VDFs in mainnet, we believe this problem has been largely stopped in mainnet as well (it may depend somewhat on which peers you connect to and how many compact VDFs are getting passed around - but not many are being newly generated right now).
Thanks for your quick reaction. I have downgraded to 1.2.3 like many people, and I will wait for a release like many people too. Please release when you are sure the problem is fixed.
Downgrading to 1.2.3 offers no benefit over 1.2.5. Read a few posts up: the issue was blueboxes flooding nodes of all versions. For me, since the blueboxes were shut down, stock 1.2.5 and 1.2.5 with PR #8315 both show similar, normal memory usage, but it's hard to say whether the issue is resolved, as the cause has been removed. The testnet findings seem to wrap it up, though.
Yes, but with 1.2.3, even if the problem is present (as I indicated above), my Raspberry Pi handles it better and does not crash after 6 hours of farming. Memory usage is high (65%) but stays stable at that level, which is not the case with 1.2.4/1.2.5, which climbs to 100% and ends up crashing. A restart of my system on the Raspberry Pi means 40 minutes off-farm (which I find huge; 'SSL context Connect call failed 127.0.0.1' for 15 minutes). I can't afford to reboot too often, so I'll leave it to those with a quick reboot and resync time to validate the patch. I will stay on 1.2.3 and wait for a release in which the problem is really fixed; for the moment it is a 'test' patch whose purpose is to confirm that it fixes the problem. If you have found that it fixes the problem for you, it is on the right track, but for now, in my case, the safest option is to wait for a real release that fixes the problem, which is what the majority of people are doing. But once again, thank you to the Chia team for the work and the speed of reaction!
Now my full node is stable, but my farmer process keeps crashing. `dmesg | grep -i memory`
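For anyone else whose farmer keeps dying, a quick sketch of checking whether the kernel's OOM killer is the culprit. These are standard Linux commands; the only assumption is that the relevant process names contain "chia".

```sh
# Look for OOM-killer activity in the kernel log (human-readable timestamps)
dmesg -T | grep -iE 'out of memory|killed process'
# Or, on systemd-based distros such as Ubuntu:
journalctl -k | grep -i oom
# Snapshot current memory use of chia processes (RSS in KiB), largest first
ps -eo pid,rss,etime,comm --sort=-rss | grep -i chia
```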
It all depends on whether the memory filling up is linked to a gradual increase in RAM usage. Does this happen after several hours, or very quickly after a restart of the chia process?
Description
After chia synced and started farming, memory usage kept increasing: roughly +10% over 3 hours.
After a few hours: [screenshot of memory usage]
After stopping the chia process: [screenshot of memory usage]
OS: Ubuntu 21.04 on Pi4 (full node + harvester)
RAM: 8 GB
Chia version: 1.2.4/1.2.5
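To quantify the growth described above (+10% over 3 hours), a minimal logging loop such as the one below can be left running alongside the node. The 5-minute interval and the log path are arbitrary choices, not anything the chia tooling requires.

```sh
# Sketch: append the combined RSS (KiB) of all chia processes every 5 minutes.
while true; do
    ts=$(date '+%F %T')
    rss=$(ps -eo rss,comm | awk '/chia/ {sum += $1} END {print sum}')
    echo "$ts $rss" >> ~/chia_rss.log
    sleep 300
done
```

Plotting or just eyeballing ~/chia_rss.log afterwards makes it easy to tell whether memory grows gradually over hours or jumps right after a restart, which is the distinction asked about earlier in the thread.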