Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fetch() BlobModule.java memory leak fix not working if enable hermes js engine #164

Closed
jgreen210 opened this issue Dec 16, 2019 · 18 comments

Comments

@jgreen210
Copy link

jgreen210 commented Dec 16, 2019

We've been seeing memory leaks from the mBlobs HashMap in BlobModule.java for some time (with JavaScriptCore). This is one of the tickets for this:

facebook/react-native#20352

We were expecting this to be fixed when we upgraded react native, since 0.61.3 added the following fix:

facebook/react-native@766f470

I.e. the js blob HostObject destructor is supposed to call remove() in BlobModule.java.

We found that if we enabled the hermes engine, we were still getting java-heap out of memory errors.

Setting breakpoints in BlobModule.java in android studio shows that store() is called for both JavaScriptCore and hermes:

https://github.com/facebook/react-native/blob/v0.61.5/ReactAndroid/src/main/java/com/facebook/react/modules/blob/BlobModule.java#L172

...but that remove() is only called if use JavaScriptCore:

https://github.com/facebook/react-native/blob/v0.61.5/ReactAndroid/src/main/java/com/facebook/react/modules/blob/BlobModule.java#L184

I.e. the memory leak is fixed, but only if use JavaScriptCore.

react native 0.61.5

@tmikov
Copy link
Contributor

tmikov commented Dec 17, 2019

As I understand it, the "blob collector" object is attached as a property to the blob, so when the blob is garbage collected, the "blob collector" will also be destructed and it will remove the blob. I see no reason why it shouldn't work in Hermes. Can you provide more details about the use case? Where are the blobs stored in JS, for example?

We recently fixed a memory leak in WeakMap in d09005e. It could theoretically be related.

@tmikov tmikov added the need more info Awating additional info before proceeding label Dec 17, 2019
@jgreen210
Copy link
Author

The use case is any fetch() calls that make use of "blobs". That includes code that calls json() on the fetch() Response. I.e. most react-native apps will trigger this memory leak.

@jgreen210
Copy link
Author

jgreen210 commented Dec 19, 2019

The js blobs would typically just be passed between the Promises implementing the fetch() and json() parsing.

Since I could see BlobModule.java's remove() getting called if I set enableHermes to false, it looks like the problem must be in something that is called iff enableHermes is set to true - i.e. hermes and/or a perhaps a few small parts of react native.

@jgreen210
Copy link
Author

jgreen210 commented Dec 19, 2019

I don't think (JS)WeakMap is going to be relevant here - the js blobs are handles for java-owned network content. They need to be strongly-reachable from the fetch() Response.

@tmikov
Copy link
Contributor

tmikov commented Dec 20, 2019

Does this happen both in development mode (when running from source) and production mode (when running from optimized bytecode)?

jgreen210 added a commit to jgreen210/ReactNativeFetchBlobLeak that referenced this issue Dec 20, 2019
jgreen210 added a commit to jgreen210/ReactNativeFetchBlobLeak that referenced this issue Dec 20, 2019
jgreen210 added a commit to jgreen210/ReactNativeFetchBlobLeak that referenced this issue Dec 20, 2019
@jgreen210
Copy link
Author

jgreen210 commented Dec 20, 2019

I made a small app that crashes with OutOfMemoryError in debug and release build modes after a couple of hundred MiB if use the hermes engine. For debug, with Android Studio profiler, you can see the java heap growing linearly prior to the crash.

It doesn't crash if use JavaScriptCore (I only tested debug in that case, but assume release would work too).

https://github.com/jgreen210/ReactNativeFetchBlobLeak/tree/1645c42de29ffb8fd05e7dd1f7fae74a7e7125a1

@jgreen210
Copy link
Author

Apart from Api.js, the app is just react-native init boilerplate.

@jgreen210 jgreen210 changed the title fetch() BlobModule.java memory leak fix not working if enable hermes fetch() BlobModule.java memory leak fix not working if enable hermes js engine Jan 8, 2020
@jgreen210
Copy link
Author

I think the "need more info" label can be removed now.

@jgreen210
Copy link
Author

jgreen210 commented Feb 26, 2020

I upgraded the repo that demonstrates this bug to react native 0.62.0-rc.3 (hermes 0.4.0) and it still crashes with a java OutOfMemoryError for both debug and release builds.

If I switch the app from hermes to JavaScriptCore, it doesn't crash.

https://github.com/jgreen210/ReactNativeFetchBlobLeak
c86e6b629b96ebb0aa22c92421257dc53cbc9530

@willholen willholen removed the need more info Awating additional info before proceeding label Feb 27, 2020
@willholen
Copy link
Contributor

I am indeed able to reproduce this, including on today's masters (RN b8715dc, Hermes 6d67507), for both Debug and Release.

(Internal test URL for a 1MB download without rate limiting: https://dewey.vip.facebook.com/blob/34aa973cd4c4daa4f61eeb2bdbad27316534016f)

@Kudo
Copy link
Contributor

Kudo commented Mar 5, 2020

I had some findings and posted at Kudo/react-native-v8#36.
I would not say it's a JS VM's problem.
Applications should take responsibilities to free unused resources as early as possible.

@jgreen210
Copy link
Author

It leaks if just use fetch() and json() - that's all our production app does, and I confirmed that it leaks too in the demo app when I first wrote it. It would be possible to fix this in application code today, with some code that extracts the Blob then calls close() on it. Most people won't know they need to do this, they'll just do fetch() then json().

Looking at node_modules/whatwg-fetch/fetch.js 3.0.0, text() would be affected too - json() is implemented on top of that.

With @Kudo's Kudo/react-native@86ac876 promise-chain fix added to react native, it would be possible to set the auto-close mode when use fetch(), but only in application code, not in library code (unless enable it by default, which would cause the problems mentioned in that commit). Can't patch every library that uses fetch - there are obviously too many, and most aren't react-native specific. Would have to audit them all for the potential bugs this can cause.

I think that means that while it would be good to use React Native's Blob close() more, it can't be the only fix here. I.e. I think do need to do more js garbage collections, e.g. by:

@Kudo
Copy link
Contributor

Kudo commented Mar 5, 2020

@jgreen210 AFAIK, fetch will only create blob if the response is blob like.
https://github.com/github/fetch/blob/master/fetch.js#L217

As usual plain text or application/json Content-Type, it seems not to go for the Blob case.
Could you show some cases for fetch requests without explicitly call blob(), but with underlying blob created?

@jgreen210
Copy link
Author

@Kudo, I'll have to look at this properly next week now. I originally made the demo app use just json(), but changed to blob() so I could show the number of bytes downloaded. I should change it back, or allow either.

I've seen com.facebook.react.modules.blob.FileReaderModule.readAsText in the java OutOfMemoryErrors we get in our real app, so something we do is using Blobs. I don't have a java backtrace from the example app to hand, but see e.g. what I say in this issue's description about breakpoints - something is using BlobModule.java in at least debug builds. Seeing this crash in release builds means can rule out the bundler connection causing a Blob leak.

Maybe @willholen can share what API is being tested and what's in the backtrace in the facebook test case?

@willholen
Copy link
Contributor

As far as I can tell, the problem is unrelated to the blobs, they just happen to use enough space to make the problem visible. The underlying issue is that the XMLHttpRequests holding on to them are not being collected. I identified a set of objects in a heap snapshot whose only path back to a GC root appears to be as both key and ultimately value in a WeakMap (listenerMap in event-target-shim). If true, they're definitely collectable.

My current theory is that it's a bug in the GC itself (or maybe in heap snapshot tooling), so I've asked the resident GC experts to have a look.

PS: Thanks for the repro @jgreen210, it's been super helpful!

@willholen
Copy link
Contributor

As it turns out, this was an issue fixed in 84d61df in December, but after some suspected crashes, the fix was rolled back and then gated. v0.4.0 happened to be made during this window.

To complicate the investigation, the issue is fixed on master but this was not immediately obvious because: 1. Hermes does not track the Blob's native memory usage, and therefore doesn't GC in time to prevent a OOM, 2. Even if manually triggering GCs, the WeakMap releases values as part of its internal cleanup and not immediately on GC, so the values are held for longer than expected.

The next Hermes release will not have this issue, and we're looking into fixes for the issues giving false impressions of leaks as well. We're hoping to have a v0.4.1 for easy upgrades with the imminent react-native 0.62.0

@tmikov
Copy link
Contributor

tmikov commented Mar 31, 2020

0.4.1 was just released, closing.

@tmikov tmikov closed this as completed Mar 31, 2020
@willholen
Copy link
Contributor

Note that a blob from the JS side is very small, because the actual binary data is stored independently by Java. This means that an app like the repro app that only fetch()es and runs no significant JS can end up allocating 0.1% of the JS heap, but also 100% of the Java heap, causing a crash because Hermes never feels that 0.1% usage warrants a GC.

This is not a problem in a more typical app, which downloads data and processes it in some way, because processing of data will cause regular collections.

In the aforementioned very specialized cases, you can trigger a GC manually with global.gc(). We're actively adding additional indicators for the GC so that this won't be an issue in the future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants