ImageSharp.Formats.Jpg.DecodedBlock ArrayPool is using up WAY too much memory #151
Comments
@DeCarabas Thank you for your detailed report! @JimBobSquarePants based on these issues it seems obvious to me that we need to make our pooling configurable. My proposal:
Totally agree with @antonfirsov here. Let's let our users limit the max pool size to use... the difficulty might be around the
I'd take something as simple as the ability to reach in and tweak the individual pool settings. An overall memory usage policy sounds appealing but I think would be (a) hard to implement and (b) hard for me to really tune properly. Whereas I have a bunch of heap dumps and statistics around pool sizes that I can reason about for various pools...
Individual pools could be customized by implementing custom

@DeCarabas why do you feel a byte-measured limit is hard to tune?

@tocsoft I would be much happier to operate on "typeless" generic
ah well that's unfortunate 😞
We can certainly make our pool configurable, but we have to look at a use-case issue here also. Using 1GB of RAM for any serious image processing is simply a no-go. You'll run out of contiguous memory in the LOH pretty quickly if you work with multiple images. We're pooling, yes, but the pool is smart and will not grossly over-deliver on array requests. Making the pool more configurable will still lead to very similar memory consumption, as it will end up allocating a new array anyway. I would like to double-check our codebase though to ensure we're not missing something in Jpeg. I'm sure we're good, but maybe we can make it leaner than it is.
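For context on the LOH point above, here is a minimal sketch (independent of ImageSharp) of the 85,000-byte threshold at which arrays land on the Large Object Heap:

```csharp
// Minimal illustration: arrays of 85,000 bytes or more are allocated on the
// Large Object Heap, which is reported as generation 2 and is not compacted
// by default -- churning big buffers fragments it.
using System;

class LohDemo
{
    static void Main()
    {
        byte[] small = new byte[84000];
        byte[] large = new byte[85000];

        Console.WriteLine(GC.GetGeneration(small)); // 0 -> small object heap
        Console.WriteLine(GC.GetGeneration(large)); // 2 -> large object heap
    }
}
```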
@antonfirsov I appreciate the desire to hide the pools from the caller, but then you're left with several unappealing options:

1. A single overall memory-usage policy that decides how to trade the pools off against each other.
2. A global limit divided up "statically" among the individual pools.
3. A per-pool cap.
The problem with option 1 is that picking a good policy is hard -- how do you trade DecodedBlock off vs. Pixel? What if I consume the entire pool with DecodedBlock at the beginning of my program and then never use it again, since I spend the rest of the time doing image manipulation on uncompressed pixels?

The problem with option 2 is similar -- if you try to protect the pool for pixels by dividing the limit up "statically", then you waste memory by reserving it even when you don't know the pool will ever be used. Why is the array pool for pixels smaller even if I never end up decoding JPEGs?

The problem with option 3 is that you haven't really hidden the number of pools from me -- you've just made it harder for me to figure out how I should set the limit, since the real limit is now the per-pool cap times the number of pools. I can do this; I have the memory statistics, I've dumped the heap, I know where to tune, but you haven't hidden anything from me, really, just made it harder to figure out.
@JimBobSquarePants Yeah, I'm not entirely sure what I was trying to load that allocated 70MB of DecodedBlock; I probably need to do something to protect myself from that anyway. But the pool is not that smart -- it doesn't have any kind of idle shrinking capability, it doesn't shrink on memory pressure, and it has this annoying power-of-two thing that can end up wildly over-allocating in order to hit the bucket. The power-of-two thing is annoying because I was not necessarily using that 300MB concurrently -- if I needed an array of 131,073 DecodedBlocks, I would have ended up allocating 262,144 of them. In the dump I'm looking at now, I have one such array at that size, for 70MB, and three at 131,072, at 35MB apiece. I only needed to be asked for an array of 65,537 blocks to allocate one of those... at those sizes I would much rather stop using the array pool and ask the allocator for an exact fit instead.
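For illustration, a minimal sketch of the power-of-two bucketing described above, using a standalone ArrayPool and sizes that mirror the numbers in the dump:

```csharp
// ArrayPool<T> serves requests from power-of-two buckets, so a request for
// 131,073 elements comes back as a 262,144-element array -- nearly double.
using System;
using System.Buffers;

class BucketDemo
{
    static void Main()
    {
        ArrayPool<byte> pool = ArrayPool<byte>.Create(
            maxArrayLength: 1024 * 1024, maxArraysPerBucket: 4);

        byte[] rented = pool.Rent(131073);
        Console.WriteLine(rented.Length); // 262144

        pool.Return(rented);
    }
}
```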
@DeCarabas our main difficulties:
We can only solve these by thinking in a general way. What could be a fair compromise for me: we can introduce

Also, totally agree with @JimBobSquarePants: a fragmented LOH could also lead to out-of-memory errors.
@antonfirsov Totally understand on both points; I expect tuning parameters to be something weird and fiddly and likely to break at a moment's notice. This is always the difficulty with adding caches to libraries like this -- you can't meaningfully add a cache without being in complete control of the resources you're using for your cache, and your library is not in control of the memory in my app. Alas. Another couple of options:

1. Provide a way to disable pooling entirely, so allocations just fall through to the GC.
2. Stop pooling above a size threshold, and hand out exact-fit arrays for large requests instead.
Either of these would fix my problems without exposing the internals. (I'd personally take the hit of just relying on the GC more heavily over accidentally permanently consuming large blocks of memory.)
In that case, if/when we add the
I think we would provide an easy way to disable pooling by setting
If we are going to create an ArrayPoolProvider (sounds like a good idea), we should offer two implementations: one that reuses the pool (our default) and one that just creates a new one all the time. Otherwise we need null checks everywhere we use pools.
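A hypothetical sketch of that two-implementation idea -- the type and member names here are illustrative, not an actual ImageSharp API:

```csharp
using System.Buffers;

// Hypothetical provider abstraction so call sites never need null checks.
public abstract class ArrayPoolProvider
{
    public abstract ArrayPool<T> GetPool<T>();
}

// Default: hand out a shared, reusing pool.
public sealed class PooledProvider : ArrayPoolProvider
{
    public override ArrayPool<T> GetPool<T>() => ArrayPool<T>.Shared;
}

// Opt-out: a pass-through "pool" that always allocates and never retains.
public sealed class NonPoolingProvider : ArrayPoolProvider
{
    public override ArrayPool<T> GetPool<T>() => new NullPool<T>();

    private sealed class NullPool<T> : ArrayPool<T>
    {
        public override T[] Rent(int minimumLength) => new T[minimumLength];

        public override void Return(T[] array, bool clearArray = false)
        {
            // Intentionally empty: the array is simply left to the GC.
        }
    }
}
```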
Same problem here. Using a 13MB JPEG image resulted in 2.4GB of private memory allocation. The second upload was even higher. This brings any server down.
Hmm... maybe this isn't just normal ArrayPool behaviour; it could be an actual leak. @BrianThomson I'm planning to reproduce your use case within a console application. @dlemstra re ArrayPoolProvider:
The issue could be here. We're copying the block since it's a struct, and then not returning the block back to the pool by disposing of it.
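A small illustration of that hazard, with hypothetical stand-in types rather than the actual DecodedBlockArray code:

```csharp
using System;
using System.Buffers;

// Hypothetical stand-in for a struct that wraps a pooled array.
struct PooledBlocks : IDisposable
{
    private readonly float[] rented;

    public PooledBlocks(int length)
    {
        this.rented = ArrayPool<float>.Shared.Rent(length);
    }

    public void Dispose() => ArrayPool<float>.Shared.Return(this.rented);
}

class Demo
{
    static void Main()
    {
        var original = new PooledBlocks(64);
        PooledBlocks copy = original; // struct assignment copies the wrapper

        // Both values point at the same rented array. If the code path holding
        // 'original' never calls Dispose(), the array is never returned to the
        // pool and simply gets collected by the GC, defeating the pooling.
        copy.Dispose(); // exactly one of the two wrappers should return it
    }
}
```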
@JimBobSquarePants Also, not returning to the pool will result in GC-ing the array, and the issue is with the arrays that are actually kept in the pool. We need to benchmark + memory-profile the ArrayPool behaviour with big images + different image sizes to investigate this properly.
@BrianThomson your problem seems to be pretty different. Actually, it's the opposite of @DeCarabas's one: try executing this after you finish your request:
ImageSharp should not eat up 2.4GB decoding a single Jpeg, or at least I can't reproduce that case. With a 29MB Jpeg I was able to keep the memory consumption under 1.5GB using

If your experience is different, a sample image with simple reproduction steps is really welcome! Also keep in mind that in our current Jpeg solution we have chosen an optimization which tries to reduce CPU load at the price of a relatively large memory footprint. I'm in favor of this approach because memory seems to be much easier to scale than CPU.
I deleted the old test set, but created a new one. This time I use a 15MB JPEG image (600dpi). The results are a bit different, but memory consumption is too high for our web-based use cases. In this test, I only load the file, no processing.

- CommandLine: dotnet TestApp.dll
- Runtime Version: V 4.0.22220.0
- CLR Startup Flags: None
- Total CPU Time: 111.649 msec
- Total GC CPU Time: 208 msec
- Total Allocs: 2.098,568 MB
- GC CPU MSec/MB Alloc: 0,099 MSec/MB
- Total GC Pause: 10,8 msec
- % Time paused for Garbage Collection: 3,1%
- % CPU Time spent Garbage Collecting: 0,2%
- Max GC Heap Size: 1.938,932 MB
- Peak Process Working Set: 1.785,287 MB
- Peak Virtual Memory Usage: 3.728,183 MB

This is from the Visual Studio Diagnostics Tool:
@BrianJThomson can you share your test app (or a simplified version of it) together with the test image(s) you are using? It would be very helpful!

@JimBobSquarePants This
Below is the code. It's a vanilla netcoreapp1.1 console application:

```csharp
using ImageSharp;

class Program
{
    static void Main(string[] args)
    {
        using (Image image = Image.Load(@"test_15mb.jpg"))
        {
            // image.Resize(image.Width / 2, image.Height / 2);
        }
    }
}
```

Image: https://github.com/BrianJThomson/ImageSharp/raw/jpeg/test_15mb.jpg
I can try to fix this by introducing a switch

Making a fix that handles both corner cases (very large Jpegs + erroneous progressive Jpegs) seems to be really hard, and needs an unpredictable amount of time. I won't be able to do it in the next 6-8 weeks. @JimBobSquarePants @BrianJThomson @DeCarabas thoughts?
@antonfirsov, @BrianJThomson @DeCarabas This is a worrying predicament and one of many reasons why I hate the way Jpegs have been dealt with in the past. In a perfect world there would be no history of broken images and we wouldn't have to hack our code to support them. A switch, I think, would be a sticking plaster at best, and I don't think that should be the target outcome. I appreciate the difficulty here though, and the work you've done so far @antonfirsov is extraordinary. Let's try to spread awareness of this issue before anyone makes a start on anything. If we are lucky we might catch the eye of someone who has had to deal with this before. It might even just need a fresh pair of eyes. I'll share it on Twitter now.
@JimBobSquarePants I'm taking back my previous comments, because I got lost in the details of the current implementation. We have to take one step back, because the problem is more general and leads to hard CPU-vs-memory design questions. A few numbers on
Here is the hard problem:
Personally I think memory is a cheaper and more scalable resource than CPU, so in the general case it is better to optimize for reduced CPU load, even at the price of a larger memory footprint. Maybe we should find a general way to provide alternative implementations and guidelines for users who have memory as their bottleneck.
I agree CPU is probably the way to go, but I feel we need much better control over our ArrayPool usage, and we should try to limit, as much as possible, the types we allow to be requested from pools. We should limit our use of array pools to
@antonfirsov Ever thought about using Memory Mapped Files for large files (temp file), in addition to ArrayPool for smaller files? With MMF you can limit memory usage without giving up the other goals (CPU). It's easy to implement and really fast since it's just a wrapper. Windows itself uses it to load DLLs and resources, as you can see in a memory viewer. Maybe implement some kind of a simple MemoryController allocating memory from ArrayPool for smaller files and MMF views for larger files. It's available for netstandard1.3. https://msdn.microsoft.com/library/dd997372(v=vs.110).aspx
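A minimal sketch of the suggestion, assuming an illustrative buffer size and a temp-file backing (not ImageSharp code):

```csharp
// Back a large buffer with a memory-mapped temp file: the OS pages bytes in
// and out on demand, so working-set stays bounded even for huge buffers.
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

class MmfSketch
{
    static void Main()
    {
        const long capacity = 256L * 1024 * 1024; // e.g. a 256MB decoded-pixel buffer
        string tempPath = Path.GetTempFileName();

        using (var mmf = MemoryMappedFile.CreateFromFile(
            tempPath, FileMode.Open, null, capacity))
        using (var view = mmf.CreateViewAccessor(0, capacity))
        {
            view.Write(0, (byte)42);          // writes go through the page cache
            Console.WriteLine(view.ReadByte(0));
        }

        File.Delete(tempPath); // remove the backing temp file
    }
}
```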
Some comments:
Oh, and I still would like control over the buffer pool policy, since its current maximum size and pool size policies are not what I actually want to run with. (Given the resources I have, allocating a 20MB buffer for an 11MB request is not practical.)
Did some proof-of-concept coding. My current experience is from the

I created a small

Below is the memory profile from the image @BrianJThomson used above, now backed by memory-mapped files. Look at

Possible problems:
So, it's possible to implement Memory Mapped Files in this library, but this comes at a price (performance, reliability?). I think there are more important things to do now.
@blackcity nice work!
I might be naive, but I think if it's part of the standard, then it should be OK.
We should avoid parallel processing when operating on MMF buffers. It's not hard to do so in Jpeg code.
Do you think it's possible to integrate it with our core memory buffer class? (It will not be pinned; I'm currently removing all the pointers.)
Yeah, it's just that I am always tense developing things for one platform and then running them for the first time on another. By the way,
It just needs to be carefully designed and benchmarked. For example: if we have a fixed memory area in a view, multiple threads can work on it without problems as long as there are no concurrent read/write operations (mutexes should be avoided, of course). If, on the other hand, multiple threads need to walk through the whole memory or large parts of it, that needs a different strategy. But there are well-known design patterns for this.
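A sketch of the disjoint-slice pattern described above, with illustrative names:

```csharp
using System.IO.MemoryMappedFiles;
using System.Threading.Tasks;

static class ParallelMmf
{
    // Each worker owns a disjoint slice of the mapped region, so there are no
    // concurrent read/write operations on the same bytes and no mutexes.
    public static void Process(MemoryMappedFile mmf, long totalBytes, int workers)
    {
        long slice = totalBytes / workers;

        Parallel.For(0, workers, i =>
        {
            long offset = i * slice;
            long length = (i == workers - 1) ? totalBytes - offset : slice;

            using (var view = mmf.CreateViewAccessor(offset, length))
            {
                for (long pos = 0; pos < length; pos++)
                {
                    byte value = view.ReadByte(pos);
                    view.Write(pos, (byte)(255 - value)); // e.g. invert samples
                }
            }
        });
    }
}
```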
Seems good to me.
Great to see progress on this issue!
Can you show how you are using the library? There should be nothing that doesn't either get returned or handled by the GC.
```csharp
Stream ResizeFunction(Stream stream)
```

My service resizes small image files (< 2MB).
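A hedged reconstruction of what the elided method body presumably looked like; everything below the signature is an assumption based on the follow-up comments, including the alpha-era API names:

```csharp
using System.IO;
using ImageSharp;

static class ResizeService
{
    // Hypothetical reconstruction -- the original body was not captured in the
    // thread. The follow-up comment mentions an un-disposed "outstream", so
    // the likely shape is: decode, resize, re-encode into a MemoryStream that
    // the caller must dispose.
    public static Stream ResizeFunction(Stream stream)
    {
        var outStream = new MemoryStream();
        using (Image image = Image.Load(stream))
        {
            image.Resize(image.Width / 2, image.Height / 2)
                 .Save(outStream);
        }

        outStream.Position = 0;
        return outStream; // the caller is responsible for disposing this
    }
}
```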
Nothing wrong with the code as long as you clean up that outstream instance. I know of no memory leaks within the library, and I've tested it a lot to ensure that. Are you running in 32- or 64-bit mode?
Hi @JimBobSquarePants, I use `procdump -ma <PID>` (w3wp.exe)
I've been thinking on this issue for weeks. It is a hard one, not only because of the tricky implementation details. Defining a public memory management API which allows the right customization options is a non-trivial design task, because the solution has to be maintainable and future-proof. I think I managed to come up with a design which meets all the user requests described in this discussion, and also keeps things extensible. I'd like to introduce a MemoryManager
It doesn't deal with implementation details, only with the API. To integrate the MMF stuff, we need to change

@JimBobSquarePants @tocsoft @dlemstra @blackcity @DeCarabas can you have a look?
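Since the gist itself isn't captured in this thread, here is a speculative sketch of the kind of abstraction being proposed; all member names are illustrative, not the actual proposal:

```csharp
using System;
using System.Buffers;

// The idea: every buffer request in the library goes through one swappable
// manager, so pooling, plain allocation, or MMF backing become pluggable.
public interface IBuffer<T> : IDisposable
    where T : struct
{
    T[] Array { get; }   // may be longer than Length when pooled
    int Length { get; }
}

public abstract class MemoryManagerSketch
{
    // Implementations decide where memory comes from: a pool, a fresh
    // allocation, or (for the MMF idea above) a memory-mapped file.
    public abstract IBuffer<T> Allocate<T>(int length) where T : struct;
}

// Default implementation: pooled arrays, returned to the pool on Dispose().
public sealed class PooledMemoryManager : MemoryManagerSketch
{
    public override IBuffer<T> Allocate<T>(int length) => new PooledBuffer<T>(length);

    private sealed class PooledBuffer<T> : IBuffer<T>
        where T : struct
    {
        private T[] array;

        public PooledBuffer(int length)
        {
            this.array = ArrayPool<T>.Shared.Rent(length);
            this.Length = length;
        }

        public T[] Array => this.array;

        public int Length { get; }

        public void Dispose()
        {
            if (this.array != null)
            {
                ArrayPool<T>.Shared.Return(this.array);
                this.array = null;
            }
        }
    }
}
```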
Looking good to me. Only thing I can see is I think
👍 Bravo. This looks very comprehensive. Agree on the naming change. Quick Q on this comment: I'm assuming it has to do with .NET Core 2.0 making
I think only
I made a big mistake in the proposal ignoring the fact that
I'm gonna use it being nearly 1am here as an excuse for missing that! 😖
@antonfirsov In ImageMagick we swap to disk when we run out of memory. Maybe we could make the MemoryManager do that automatically?
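A sketch of that ImageMagick-style fallback, with illustrative names (not ImageMagick or ImageSharp code):

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

static class SwapToDisk
{
    // Try ordinary managed memory first; if the allocation fails, back the
    // buffer with a memory-mapped temp file and let the OS page it instead.
    public static MemoryMappedFile AllocateOrSwap(int bytes, out byte[] inMemory)
    {
        try
        {
            inMemory = new byte[bytes];
            return null; // fast path succeeded
        }
        catch (OutOfMemoryException)
        {
            inMemory = null;
            return MemoryMappedFile.CreateFromFile(
                Path.GetTempFileName(), FileMode.Open, null, bytes);
        }
    }
}
```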
@tocsoft @JimBobSquarePants I also have an excuse: one should never make gists before lunch! :) Updated it now. Things became a bit more complex around the pooling manager. @dlemstra It might be possible to implement a
Looks good. I like that you provide extensibility, because the library itself cannot implement optimized code for all conceivable runtime environments. 👍
@DeCarabas @BrianJThomson If you are still using ImageSharp, check out the beta (on NuGet!) -- the Jpeg decoder uses much less memory now! Let me know if the situation is still critical for you!
@antonfirsov did you mean SixLabors.ImageSharp 1.0.0-beta0002?
Beta 1 contained the relevant changes; beta 2 builds on those changes.
Prerequisites

- Verified the problem in both `DEBUG` and `RELEASE` mode

Description
I'm basically reporting issue #123 again, but with a slightly different spin. I'm running my little web app on a VM with 1GB of RAM, and I'm constantly getting killed with out-of-memory errors because of the ArrayPool being used for ImageSharp.Formats.Jpg.DecodedBlock[].
My app concurrently fetches RSS feeds, then for each entry in each feed it concurrently fetches all the images and evaluates them to see which one will make the best thumbnail. It doesn't take many feeds before I'm decoding a lot of images in parallel.
After a single fetch session, I dumped the process and examined the heap; I discovered that ImageSharp.Formats.Jpg.DecodedBlockArray.ArrayPool was holding on to 337MB all on its own; the largest entry was 262,144 elements long for a size of 70,254,616 bytes (roughly 268 bytes per block). (Obviously that's just the current state of the pool; as currently configured by ImageSharp it can grow way past that.)
I need some way to get a handle on this memory usage; I'm looking at other constraints but some kind of configuration to let me limit the pool would be really nice.
Steps to Reproduce
N/A
System Configuration