Rule Request: Repetitive work in Stream.generate(Supplier) #454

jankeesvanandel · 2024-12-09T20:01:16Z

Recently we found the following code in our application (slightly modified to protect the innocent 😆):

private <T> Stream<T> streamAndClear(Map<String, T> buffer) {
    return Stream.generate(() -> {
        // This line must be moved outside of the lambda; otherwise a new iterator is instantiated for every item.
        Iterator<T> iterator = buffer.values().iterator();
        if (iterator.hasNext()) {
            T entry = iterator.next();
            iterator.remove();
            return entry;
        } else {
            return null;
        }
    }).takeWhile(Objects::nonNull);
}

The idea is to iterate over the values of a map and remove each element after using it, to keep memory pressure low. This is needed because the map holds tens of millions of objects and substantial work happens for each item.

The above code happens to be functionally correct thanks to the remove call: it shrinks the map, so the fresh iterator created on the next invocation starts at the next remaining entry, and so on. But it is extremely expensive in our case. I didn't dig far enough to find the exact cause, but with ConcurrentHashMap it starts reasonably quickly, becomes noticeably slower after some 100k items, and is basically stuck after 500k, advancing at only about 1 item per second.

Moving the creation of the Iterator out of the lambda made the whole process super fast. I'm not sure whether our case can be abstracted into a rule, but in general, expensive repetitive work in such a lambda kills performance.
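For reference, the fix described above could look roughly like the sketch below. The wrapper class name BufferDrain is hypothetical and only there to make the snippet compile on its own; the method body is the reported code with the iterator hoisted out of the lambda. A plausible explanation for the slowdown, not verified in this report, is that ConcurrentHashMap does not shrink its internal table on removal, so every fresh iterator has to skip a growing run of emptied bins before it reaches the first remaining entry.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Stream;

// Hypothetical holder class; only streamAndClear comes from the report.
final class BufferDrain {

    static <T> Stream<T> streamAndClear(Map<String, T> buffer) {
        // Created once, outside the lambda, so each generated element
        // resumes where the previous one stopped.
        Iterator<T> iterator = buffer.values().iterator();
        return Stream.generate(() -> {
            if (!iterator.hasNext()) {
                return null; // sentinel that ends the stream via takeWhile below
            }
            T entry = iterator.next();
            iterator.remove(); // drop the entry from the map to release memory early
            return entry;
        }).takeWhile(Objects::nonNull);
    }
}
```

This sketch assumes the stream is consumed sequentially and that the map holds no null values (ConcurrentHashMap rejects them anyway), since a null value would end the stream early.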

stokpop · 2024-12-13T13:27:18Z

@jankeesvanandel thanks for reporting, interesting case. We will think about a general rule that detects repetitive but unneeded actions in lambdas that are used in loops such as streams.
