Rule Request: Repetitive work in Stream.generate(Supplier) #454

Open
jankeesvanandel opened this issue Dec 9, 2024 · 1 comment

@jankeesvanandel

Recently we found the following code in our application (slightly modified to protect the innocent 😆):

private <T> Stream<T> streamAndClear(Map<String, T> buffer) {
    return Stream.generate(() -> {
        // This line must be moved outside of the lambda; otherwise a new iterator is instantiated for every item.
        Iterator<T> iterator = buffer.values().iterator();
        if (iterator.hasNext()) {
            T entry = iterator.next();
            iterator.remove();
            return entry;
        } else {
            return null;
        }
    }).takeWhile(Objects::nonNull);
}

The idea is to iterate over the values of a map and remove each element after using it, to keep memory pressure low. This is needed because there are tens of millions of objects in the map and substantial work happens for each item.

The code above happens to be functionally correct, thanks to the remove call: it shrinks the map, so the fresh iterator created on the next invocation starts at the next remaining element, and so on. But it is extremely expensive in our case. I didn't dig deep enough to find the exact cause, but with ConcurrentHashMap it starts reasonably quickly, gets noticeably slower after some 100k items, and after 500k it is basically stuck, advancing at only 1 item per second.

Moving the creation of the Iterator out of the lambda made the whole process super fast. I'm not sure whether our case can be abstracted into a rule, but in general, expensive repetitive work in such a lambda kills performance.
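
For reference, a minimal sketch of the variant with the iterator hoisted out of the lambda. The class name `StreamAndClear` and the `main` demo are illustrative assumptions, not the original application code. A plausible explanation for the slowdown, not confirmed in this issue, is that ConcurrentHashMap does not shrink its table on removal, so every fresh iterator has to skip past all the bins already emptied at the front before it finds the next entry.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

public class StreamAndClear {

    // Lazily streams the map's values, removing each entry as it is consumed.
    public static <T> Stream<T> streamAndClear(Map<String, T> buffer) {
        // Created once, outside the lambda, so every supplier call continues
        // from the current position instead of starting a new traversal.
        Iterator<T> iterator = buffer.values().iterator();
        return Stream.generate(() -> {
            if (!iterator.hasNext()) {
                // Sentinel that ends the stream via takeWhile below. This assumes
                // the map holds no null values (ConcurrentHashMap forbids them).
                return null;
            }
            T entry = iterator.next();
            iterator.remove(); // drop the entry from the map once it has been handed out
            return entry;
        }).takeWhile(Objects::nonNull);
    }

    // Hypothetical demo: fill a map, drain it through the stream, check it is empty.
    public static void main(String[] args) {
        Map<String, Integer> buffer = new ConcurrentHashMap<>();
        for (int i = 0; i < 1_000; i++) {
            buffer.put("key-" + i, i);
        }
        long sum = streamAndClear(buffer).mapToLong(Integer::longValue).sum();
        System.out.println("sum=" + sum + ", empty=" + buffer.isEmpty());
    }
}
```

The returned stream is single-use and meant for sequential consumption, since the shared iterator is not safe to advance from several threads at once.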

@stokpop
Collaborator

stokpop commented Dec 13, 2024

@jankeesvanandel thanks for reporting, interesting case. We will think about a general rule for repetitive but unneeded actions in lambdas that are used in loops such as streams.
