Porting Issue 20 from the enterprise repo
rnarubin commented on Aug 17, 2017
This proposal is more of an aspiration or roadmap goal than a current issue, and would be fitting for a 2.0 release rather than the initial drop, but I want to at least bring it some attention.
The current async iterator design has completely independent intermediate methods, each of which creates a new, opaque iterator for downstream consumption. This has benefits (a relatively simple implementation, well-defined isolation and separation of concerns) but also notable drawbacks, largely in performance. I call it the "futures everywhere" problem: every layer of transformation can add several future operations to every element in the iterator, even for plain synchronous operations. The JVM has a hard time optimizing this kind of async code the way it can synchronous code, both because closures applied to values through many intervening steps are difficult to inline, and because atomic fields restrict reordering.
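To make the "futures everywhere" shape concrete, here is a minimal sketch of the wrapper-per-stage design. The names (`AsyncIter`, `next`, `map`) are illustrative only, not the library's actual API:

```java
import java.util.concurrent.CompletionStage;
import java.util.function.Function;

// Illustrative shape only: each intermediate method wraps its upstream
// iterator, so every element pays a CompletionStage hop per stage, even
// when the mapping function itself is purely synchronous.
interface AsyncIter<T> {
  CompletionStage<T> next(); // hypothetical per-element poll

  default <R> AsyncIter<R> map(Function<? super T, ? extends R> f) {
    AsyncIter<T> upstream = this;
    // A new, opaque iterator per stage; three chained map() calls mean
    // three thenApply hops on every single element.
    return () -> upstream.next().thenApply(f);
  }
}
```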
I propose changing the library iterators to use an integrated pipeline design, similar to that of j.u.Stream. Every intermediate method would be stored as an operation in the pipeline, then composed and executed only at uncollapsible points (terminal operations, "real"/unavoidable async boundaries, manual user iteration). Efficient terminal operations would then depend on an underlying implementation of forEach that can apply the composed operations and possibly terminate early.
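A rough sketch of the fused-pipeline idea, with entirely hypothetical names:

```java
import java.util.function.Function;

// Hypothetical pipeline stage: synchronous intermediate operations are
// composed into a single function instead of each one allocating futures.
final class Pipeline<S, T> {
  private final Function<S, T> fused; // all sync stages collapsed so far

  Pipeline(Function<S, T> fused) { this.fused = fused; }

  // Intermediate op: just extends the composed function, no futures yet.
  <R> Pipeline<S, R> map(Function<? super T, ? extends R> f) {
    return new Pipeline<>(fused.andThen(f));
  }

  // The composed function is only applied at "uncollapsible" points:
  // a terminal operation, an unavoidable async boundary, or manual polling.
  T applyTo(S source) { return fused.apply(source); }
}
```

Three map calls on such a pipeline yield one composed function that a terminal forEach can apply inline, rather than three wrapped iterators each adding a future hop.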
For example, some iterators traverse elements in batches, where only every Nth operation is actually async and the rest are immediate stages over elements of some buffered collection in memory. A terminal method over such an iterator (with an appropriate forEach implementation) could apply most of its pipeline transformations in a plain loop over these collections -- a shape HotSpot is great at optimizing, even across virtual calls -- with only occasional async breaks and substantially less overhead.
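For instance, a batch-backed source's forEach might look roughly like this. Again, the names (`PagedSource`, `nextPage`) and the null-page termination are invented for illustration:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.function.Consumer;
import java.util.function.Function;

// Sketch of a batch-backed source: fetching the next page is async, but the
// elements of a fetched page live in memory and can be consumed in a loop.
abstract class PagedSource<T> {
  // Async boundary: fetch the next page, or complete with null when exhausted.
  abstract CompletionStage<List<T>> nextPage();

  // forEach applies the fused pipeline in a plain loop over each page and
  // only awaits a future between pages.
  <R> CompletionStage<Void> forEach(Function<? super T, ? extends R> fusedOps,
                                    Consumer<? super R> action) {
    return nextPage().thenCompose(page -> {
      if (page == null) {
        return CompletableFuture.<Void>completedFuture(null);
      }
      for (T element : page) {          // tight synchronous loop per batch
        action.accept(fusedOps.apply(element));
      }
      return forEach(fusedOps, action); // recurse only at the async boundary
    });
  }
}
```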
This solution isn't a substantial improvement in all cases. Notably, iterators where every element is accessed asynchronously, or where many of the transformations are themselves async, could not leverage a collapsed terminal method, though they would still see minor benefits from composed intermediate sync methods. It would be no worse in these cases, however, and would greatly improve the cases where synchronous iteration is possible.
Importantly, these changes can all be made within the library and don't require changes in user code, so there is no compatibility issue. Although a user's iterator would benefit from implementing such a forEach, everything would still work under the hood by falling back to the current implementation when necessary. User-controlled iteration would also require a fall-back, where the pipeline must be composed for every poll.
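A sketch of that manual-iteration fall-back, reusing the illustrative `AsyncIter` shape from the first sketch above (again, nothing here is the library's real API):

```java
import java.util.concurrent.CompletionStage;
import java.util.function.Function;

// Hypothetical fall-back: when the user drives iteration manually, the fused
// operations are applied per element as it is polled. Each element still
// crosses one async boundary, but the sync stages no longer stack futures.
final class ManualIteration {
  static <S, T> CompletionStage<T> poll(AsyncIter<S> source,
                                        Function<? super S, ? extends T> fusedOps) {
    return source.next().thenApply(fusedOps);
  }
}
```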