Add streaming API #463

Merged: jpsamaroo merged 23 commits into master from jps/stream2 on Dec 9, 2024

@jpsamaroo (Member) commented Dec 21, 2023

Adds a spawn_streaming task queue that transforms tasks into continuously executing equivalents, which automatically take from input streams/channels and put their results to an output stream/channel. Useful for processing many individual elements of a large (or infinite) collection.
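
As a rough, language-agnostic illustration of the idea (this is not Dagger's implementation or API), the sketch below wraps an ordinary function in a loop that repeatedly takes one element from an input queue and puts the result on an output queue. The names spawn_streaming_sketch, run_pipeline, and the _DONE end-of-stream marker are hypothetical and exist only for this example; the buffer size of 4 is an arbitrary assumption.

```python
import queue
import threading

# Hypothetical end-of-stream marker, used only in this sketch.
_DONE = object()


def spawn_streaming_sketch(fn, inbox, outbox):
    """Run fn continuously: take one element from inbox, put fn(element) on outbox."""
    def loop():
        while True:
            item = inbox.get()
            if item is _DONE:
                # Propagate stream closure to the downstream stage, then stop.
                outbox.put(_DONE)
                return
            outbox.put(fn(item))

    worker = threading.Thread(target=loop, daemon=True)
    worker.start()
    return worker


def run_pipeline(values):
    """Push values through a two-stage pipeline: add 1, then double."""
    # Bounded queues stand in for per-task input buffering.
    a, b, c = (queue.Queue(maxsize=4) for _ in range(3))
    spawn_streaming_sketch(lambda x: x + 1, a, b)
    spawn_streaming_sketch(lambda x: x * 2, b, c)

    def feed():
        for v in values:
            a.put(v)
        a.put(_DONE)

    # Feed from a separate thread so a full buffer cannot block the consumer below.
    threading.Thread(target=feed, daemon=True).start()

    results = []
    while True:
        item = c.get()
        if item is _DONE:
            return results
        results.append(item)


if __name__ == "__main__":
    print(run_pipeline([1, 2, 3]))
```

Each stage here runs once per element rather than once overall, which is the sense in which a one-shot task becomes a streaming one. The bounded queues apply backpressure: a fast producer blocks when the downstream buffer is full.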

Todo:

  • Migrate streams on first use
  • Add per-task input buffering to Stream object
  • Add no-allocation ring buffer for process-local put/take to Stream
  • Make buffering amount configurable
  • Add API for constructing streams based on inferred return type, desired buffer size, and source/destination
  • Allow finish_stream(xyz; return=abc) to return custom value (else nothing)
  • Upstream MemPool migration changes (Add DRef migration support JuliaData/MemPool.jl#80)
  • Add docs
  • Add tests
  • (Optional) Adapt ring buffer to support server-local put/take (use mmap?)
  • (Optional) Make value fetching configurable
  • (Optional) Support a waitany-style input stream, taking inputs from multiple tasks
  • (Optional) take! from input streams concurrently, and waitall on them before continuing
  • (Optional) put! into output streams concurrently, and waitall on them before continuing
  • (Optional) Allow using default or previously-cached value if sender not ready
  • (Optional) Allow dropping stream values (after timeout, receiver not ready, over-pressured, etc.)
  • (Optional) Add utility for tracking stream transfer rates (Add streaming throughput monitor #494)
  • (Optional) Add programmable behavior on upstream/downstream Stream closure (how should errors/finishing propagate?)

@JamesWrigley (Collaborator) commented

Am I correct in thinking that all the necessary items except for tests are complete?

@jpsamaroo (Member, Author) commented

Generally yes, I think we're pretty close to this being merge-ready. There are some remaining TODOs that I need to finish, but most are reasonably small. I could definitely use help with writing tests - just validating that we can run various kinds of pipelines and that they work across multiple workers would be really useful.

@JamesWrigley force-pushed the jps/stream2 branch 2 times, most recently from ed89a7f to 3274093, on August 3, 2024 16:57
@jpsamaroo marked this pull request as ready for review on December 3, 2024 22:41
@jpsamaroo force-pushed the jps/stream2 branch 2 times, most recently from d259f57 to 59a0371, on December 4, 2024 17:49
jpsamaroo and others added 3 commits December 9, 2024 10:44
Co-authored-by: JamesWrigley <james@puiterwijk.org>
Co-authored-by: davidizzle <davide.ferretti.j@gmail.com>
@jpsamaroo mentioned this pull request on Dec 9, 2024
@jpsamaroo merged commit 62f8307 into master on Dec 9, 2024
8 of 11 checks passed
@jpsamaroo deleted the jps/stream2 branch on December 9, 2024 21:40
@jpsamaroo (Member, Author) commented

Thanks so much to @JamesWrigley and @davidizzle for making this a reality! ❤️
