
Make Dagger.finish_stream() propagate downstream #579

Closed
wants to merge 56 commits into from

Conversation

JamesWrigley
Collaborator

Previously a streaming task calling `Dagger.finish_stream()` would only stop the caller, but now it will also stop all downstream tasks. This is done by:

  • Getting the output handler tasks to close their `RemoteChannel` when exiting.
  • Making the input handler tasks close their buffers when the `RemoteChannel` is closed.
  • Exiting `stream!()` when an input buffer is closed.

One question: what exactly are the `DropBuffer` tests for? Are they just for testing `DropBuffer` itself (its usefulness is not clear to me), or for testing that streaming tasks throw exceptions when `fetch()`'d after cancelling? For now I've disabled those tests, because with the new behaviour a task stopping for any reason will also stop its downstream tasks. So currently in the tests the `x` task will reach `max_evals` and finish, which also causes `A` to finish; everything stops gracefully, `A` is never cancelled, and so it never throws an exception.
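The close-propagation described above can be sketched with plain `Channel`s. This is a simplified, hypothetical stand-in for Dagger's `RemoteChannel`-backed stream buffers; the function names here are illustrative and not part of Dagger's API:

```julia
# Upstream side: the output handler closes its channel when it exits,
# signalling downstream that no more values will arrive.
function output_handler(out::Channel, values)
    try
        for v in values
            put!(out, v)
        end
    finally
        close(out)
    end
end

# Middle: the input handler forwards values until the upstream channel
# is closed, then closes the downstream task's input buffer in turn.
function input_handler(in::Channel, buffer::Channel)
    try
        for v in in          # iteration ends when `in` is closed and drained
            put!(buffer, v)
        end
    finally
        close(buffer)
    end
end

# Downstream: a `stream!()`-like loop exits once its input buffer is closed.
function stream_loop(buffer::Channel)
    results = Any[]
    for v in buffer
        push!(results, v)
    end
    return results
end

out = Channel(10); buf = Channel(10)
t1 = @async output_handler(out, 1:3)
t2 = @async input_handler(out, buf)
stream_loop(buf)  # returns Any[1, 2, 3] once the upstream finishes
```

Iterating a `Channel` terminates cleanly when the channel is closed and drained, which is what makes the "stop" propagate from stage to stage without explicit cancellation messages.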

jpsamaroo and others added 30 commits November 15, 2024 17:03
return_type() is kinda broken in v1.10, see:
JuliaLang/julia#52385

In any case Base.promote_op() is the official public API for this operation so
we should use it anyway.
This allows us to handle all the other kinds of task specs.
This should get the docs building again.
Because it doesn't actually do anything now.
Using `myid()` with `workers()` meant that when the context was initialized with
a single worker the processor list would be: `[OSProc(1), OSProc(1)]`. `procs()`
will always include PID 1 and any other workers, which is what we want.
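For illustration, the duplicated-PID behaviour on a single-process setup can be seen with just the `Distributed` stdlib (a toy check, not code from this PR):

```julia
using Distributed

# With no extra workers, the driver is the only process:
@show procs()              # [1] — always includes PID 1
@show workers()            # [1] when no workers have been added
@show [myid(); workers()]  # [1, 1] — the duplicated entry described above
```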
This is a bit nicer than commenting/uncommenting a line in the code.
Otherwise it may spin (see comments for details). Also refactored it into a
while-loop instead of using a @goto.
This is useful for testing and benchmarking.
This is currently necessary for the streaming branch; we'll have to change this
later, but it's good to have CI working for now.
This works by converting the output buffers into a safely-serializable
container and sending that to the new node.
This makes them display as if they were running under the original task.
This makes the tests a little easier to control.
@JamesWrigley JamesWrigley self-assigned this Nov 16, 2024
- Added some whitespace.
- Deleted the unused `rand_finite()` methods.
- Allow passing the `timeout` to `test_finishes()`
- Fix bug in one of the tests where we weren't waiting for all the tasks to
  finish, which would occasionally cause test failures because of the race
  condition.
Previously a streaming task calling `Dagger.finish_stream()` would only stop the
caller, but now it will also stop all downstream tasks. This is done by:
- Getting the output handler tasks to close their `RemoteChannel` when exiting.
- Making the input handler tasks close their buffers when the `RemoteChannel` is
  closed.
- Exiting `stream!()` when an input buffer is closed.
`unwrap_nested_exception()` now supports `DTaskFailedException` so we can match
against the real exceptions thrown.
@jpsamaroo
Member

I'm cherry-picking portions of this PR into #463. I'll also be implementing a waitany-based DAG teardown feature that should accomplish this PR's intention while also propagating upstream.

Thanks for this!

@jpsamaroo jpsamaroo force-pushed the jps/stream2 branch 2 times, most recently from a0b4a0c to d259f57 Compare December 4, 2024 00:31
@JamesWrigley
Collaborator Author

Sounds good :) I'll close this then.
