Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve executor ergonomics #26

Closed
wants to merge 10 commits into from
Closed

Conversation

glommer
Copy link
Collaborator

@glommer glommer commented Sep 7, 2020

What does this PR do?

This PR greatly improves the ergonomics of creating and dealing with executors

Motivation

It really sucked before.

Checklist

[x] I have added unit tests to the code I am submitting
[] My unit tests cover both failure and success scenarios
[] If applicable, I have discussed my architecture
[x] The new code I am adding is formatted using rustfmt

Glauber Costa added 4 commits September 6, 2020 20:10
This is an ergonomics effort, so that we can easily know which executor
we are executing on.
Some functions that access the executor are available through Task,
which allow us to access the local executor without having a reference
to it (through its local thread local variable).

However Task's main role is to spawn a new task and store its result.
So it requires a type generic parameter, T.

For helper functions that don't spawn we have to write Task::<()>::
which is a hassle.

This patch adds Local, an alias to Task::<()>::
It is easy to spawn an executor in the current thread: all we have to do is
create it.

However, there are two main problems with it:

 1. The future that we create initially does not live in a task queue. This is
    pretty much a requirement, because the run() function just executes a future
    to completion. There are real problems arising from the fact that this future
    is not on a task queue. For example, we can't call later().await on it (as
    what that does is move ourselves to the back of the task queue).

 2. The potential of a thread-per-core system is only realized when there are
    many executors, one per each CPU. That means that the user now has to create
    thread, keep them alive, create an executor inside each thread, etc. Not fun.

To make this code more ergonomic, we will create the function
LocalExecutor::spawn_new().  It will create a thread and spawn an executor
inside it, making the creation of executors easier.

The future that the caller passed is executed inside a task queue, meaning
every Scipio function will work.  And he future that we actually pass to run,
is the result of draining that task queue.

When the user is done, the function wait_on_executors() can be called to wait
on all the executors created this way at once.

Like so:

LocalExecutor::spawn_new("hello", Some(0), async move {
    function_on_cpu0().await;
});

for i in 1..N {
	LocalExecutor::spawn_new("hello", Some(i), async move {
	    function_on_cpu_aux(i).await;
	});
}

LocalExecutor::wait_on_executors();

The next step in ergonomics for this is having a single entry point
that creates N executors and wait on them. However it is a bit hard
to figure out the details of how this should look like without having
inter-executor channels for communications. So we will defer.
This is already using the newly added ergonomic functions.
@glommer glommer requested a review from HippoBaro September 7, 2020 00:13
HippoBaro
HippoBaro approved these changes Sep 7, 2020
Copy link
Member

@HippoBaro HippoBaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice QOL improvement. however I don't like the fact that LocalExecutor::spawn_new stores the thread handle internally. I think it would be simpler to just return a handle to the newly created thread to the user so they can join() ant their leisure. That would allow you to get rid of SPAWNED_EXECUTORS and wait_on_executors as they both seem superfluous to me.

Couple of other comments, other than that, this is good to go.

scipio/src/executor.rs Outdated Show resolved Hide resolved
scipio/src/executor.rs Outdated Show resolved Hide resolved
scipio/src/executor.rs Show resolved Hide resolved
/// println!("my ID: {}", Task::<()>::id());
/// });
/// ```
pub fn id() -> usize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having later() or local() be proxies to the LocalExecutor is fine since those two methods either deal with or return a new Task. This one is just confusing.

I think we should have a way to access the local executor (ala engine()). Those utility functions, in turn, should call into that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that in Rust, you can only safely use thread local things inside a scope. You cannot have an engine() function like seastar. At most you could have with_engine(Function). (see the uses of with). Between those two things I prefer to provide individual functions.

scipio/examples/hello_world.rs Outdated Show resolved Hide resolved
scipio/examples/hello_world.rs Outdated Show resolved Hide resolved
Glauber Costa added 2 commits September 7, 2020 10:02
- spawn_new renamed to spawn_executor
- no longer provide a helper to join spawned executors
@glommer
Copy link
Collaborator Author

glommer commented Sep 7, 2020

@penberg @HippoBaro I have pushed two commits that fixes those issues.
Once you guys are cool with them I will fold them into the previous patches.


// The newly spawned executor runs on a thread, so we need to join on
// its handle so we can wait for it to finish
handle.join().unwrap();
Copy link
Contributor

@sitano sitano Sep 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly why making main() -> std::io::Result instead of expect() was a bad idea:

  --> scipio/examples/hello_world.rs:42:18
   |
22 | fn main() -> Result<()> {
   |              ---------- expected `std::io::Error` because of this
...
42 |     handle.join()?;
   |                  ^ the trait `std::convert::From<std::boxed::Box<dyn std::any::Any + std::marker::Send>>` is not implemented for `std::io::Error`

now the code has inconsistent error handling so that even main return type is useless because all other errors can't fit its type.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, the fact Rust has many incompatible variants of Result<> type is painful. Maybe its worth having your own Result<> and return everywhere 1 type and make it compatible with at least 1 std (std::io ie).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only a problem (and apparently one known to the rust community) with .join()
If you google this error message it should take you to places where the Rust folks tell you not to ? on join.
(although now that I am searching again I have been so far unable).

You will get similar results if you spawn a thread manually and then join it, so I am not too concerned.

My understanding is that it would be possible to make this better and ? on join if we implemented an error wrapper on io::Error that is send, but I am not too concerned about that either: if a thread fails in an unrecoverable way (at least for us), you are likely to want to panic the whole thing anyway: otherwise you are left with executors that are running while others have failed.

I feel like the example is good now: The user can see how to handle errors that happen at spawn time, and for asynchronous errors coming from threads we are recommending unwrap.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree about the errors, but in practice it seems to me that everyone standardized on io::Result so as long as your errors are convertible to and from it, all should be good

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To split the difference, this example could just cheat and use Result<(), Box<dyn std::error::Error>> like the Tokio documentation does: https://docs.rs/tokio/0.2.22/tokio/runtime/index.html

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Daniel - as we discussed, this won't work in practice because JoinHandle transforms the error into a boxed std::any::Any whose size is not known.

I don't know of a very good way to ? from a join, but this is a problem that thread::spawn has already. We can fix the example when we find a good way (I'd like to), but tabling for now so we can make progress

@glommer
Copy link
Collaborator Author

glommer commented Sep 7, 2020

FYI: I wrote a slightly more complex example and I am having issues with the requirement of Send + Sync that I imposed in the future passed to spawn local. I will investigate the best way to solve it, but may delay this.

There is no rush

@glommer
Copy link
Collaborator Author

glommer commented Sep 7, 2020

Requiring Send + Sync at the interface levels is causing Rust to require the market for stuff that we use deep down the engine, which is not acceptable:

   ::: /home/glauber/scipio/scipio/src/executor.rs:387:12
    |
387 |         F: Send + 'static,
    |            ---- required by this bound in `scipio::LocalExecutor::spawn_executor`
    |
    = help: within `scipio::sys::Source`, the trait `std::marker::Sync` is not implemented for `*mut u8`

Glauber Costa added 2 commits September 7, 2020 12:19
The main difference in usage is that now instead of writing
"async move" we write "|| async move"

This is because we can't really require this future to be Send, or
it starts to requiring everything down the line to be Send too.

But we still don't want to just lift the requirement. So we require
FnOnce + Send.
scipio/src/lib.rs Outdated Show resolved Hide resolved

// The newly spawned executor runs on a thread, so we need to join on
// its handle so we can wait for it to finish
handle.join().unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To split the difference, this example could just cheat and use Result<(), Box<dyn std::error::Error>> like the Tokio documentation does: https://docs.rs/tokio/0.2.22/tokio/runtime/index.html

scipio/src/executor.rs Show resolved Hide resolved
@glommer
Copy link
Collaborator Author

glommer commented Sep 7, 2020

Rewrote history into #27
(empty diff from the current HEAD here)

I feel like the version we have now address most of the comments, and the few points left can be done separately.

Thank you all!

@glommer glommer closed this Sep 7, 2020
@glommer glommer deleted the ergonomics branch December 21, 2020 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants