Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[POC] Prototype async query functions #164

Closed
wants to merge 21 commits into from
Closed

Conversation

Marwes
Copy link
Contributor

@Marwes Marwes commented Apr 30, 2019

Wanted to check how much code would need to be added/duplicated to
support async queries. This isn't a a full implementation (I haven't
even tested to call async functions) but the normal sync implementation
still work with this with what shouldn't be too much overhead.

Not really interested in this until async/await gets to stable but
figured a PR with this could still be useful.

@nikomatsakis
Copy link
Member

I've been wondering for some time whether salsa should be based on async fn. I decided to start with just threads but I guess as async fn approaches stable this starts to be more plausible.

@Marwes
Copy link
Contributor Author

Marwes commented May 6, 2019

Using plain sync functions is likely enough for most uses but I would really like to use async throughout the compiler to allow procedural macros to be async.

Arguably async queries does make cancellation cleaner since it could be done as just not continuing to execute the future (for a query to check if its result is still needed it only needs to yield control).

This experiment makes me quite confident both can supported with only a little overhead for the sync case however.

@nikomatsakis
Copy link
Member

Thanks for doing this @Marwes -- I'm wondering how much nicer it would be once rust-lang/rust#61775 lands (which should remove the need for lifetime parameters).

@Marwes
Copy link
Contributor Author

Marwes commented Jun 26, 2019

Since it implements trait methods it needs to return a Box<Future + 'a> in most cases I don't think that would help much?

@nikomatsakis
Copy link
Member

@Marwes ah, ok, perhaps not.

@Marwes Marwes force-pushed the async branch 2 times, most recently from c3766f7 to d1b6867 Compare October 13, 2019 19:47
@Marwes
Copy link
Contributor Author

Marwes commented Oct 13, 2019

Rebased this again. If async queries is something that is wanted I will try and get something decent done in time to the async/await release.

  • Did some changes from the first implementation so that this actually has a test that exercises an async query.

  • The async query -> sync query layer requires an Mutex + Condvar on every query which isn't necessary. Should be fixable by parameterizing the query by a sync/async channel type.

  • There aren't any error checking for async input queries which don't really make sense.

  • Async transparent queries should be easy to add.

  • Currently only non-send futures are returned which isn't workable in a non toy example. Might need some unsafe here to get a BoxFuture which is Send/!Send depending on the key/database.

@Marwes
Copy link
Contributor Author

Marwes commented Oct 14, 2019

So there is an issue the LocalState type not being Sync. Since & references to the database are passed around this causes any future that contains such a reference to be !Send.

If I remember correctly, this &/&mut encoding in the database is "just" to prevent mutation to the inputs to happen during a revision? Is there any way that can be preserved without needing internal mutability in the LocalState so that this !Send issue can be fixed?

fn assert_send<T: Send>(t: T) -> T {
    t
}

async fn function(_: &AsyncDatabase) {}

#[test]
fn test_send() {
    assert_send(function(&AsyncDatabase::default()));
}
   Compiling salsa v0.13.0 (C:\Users\Markus\Dropbox\Programming\salsa)
error[E0277]: `std::cell::RefCell<std::vec::Vec<salsa::runtime::ActiveQuery<AsyncDatabase>>>` cannot be shared between threads safely
  --> tests\async.rs:50:5
   |
42 | fn assert_send<T: Send>(t: T) -> T {
   |    -----------    ---- required by this bound in `assert_send`
...
50 |     assert_send(function(&AsyncDatabase::default()));
   |     ^^^^^^^^^^^ `std::cell::RefCell<std::vec::Vec<salsa::runtime::ActiveQuery<AsyncDatabase>>>` cannot be shared between threads safely
   |
   = help: within `AsyncDatabase`, the trait `std::marker::Sync` is not implemented for `std::cell::RefCell<std::vec::Vec<salsa::runtime::ActiveQuery<AsyncDatabase>>>`
   = note: required because it appears within the type `salsa::runtime::local_state::LocalState<AsyncDatabase>`
   = note: required because it appears within the type `salsa::runtime::Runtime<AsyncDatabase>`
   = note: required because it appears within the type `AsyncDatabase`
   = note: required because of the requirements on the impl of `std::marker::Send` for `&AsyncDatabase`
   = note: required because it appears within the type `[static generator@tests\async.rs:46:38: 46:40 __arg0:&AsyncDatabase {}]`
   = note: required because it appears within the type `std::future::GenFuture<[static generator@tests\async.rs:46:38: 46:40 __arg0:&AsyncDatabase {}]>`
   = note: required because it appears within the type `impl core::future::future::Future`
   = note: required because it appears within the type `impl core::future::future::Future`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0277`.
error: could not compile `salsa`.

To learn more, run the command again with --verbose.
[Finished running. Exit status: 0]

@frondeus
Copy link

I also was thinking a little bit about async queries - and I think it could be nice to wait for structured concurrency.

See example: tokio-rs/tokio#1879

@frondeus
Copy link

About mutability - I think maybe we could be split methods into two traits Database and DatabaseMut.

Database would be used in queries - Getting inputs from storage, executing other queries etc. while DatabaseMut in the main thread only - setting inputs.

@Marwes
Copy link
Contributor Author

Marwes commented Dec 20, 2019

I also was thinking a little bit about async queries - and I think it could be nice to wait for structured concurrency.

That might be nice since it could make it possible to ensure that sub-queries are done before the caller returns! Currently that requires an Arc + a runtime assertion in Drop impl to make a best effort check for it.

About mutability - I think maybe we could be split methods into two traits Database and DatabaseMut.

Yep, I am leaning towards that as well. It is unfortunate that it infects the current, sync queries as well but I suspect that they will need that anway if they are to support running sub-queries in parallel. (https://salsa.zulipchat.com/#narrow/stream/145099-general/topic/Sync.20database)

Wanted to check how much code would need to be added/duplicated to
support async queries. This isn't a a full implementation (I haven't
even tested to call async functions) but the normal sync implementation
still work with this with what shouldn't be too much overhead.

Not really interested in this until async/await gets to stable but
figured a PR with this could still be useful.
DB: Database,
{
fn drop(&mut self) {
if !std::thread::panicking() {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since there is no unsafe invariant to uphold forgetting the Forker just means that the query may end up in a bad state but there is no memory unsafety that is upheld by the drop.

@Marwes
Copy link
Contributor Author

Marwes commented Dec 29, 2019

So the PR as is basically works but it is in a rough shape. Highlights/issues atm.

  • Send + Sync is a requirement now on keys/values. This is forced due to needing to box the returned futures in all the traits that salsa defines. I don't think this is a big issue when using async queries since you almost certainly want Send futures anyway but it ends up impacting sync queries as well and I had to remove tests or replace non-Sync types in them to get things working.

    I would expect that non-Sync/Send types as keys and values to be fairly niche since any such queries can't be parallelized so maybe it isn't that big of a problem, not sure. It might be possible to restore non-Sync/Send for sync queries with some trickery (but no unsafe) or with much trickery and unsafe for both sync and async queries if it is desired however (I can write on that separately though).

  • The generated database trait now takes &mut self on all queries as the RefCell in the query local runtime were removed to achieve Send futures. All the actually mutating methods (set_input etc) were moved to a *Mut sibling trait which should ensure that those methods can't be called inside of a running query.

    While taking &mut in queries is a bit less intuitive and perhaps a bit more error prone I do believe this can be an improvement for sync queries as well. While the RefCell does ensure that sub-queries can't be run concurrently with threads (which would break the local query stack), it does so in a clumsier way than a &mut reference. The &mut reference does still prevent concurrency, but it still allows the query to be run in a different (scoped) thread which is perfectly correct.

    Further, even if the RefCell were kept and queries were forced to return non-Send futures, that still wouldn't be correct as it would be possible to invalidate the stack with future combinators such as join! (whereas &mut correctly prevents this).

  • Query forking has been added which lets multiple sub-queries be run concurrently as long as they all complete before the caller returns. This should work for parallel sync queries as well but I have only tested it with async.

    The forking currently use an Arc to remove any need for lifetimes since running async queries on a thread poll (tokio::spawn etc) require 'static. For sync queries it may be better to offer an alternative which just borrows as it could be used to statically ensure that all queries run an complete before the fork-scope ends instead of doing this check dynamically via Drop.

  • The cycle detection is now in an awkward place as it now needs to check that neither the current runtime id, nor any parent runtime ids are in a cycle. It works best I can tell but I think this should use a real graph (https://crates.io/crates/petgraph ?) + cycle detection now instead. More or less fixed. Forks add additional edges to the subqueries and the cycle detector recurses down each of the sub-queries.

  • I had to contort the code quite a bit to ensure that locks aren't observed by the borrow checker across .awaits. Might be able this somewhat if async aware locks were used instead but there is currently no futures aware RwLock as far as I know.

  • futures based channels are used, even in sync queries. This could be fixed another abstraction so that the query specifies the channel type depending on if it is async or not. The CondVar based mini-executor should be possible to simplify such that it can just panic on wakes through that as well.

  • Some types are probably exposed more than they shouldn't be due to the aforementioned contortions

I could use some feedback at this stage if/when there is some time. The fork + cycle check changes could plausibly be extracted and worked out separately and could solve #80 .

@nikomatsakis
@matklad

@nikomatsakis
Copy link
Member

Heh, @Marwes, I am always so negligent in providing feedback to your PRs. Thanks for the helpful summary. I'll try to give this a more detailed work. Some of the notes sounded mildly worrisome to me, but I have to look more deeply at the code to form a real opinion I guess.

@Marwes
Copy link
Contributor Author

Marwes commented Jan 13, 2020

The cycle detection has been fixed (updated the notes) at least.

@nikomatsakis
Copy link
Member

I haven't had time for a detailed look, but thinking more about it I am quite nervous about using &mut self for query methods. That introduces quite a break with the compiler's query system, for one thing, and I still have some hopes of bringing salsa plus the compiler together, and I think it's going to be quite inconvenient. Experience in rustc suggests that it is very useful to be able to thread the "context" around quite freely, rather than having to track it linearly all the time.

I guess one alternative is to use a proper lock? Or perhaps a async-aware lock?

@Marwes
Copy link
Contributor Author

Marwes commented Jan 14, 2020

For async it is not enough with a lock, quoting myself.

Further, even if the RefCell were kept and queries were forced to return non-Send futures, that still wouldn't be correct as it would be possible to invalidate the stack with future combinators such as join! (whereas &mut correctly prevents this).

For synchronous code, a !Sync bound is enough to prevent concurrent uses. But that is not enough for asynchronous code.

But, if &mut is a bad idea for sync queries it is likely possible to have an API where only async requires &mut. Easiest way would be to just switch the RefCell to a Mutex while still having async queries requires &mut but we can probably do something better than that.

@Marwes
Copy link
Contributor Author

Marwes commented Jan 14, 2020

Another, alternate API may be to make Runtime cheaply "cloneable". Then code that requires & can just pass the database around by & and call queries like db.clone().query(123). That may have some unacceptable overhead however.

@nikomatsakis
Copy link
Member

@Marwes

OK, I see, maybe I hadn't deeply through the implications. You're saying that if you do

let x = db.some_query();

you don't necessarily await the query right then -- hence something like let x = join!(db.some_query(), db.some_other_query()).await could be a problem?

This is a problem specifically for our internal stack tracking, I suppose, which is quite stateful... It's not (at least not obviously...) a problem from a more "theoretical" point of view, is it?

I guess we have to be able to deal with the future being dropped, but that seems no different than generally being panic safe.

@Marwes
Copy link
Contributor Author

Marwes commented Jan 15, 2020

hence something like let x = join!(db.some_query(), db.some_other_query()).await could be a problem?

Precisely.

This is a problem specifically for our internal stack tracking, I suppose, which is quite stateful... It's not (at least not obviously...) a problem from a more "theoretical" point of view, is it?

Nope. Only the RefCell protected query_stack is an issue (as far as I can tell). Storing query_stack as an immutable, linked list would do the trick as well. That does however necessitate that each element is stored behind an Arc or equivalent so it would imply some overhead. Potentially the links in the list could be references on the stack, but at least async code would need a 'static value so tokio::spawn etc would work.

@nikomatsakis
Copy link
Member

@Marwes ok. I feel like I've had branches where I started rewriting the tracking in that way...but I never landed them, obviously. I doubt it would be much overhead.

@Marwes
Copy link
Contributor Author

Marwes commented Jan 17, 2020

I believe that the Runtime would at least need to be moved outside of the "Database", or at least LocalState would need to be moved out so that queries may append without needing to go through a snapshot equivalent.

@Marwes Marwes closed this Jan 17, 2020
@Marwes Marwes reopened this Jan 17, 2020
@Marwes Marwes closed this Jul 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants