[POC] Prototype async query functions #164

Marwes · 2019-04-30T22:37:00Z

Wanted to check how much code would need to be added/duplicated to
support async queries. This isn't a full implementation (I haven't
even tested calling async functions), but the normal sync implementation
still works with it, with what shouldn't be too much overhead.

Not really interested in this until async/await gets to stable but
figured a PR with this could still be useful.

nikomatsakis · 2019-05-05T15:19:00Z

I've been wondering for some time whether salsa should be based on async fn. I decided to start with just threads but I guess as async fn approaches stable this starts to be more plausible.

Marwes · 2019-05-06T08:13:35Z

Using plain sync functions is likely enough for most uses but I would really like to use async throughout the compiler to allow procedural macros to be async.

Arguably async queries do make cancellation cleaner, since it could be done by simply not continuing to execute the future (for a query to check whether its result is still needed, it only needs to yield control).

This experiment makes me quite confident that both can be supported with only a little overhead for the sync case, however.

nikomatsakis · 2019-06-26T10:09:33Z

Thanks for doing this @Marwes -- I'm wondering how much nicer it would be once rust-lang/rust#61775 lands (which should remove the need for lifetime parameters).

Marwes · 2019-06-26T10:20:54Z

Since it implements trait methods, it needs to return a Box<Future + 'a> in most cases, so I don't think that would help much?

nikomatsakis · 2019-08-15T11:04:36Z

@Marwes ah, ok, perhaps not.

Marwes · 2019-10-13T20:05:43Z

Rebased this again. If async queries are something that is wanted, I will try to get something decent done in time for the async/await release.

Made some changes from the first implementation so that this actually has a test that exercises an async query.
The async query -> sync query layer requires a Mutex + Condvar on every query, which isn't necessary. Should be fixable by parameterizing the query by a sync/async channel type.
There isn't any error checking for async input queries, which don't really make sense.
Async transparent queries should be easy to add.
Currently only non-Send futures are returned, which isn't workable in a non-toy example. Might need some unsafe here to get a BoxFuture which is Send/!Send depending on the key/database.
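
As an illustration of the "async query -> sync query layer" mentioned above, here is a minimal sketch of a Mutex + Condvar mini-executor that blocks the calling thread until a future completes. The names `Signal`, `block_on`, `async_query` and `sync_query` are hypothetical and are not taken from the PR; only standard-library APIs are used.

```rust
use std::future::Future;
use std::sync::{Arc, Condvar, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Wake-up flag shared between the blocked thread and the waker (hypothetical).
struct Signal {
    woken: Mutex<bool>,
    cond: Condvar,
}

impl Wake for Signal {
    fn wake(self: Arc<Self>) {
        // Record the wake-up and unblock the thread waiting in `block_on`.
        *self.woken.lock().unwrap() = true;
        self.cond.notify_one();
    }
}

/// Drives `fut` to completion on the current thread.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let signal = Arc::new(Signal {
        woken: Mutex::new(false),
        cond: Condvar::new(),
    });
    let waker = Waker::from(Arc::clone(&signal));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
        // Sleep until the future's waker fires, then poll again.
        let mut woken = signal.woken.lock().unwrap();
        while !*woken {
            woken = signal.cond.wait(woken).unwrap();
        }
        *woken = false;
    }
}

/// Stand-in for an async query.
async fn async_query(key: u32) -> u32 {
    key * 2
}

/// The sync entry point simply blocks on the async one.
fn sync_query(key: u32) -> u32 {
    block_on(async_query(key))
}

fn main() {
    assert_eq!(sync_query(21), 42);
}
```

Every call to `block_on` allocates its own `Signal`, which mirrors the per-query Mutex + Condvar cost described above.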

Marwes · 2019-10-14T20:40:20Z

So there is an issue with the LocalState type not being Sync. Since & references to the database are passed around, any future that contains such a reference is !Send.

If I remember correctly, this &/&mut encoding in the database is "just" to prevent the inputs from being mutated during a revision? Is there any way that can be preserved without needing internal mutability in the LocalState, so that this !Send issue can be fixed?

fn assert_send<T: Send>(t: T) -> T {
    t
}

async fn function(_: &AsyncDatabase) {}

#[test]
fn test_send() {
    assert_send(function(&AsyncDatabase::default()));
}

   Compiling salsa v0.13.0 (C:\Users\Markus\Dropbox\Programming\salsa)
error[E0277]: `std::cell::RefCell<std::vec::Vec<salsa::runtime::ActiveQuery<AsyncDatabase>>>` cannot be shared between threads safely
  --> tests\async.rs:50:5
   |
42 | fn assert_send<T: Send>(t: T) -> T {
   |    -----------    ---- required by this bound in `assert_send`
...
50 |     assert_send(function(&AsyncDatabase::default()));
   |     ^^^^^^^^^^^ `std::cell::RefCell<std::vec::Vec<salsa::runtime::ActiveQuery<AsyncDatabase>>>` cannot be shared between threads safely
   |
   = help: within `AsyncDatabase`, the trait `std::marker::Sync` is not implemented for `std::cell::RefCell<std::vec::Vec<salsa::runtime::ActiveQuery<AsyncDatabase>>>`
   = note: required because it appears within the type `salsa::runtime::local_state::LocalState<AsyncDatabase>`
   = note: required because it appears within the type `salsa::runtime::Runtime<AsyncDatabase>`
   = note: required because it appears within the type `AsyncDatabase`
   = note: required because of the requirements on the impl of `std::marker::Send` for `&AsyncDatabase`
   = note: required because it appears within the type `[static generator@tests\async.rs:46:38: 46:40 __arg0:&AsyncDatabase {}]`
   = note: required because it appears within the type `std::future::GenFuture<[static generator@tests\async.rs:46:38: 46:40 __arg0:&AsyncDatabase {}]>`
   = note: required because it appears within the type `impl core::future::future::Future`
   = note: required because it appears within the type `impl core::future::future::Future`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0277`.
error: could not compile `salsa`.

To learn more, run the command again with --verbose.
[Finished running. Exit status: 0]
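
For contrast, a minimal sketch of the same check when the shared state sits behind a Mutex instead of a RefCell. `&T` is Send only when `T` is Sync, and `Mutex<Vec<_>>` is Sync where `RefCell<Vec<_>>` is not, so the future can move to another thread. `MutexDatabase`, `NoopWake` and `depth_on_other_thread` are hypothetical names; this only demonstrates the Send/Sync relationship and is not the approach the PR ends up taking.

```rust
use std::future::Future;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

fn assert_send<T: Send>(t: T) -> T {
    t
}

/// Hypothetical stand-in for a database whose local state uses a Mutex.
struct MutexDatabase {
    query_stack: Mutex<Vec<&'static str>>,
}

async fn function(db: &MutexDatabase) -> usize {
    let depth = db.query_stack.lock().unwrap().len();
    depth
}

/// Waker that does nothing; enough to poll a future that never suspends.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Builds the future here, then polls it on a different (scoped) thread.
fn depth_on_other_thread() -> usize {
    let db = MutexDatabase {
        query_stack: Mutex::new(vec!["root"]),
    };
    // Compiles because `MutexDatabase: Sync`, hence `&MutexDatabase: Send`.
    let future = assert_send(function(&db));
    std::thread::scope(|scope| {
        scope
            .spawn(move || {
                let mut future = Box::pin(future);
                let waker = Waker::from(Arc::new(NoopWake));
                let mut cx = Context::from_waker(&waker);
                match future.as_mut().poll(&mut cx) {
                    Poll::Ready(depth) => depth,
                    Poll::Pending => unreachable!("`function` never awaits"),
                }
            })
            .join()
            .unwrap()
    })
}

fn main() {
    assert_eq!(depth_on_other_thread(), 1);
}
```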

frondeus · 2019-12-20T16:51:10Z

I also was thinking a little bit about async queries - and I think it could be nice to wait for structured concurrency.

See example: tokio-rs/tokio#1879

frondeus · 2019-12-20T16:55:25Z

About mutability - I think maybe we could split the methods into two traits, Database and DatabaseMut.

Database would be used in queries - Getting inputs from storage, executing other queries etc. while DatabaseMut in the main thread only - setting inputs.
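
A minimal sketch of that split, assuming a toy HashMap-backed store. The trait names follow the comment above; `Storage`, `input`, `doubled`, `set_input` and `run_query` are hypothetical and do not reflect salsa's generated code.

```rust
use std::collections::HashMap;

/// Read-only side: what running queries are allowed to see.
trait Database {
    fn input(&self, key: u32) -> u64;

    /// A derived query built from other queries.
    fn doubled(&self, key: u32) -> u64 {
        self.input(key) * 2
    }
}

/// Mutating side: only available where a `&mut` to the database exists.
trait DatabaseMut: Database {
    fn set_input(&mut self, key: u32, value: u64);
}

#[derive(Default)]
struct Storage {
    inputs: HashMap<u32, u64>,
}

impl Database for Storage {
    fn input(&self, key: u32) -> u64 {
        // Missing inputs default to 0 to keep the sketch short.
        self.inputs.get(&key).copied().unwrap_or(0)
    }
}

impl DatabaseMut for Storage {
    fn set_input(&mut self, key: u32, value: u64) {
        self.inputs.insert(key, value);
    }
}

/// Query code receives only `Database`, so `set_input` is not callable here.
fn run_query(db: &dyn Database) -> u64 {
    db.doubled(1) + db.doubled(2)
}

fn main() {
    let mut db = Storage::default();
    db.set_input(1, 10);
    db.set_input(2, 5);
    assert_eq!(run_query(&db), 30);
}
```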

Marwes · 2019-12-20T21:29:34Z

> I also was thinking a little bit about async queries - and I think it could be nice to wait for structured concurrency.

That might be nice since it could make it possible to ensure that sub-queries are done before the caller returns! Currently that requires an Arc + a runtime assertion in a Drop impl to make a best-effort check for it.

> About mutability - I think maybe we could split the methods into two traits, Database and DatabaseMut.

Yep, I am leaning towards that as well. It is unfortunate that it infects the current sync queries too, but I suspect that they will need that anyway if they are to support running sub-queries in parallel. (https://salsa.zulipchat.com/#narrow/stream/145099-general/topic/Sync.20database)

frondeus · 2019-12-21T01:48:09Z

src/lib.rs

+    DB: Database,
+{
+    fn drop(&mut self) {
+        if !std::thread::panicking() {


What about https://doc.rust-lang.org/std/mem/fn.forget.html?

Marwes · 2019-12-21

Since there is no unsafe invariant to uphold, forgetting the Forker just means that the query may end up in a bad state; no memory safety depends on the drop.
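
The Arc + Drop check under discussion can be sketched roughly as follows. `ForkScope` and `ForkHandle` are hypothetical names, not the PR's `Forker`; the scope counts outstanding handles through the Arc's strong count and asserts on drop unless the thread is already panicking.

```rust
use std::sync::Arc;

/// Owner side of a fork; its Drop checks that every fork has finished.
struct ForkScope {
    live: Arc<()>,
}

/// Held by a running sub-query; dropping it marks the sub-query as done.
struct ForkHandle {
    _live: Arc<()>,
}

impl ForkScope {
    fn new() -> Self {
        ForkScope { live: Arc::new(()) }
    }

    fn fork(&self) -> ForkHandle {
        ForkHandle {
            _live: Arc::clone(&self.live),
        }
    }

    /// Number of handles that have not been dropped yet.
    fn outstanding(&self) -> usize {
        Arc::strong_count(&self.live) - 1
    }
}

impl Drop for ForkScope {
    fn drop(&mut self) {
        // Skip the check while unwinding so a failure does not double-panic.
        if !std::thread::panicking() {
            assert_eq!(
                self.outstanding(),
                0,
                "fork scope dropped while sub-queries are still running"
            );
        }
    }
}

fn main() {
    let scope = ForkScope::new();
    let handle = scope.fork();
    assert_eq!(scope.outstanding(), 1);
    drop(handle);
    drop(scope); // passes: no sub-query is outstanding

    // `mem::forget` skips the Drop check entirely. The check is best effort:
    // nothing unsafe depends on it, so this is a logic bug at worst.
    let leaked = ForkScope::new();
    let _still_running = leaked.fork();
    std::mem::forget(leaked);
}
```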

Marwes · 2019-12-29T23:55:38Z

So the PR as is basically works, but it is in rough shape. Highlights/issues atm:

Send + Sync is a requirement now on keys/values. This is forced due to needing to box the returned futures in all the traits that salsa defines. I don't think this is a big issue when using async queries since you almost certainly want Send futures anyway but it ends up impacting sync queries as well and I had to remove tests or replace non-Sync types in them to get things working.

I would expect non-Sync/Send types as keys and values to be fairly niche, since any such queries can't be parallelized, so maybe it isn't that big of a problem; not sure. It might be possible to restore non-Sync/Send for sync queries with some trickery (but no unsafe), or with much trickery and unsafe for both sync and async queries if that is desired, however (I can write on that separately though).
The generated database trait now takes &mut self on all queries, as the RefCell in the query-local runtime was removed to achieve Send futures. All the actually mutating methods (set_input etc.) were moved to a *Mut sibling trait, which should ensure that those methods can't be called inside a running query.

While taking &mut in queries is a bit less intuitive and perhaps a bit more error prone I do believe this can be an improvement for sync queries as well. While the RefCell does ensure that sub-queries can't be run concurrently with threads (which would break the local query stack), it does so in a clumsier way than a &mut reference. The &mut reference does still prevent concurrency, but it still allows the query to be run in a different (scoped) thread which is perfectly correct.

Further, even if the RefCell were kept and queries were forced to return non-Send futures, that still wouldn't be correct as it would be possible to invalidate the stack with future combinators such as join! (whereas &mut correctly prevents this).
Query forking has been added which lets multiple sub-queries be run concurrently as long as they all complete before the caller returns. This should work for parallel sync queries as well but I have only tested it with async.

The forking currently uses an Arc to remove any need for lifetimes, since running async queries on a thread pool (tokio::spawn etc.) requires 'static. For sync queries it may be better to offer an alternative which just borrows, as it could be used to statically ensure that all queries run and complete before the fork scope ends, instead of doing this check dynamically via Drop.
Cycle detection was in an awkward place, since it needs to check that neither the current runtime id nor any parent runtime id is in a cycle. It worked as best I could tell, but I thought it should use a real graph (https://crates.io/crates/petgraph ?) plus cycle detection instead. Update: more or less fixed. Forks add additional edges to the sub-queries and the cycle detector recurses down each of the sub-queries.
I had to contort the code quite a bit to ensure that locks aren't observed by the borrow checker across .awaits. Might be able to improve this somewhat if async-aware locks were used instead, but there is currently no futures-aware RwLock as far as I know.
futures-based channels are used, even in sync queries. This could be fixed with another abstraction so that the query specifies the channel type depending on whether it is async or not. It should also be possible to simplify the Condvar-based mini-executor through that, so that it can just panic on wakes.
Some types are probably exposed more than they should be due to the aforementioned contortions.
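
To illustrate the point above that a `&mut` reference prevents concurrent use while still letting a query run on a different (scoped) thread, here is a minimal sketch. `LocalState`, `query` and `run_on_scoped_thread` are hypothetical stand-ins; the query stack is a plain Vec with no RefCell.

```rust
/// Query-local state with no interior mutability (hypothetical stand-in).
struct LocalState {
    query_stack: Vec<&'static str>,
}

/// A "query" that needs exclusive access to the stack while it runs.
/// Returns the stack depth observed while the query was active.
fn query(state: &mut LocalState, name: &'static str) -> usize {
    state.query_stack.push(name);
    let depth = state.query_stack.len();
    state.query_stack.pop();
    depth
}

fn run_on_scoped_thread() -> usize {
    let mut state = LocalState {
        query_stack: vec!["root"],
    };
    // The `&mut` borrow moves to the spawned thread. The borrow checker
    // rejects any second use of `state` until the scope has ended.
    let depth = std::thread::scope(|scope| {
        scope.spawn(|| query(&mut state, "child")).join().unwrap()
    });
    // The borrow is over, so the caller can use the state again.
    assert_eq!(state.query_stack, ["root"]);
    depth
}

fn main() {
    assert_eq!(run_on_scoped_thread(), 2);
}
```

With a RefCell in `LocalState` the type would be !Sync, but it would still be Send; what the `&mut` version adds is that exclusivity is checked statically rather than at run time.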

I could use some feedback at this stage if/when there is some time. The fork + cycle check changes could plausibly be extracted and worked out separately and could solve #80 .

@nikomatsakis
@matklad

nikomatsakis · 2020-01-13T14:28:13Z

Heh, @Marwes, I am always so negligent in providing feedback to your PRs. Thanks for the helpful summary. I'll try to give this a more detailed look. Some of the notes sounded mildly worrisome to me, but I have to look more deeply at the code to form a real opinion, I guess.

Marwes · 2020-01-13T14:36:56Z

The cycle detection has been fixed (updated the notes) at least.

nikomatsakis · 2020-01-14T14:09:27Z

I haven't had time for a detailed look, but thinking more about it I am quite nervous about using &mut self for query methods. That introduces quite a break with the compiler's query system, for one thing, and I still have some hopes of bringing salsa plus the compiler together, and I think it's going to be quite inconvenient. Experience in rustc suggests that it is very useful to be able to thread the "context" around quite freely, rather than having to track it linearly all the time.

I guess one alternative is to use a proper lock? Or perhaps a async-aware lock?

Marwes · 2020-01-14T14:20:52Z

For async, a lock is not enough. Quoting myself:

> Further, even if the RefCell were kept and queries were forced to return non-Send futures, that still wouldn't be correct as it would be possible to invalidate the stack with future combinators such as join! (whereas &mut correctly prevents this).

For synchronous code, a !Sync bound is enough to prevent concurrent uses. But that is not enough for asynchronous code.
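
A minimal single-threaded sketch of that failure mode. Two hypothetical "queries" share a RefCell-protected stack, each pushing on entry and popping on exit with one yield in between. `YieldOnce`, `poll_both` (a hand-rolled stand-in for join!) and the other names are illustrative assumptions, not salsa code.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Future that returns Pending exactly once.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            Poll::Pending
        }
    }
}

/// Pushes its name, yields, then pops. Returns whether it popped its own frame.
async fn query(stack: &RefCell<Vec<&'static str>>, name: &'static str) -> bool {
    stack.borrow_mut().push(name);
    YieldOnce { yielded: false }.await;
    let top = stack.borrow_mut().pop();
    top == Some(name)
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls two futures alternately until both finish, as a join combinator would.
fn poll_both<A: Future, B: Future>(a: A, b: B) -> (A::Output, B::Output) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let (mut a, mut b) = (Box::pin(a), Box::pin(b));
    let (mut out_a, mut out_b) = (None, None);
    while out_a.is_none() || out_b.is_none() {
        if out_a.is_none() {
            if let Poll::Ready(v) = a.as_mut().poll(&mut cx) {
                out_a = Some(v);
            }
        }
        if out_b.is_none() {
            if let Poll::Ready(v) = b.as_mut().poll(&mut cx) {
                out_b = Some(v);
            }
        }
    }
    (out_a.unwrap(), out_b.unwrap())
}

/// Awaiting one query after the other keeps the stack discipline intact.
fn sequential() -> (bool, bool) {
    let stack: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    let both = async {
        let a = query(&stack, "a").await;
        let b = query(&stack, "b").await;
        (a, b)
    };
    let (result, ()) = poll_both(both, std::future::ready(()));
    result
}

/// Interleaving the two queries makes each pop the other's frame.
fn interleaved() -> (bool, bool) {
    let stack: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
    let result = poll_both(query(&stack, "a"), query(&stack, "b"));
    result
}

fn main() {
    assert_eq!(sequential(), (true, true));
    assert_eq!(interleaved(), (false, false));
}
```

No RefCell borrow ever overlaps here, so nothing panics; the stack is simply wrong. If `query` took `&mut` access to the stack instead, `interleaved` would be rejected at compile time because both futures would need the exclusive borrow at once.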

But if &mut is a bad idea for sync queries, it is likely possible to have an API where only async requires &mut. The easiest way would be to just switch the RefCell to a Mutex while still having async queries require &mut, but we can probably do something better than that.

Marwes · 2020-01-14T14:25:51Z

An alternative API may be to make Runtime cheaply "cloneable". Then code that requires & can just pass the database around by & and call queries like db.clone().query(123). That may have some unacceptable overhead, however.

nikomatsakis · 2020-01-15T17:40:24Z

@Marwes

OK, I see, maybe I hadn't thought deeply through the implications. You're saying that if you do

let x = db.some_query();

you don't necessarily await the query right then -- hence something like let x = join!(db.some_query(), db.some_other_query()).await could be a problem?

This is a problem specifically for our internal stack tracking, I suppose, which is quite stateful... It's not (at least not obviously...) a problem from a more "theoretical" point of view, is it?

I guess we have to be able to deal with the future being dropped, but that seems no different than generally being panic safe.

Marwes · 2020-01-15T19:14:28Z

> hence something like let x = join!(db.some_query(), db.some_other_query()).await could be a problem?

Precisely.

> This is a problem specifically for our internal stack tracking, I suppose, which is quite stateful... It's not (at least not obviously...) a problem from a more "theoretical" point of view, is it?

Nope. Only the RefCell protected query_stack is an issue (as far as I can tell). Storing query_stack as an immutable, linked list would do the trick as well. That does however necessitate that each element is stored behind an Arc or equivalent so it would imply some overhead. Potentially the links in the list could be references on the stack, but at least async code would need a 'static value so tokio::spawn etc would work.
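
A minimal sketch of the immutable linked-list idea, with each frame behind an Arc. `Frame` and `QueryStack` are hypothetical names. Pushing returns a new stack that shares its tail with the parent, so concurrently running sub-queries cannot disturb each other's view.

```rust
use std::sync::Arc;

/// One entry in the query stack; the parent link is shared and never mutated.
struct Frame {
    query: &'static str,
    parent: Option<Arc<Frame>>,
}

/// Persistent stack: cloning or pushing costs one Arc operation.
#[derive(Clone, Default)]
struct QueryStack {
    top: Option<Arc<Frame>>,
}

impl QueryStack {
    /// Returns a new stack with `query` on top; `self` is left unchanged.
    fn push(&self, query: &'static str) -> QueryStack {
        QueryStack {
            top: Some(Arc::new(Frame {
                query,
                parent: self.top.clone(),
            })),
        }
    }

    /// Queries from the root down to the top of the stack.
    fn path(&self) -> Vec<&'static str> {
        let mut names = Vec::new();
        let mut frame = self.top.as_deref();
        while let Some(f) = frame {
            names.push(f.query);
            frame = f.parent.as_deref();
        }
        names.reverse();
        names
    }

    /// Walks the parent links, e.g. as the basis of a cycle check.
    fn contains(&self, query: &str) -> bool {
        self.path().iter().any(|name| *name == query)
    }
}

fn main() {
    let root = QueryStack::default().push("root");
    // Two sub-queries started from the same parent, as with join!.
    let a = root.push("a");
    let b = root.push("b");
    assert_eq!(a.path(), ["root", "a"]);
    assert_eq!(b.path(), ["root", "b"]);
    assert!(!a.contains("b"));
}
```

Since `QueryStack` holds only Arcs and `&'static str`, it is Send and 'static, which is what spawning onto an executor such as tokio::spawn would require. The cost is one allocation per pushed frame.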

nikomatsakis · 2020-01-16T15:19:14Z

@Marwes ok. I feel like I've had branches where I started rewriting the tracking in that way...but I never landed them, obviously. I doubt it would be much overhead.

Marwes · 2020-01-17T21:32:32Z

I believe that the Runtime would at least need to be moved outside of the "Database", or at least LocalState would need to be moved out so that queries may append without needing to go through a snapshot equivalent.

Marwes force-pushed the async branch 2 times, most recently from c3766f7 to d1b6867 · October 13, 2019 19:47

Marwes added 15 commits December 20, 2019 22:39

channel

b394932

Working again

5281de0

feat: async fn can be used to specify async queries

231e57e

Skip an Arc in sync_future

bea8eae

rustfmt

9722213

Allow async, transparent queries

5687df1

mut

07fa6c0

Add peek to queries

09e14d3

Add fork

66e2746

Use futures 0.3

578392d

Split mutation queries into their own trait

2121632

Remove unnecessary Fork type

92b17c7

Send

961327f

Fix async test

fa80e8e

Marwes force-pushed the async branch from b40211b to 92b17c7 · December 20, 2019 21:40

Marwes added 2 commits December 20, 2019 22:56

Hide the Arc in ForkState

06248f0

Make test dbs Send + Sync

88abaa8

Marwes added 2 commits December 20, 2019 23:55

Make tests work with non-send/sync

6ae17a6

Add some docs

14de3b5

frondeus reviewed Dec 21, 2019

perf: Avoid boxing a future for sync queries

ef2d7a5

Only use oneshot channels to signal completion

3fddf72

Marwes closed this Jan 17, 2020

Marwes reopened this Jan 17, 2020

Marwes closed this Jul 26, 2020
