Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deeper rust-analyzer Integration with Non-Cargo Build Systems #13446

Open
davidbarsky opened this issue Oct 20, 2022 · 39 comments
Open

Deeper rust-analyzer Integration with Non-Cargo Build Systems #13446

davidbarsky opened this issue Oct 20, 2022 · 39 comments
Labels
A-rust-project rust-project.json related issues C-enhancement Category: enhancement

Comments

@davidbarsky
Copy link
Contributor

This issue connects several disparate issues and discussions, such as #13437, #13128, #12105, #13226, and several others that I'm probably missing. This issue will be a bit handy-wavy, but I think there's value in describing my ideal user experience, which is to make rust-analyzer work with Buck/Bazel/et. al. as just as well as if they were Cargo.

The Current State

My current employer has one of the largest monorepos in the world and uses the Buck build systemto manage the vast majority of Rust builds. At the moment, most users connect to a remote dev server via VS Code and have their VS Code workspace be at the root of the monorepo (these are largely immutable user behavior and expectations). They then run a small CLI that spits out a rust-project.json (e.g., rust-project develop //foo/bar/my-target) and work on their project from the root of the workspace. To get cargo check-style diagnostics, people need to manually invoke Buck with a check-flavored build as such: buck build :my-target[check].

Possible Improvements

In no particular order, since I see these as roughly equivalent in impact:

  • After VS Code has activated the rust-analyzer extension (for instance, after the user has opened a Rust file), rust-analyzer-the-language-server should be able to delegate to an external tool (in my case, rust-project) that can query the build system for the owning TARGETs, generate a rust-project.json, and continue with the remainder of the rust-project.json-based workspace loading. This option would occur after rust-analyzer checks for a Cargo.toml or a rust-project.json, doesn't find them, and a "discover workspace" (name to be bikeshed!) command is defined.
  • The flycheck crate should be configurable to run builds/checks through a non-Cargo-based build system. When I last looked at the flycheck crate, it didn't seem like a particularly difficult change, just one that nobody has gotten around to designing or implementing.
  • All of the above features can be configured through a rust-analyzer.toml (which I don't believe has an issue), which can be placed at the root of the monorepo, which would reduce how many configuration forks I need to make for the rust-analyzer extension.
@bjorn3
Copy link
Member

bjorn3 commented Oct 20, 2022

The flycheck crate should be configurable to run builds/checks through a non-Cargo-based build system. When I last looked at the flycheck crate, it didn't seem like a particularly difficult change, just one that nobody has gotten around to designing or implementing.

Does buck allow passing --error-format=json to rustc and forwarding the resulting json messages? Flycheck needs this as it can only parse error messages in rustc's json format. The default user facing error message format of rustc is unstable, non-trivial to parse and I believe misses some information that rust-analyzer needs.

@davidbarsky
Copy link
Contributor Author

Does buck allow passing --error-format=json to rustc and forwarding the resulting json messages? Flycheck needs this as it can only parse error messages in rustc's json format. The default user facing error message format of rustc is unstable, non-trivial to parse and I believe misses some information that rust-analyzer needs.

It does. At the moment, IDE-specific diagnostics are written to a deterministic folder in the output directory, and while I think that I'd need to convince the Buck team to revisit their stance on writing to, e.g., stderr, I feel that it's something that's pretty tractable and I could succeed on.

@Veykril
Copy link
Member

Veykril commented Oct 20, 2022

The flycheck crate should be configurable to run builds/checks through a non-Cargo-based build system. When I last looked at the flycheck crate, it didn't seem like a particularly difficult change, just one that nobody has gotten around to designing or implementing.

What exactly do you mean here? You can already fully overwrite the command that is being executed by flycheck (the flycheck code just mentions cargo everywhere still even though its really just some command with a specific json output at this point)

Ye I believe we don't have an issue for that yet, but we probably want to have such a config at some point

@davidbarsky
Copy link
Contributor Author

davidbarsky commented Oct 20, 2022

What exactly do you mean here? You can already fully overwrite the command that is being executed by flycheck (the flycheck code just mentions cargo everywhere still even though its really just some command with a specific json output at this point)

I stand corrected! I just assumed... that it didn't work. I'm seeing rust-analyzer invoke buck correctly when I point rust-analyzer to the right target and have a shells script invoke buck and read out the JSON diagnostics from rustc. The main issue is that it requires a bit of manual configuration. Would y'all accept a PR that adds a field to rust-project.json that overrides rust-analyzer.checkOnSave.overrideCommand (as the tool generating the rust-project.json has the best knowledge of what command to run to get diagnostics)?

Ye I believe we don't have an issue for that yet, but we probably want to have such a config at some point

I'm guessing that this refers to the currently hypothetical rust-analyzer.toml?

@Veykril
Copy link
Member

Veykril commented Oct 21, 2022

Would y'all accept a PR that adds a field to rust-project.json that overrides rust-analyzer.checkOnSave.overrideCommand (as the tool generating the rust-project.json has the best knowledge of what command to run to get diagnostics)?

I feel like it would make more sense to support that setting in a rust-analyzer.toml?

I'm guessing that this refers to the currently hypothetical rust-analyzer.toml?

Yes sorry, I failed to copy the quoted text :)

@googleson78
Copy link

googleson78 commented Oct 21, 2022

To chime in, because I'm working on the exact same thing (as you already linked)

cargo check

Being able to checkOnSave.overrideCommand is great and somewhat works, but we need to have finer granularity on running it, because right now it's a static command, and we can't specify "please only build this one single file and its dependencies". In bigger repos, rebuilding everything is not viable, and having to wait 30 minutes to build because a PR unrelated to you changed something entirely unrelated to you sucks.

"Run tests"

The code lenses that pop up saying Run tests currently don't get disabled even if you use rust-project.json, even though they will still try to use cargo. They should get disabled. However ideally it would be really good if we could set what command to run instead of cargo for the tests! If the "current file" variable interpolation is also implemented, I imagine that that would be good enough to provdie specific enough information to the external tool about what tests to run.

I'll write a separate issue for this.

Config

Having a specific file for this config would be great. To be honest I don't like having the configuration passed in from the LSP client, because then every dev has to manually do that. I'm not sure if this is good design however.

@davidbarsky
Copy link
Contributor Author

davidbarsky commented Oct 21, 2022

I feel like it would make more sense to support that setting in a rust-analyzer.toml?

@Veykril My feeling is that it makes sense to support this feature in both places—I don't think I can share exact numbers, but at least for my employer, it's not feasible to have a single rust-analyzer.toml that defines a single build command for the entire repository. Unlike something like cargo check or cargo build, which is able to be used across projects and is able to infer what package is being built by looking at the present working directory, Buck requires specifying the exact package ("target", in Buck's parlance) to be built. It's as if Cargo required every build to be specified with cargo build -p some-package. For cultural and code organization reasons, it also doesn't really make sense to place a rust-analyzer.toml alongside each target that wants IDE support—Buck supports target trees that are arbitrarily nested (unlike Cargo workspaces, which are typically only a single level deep), and people expect to navigate across those targets and continue to receive IDE support.

For Buck users, rust-project.json would ideally become a hidden implementation detail that is entirely controlled/mediated by either Buck or tooling that integrates with Buck. Since, at the moment, the user knows what Buck target they're intending to work on (and are running the rust-project CLI accordingly), I think it makes the make sense to colocate the build command alongside the project graph. I hope that provided sufficient detail, and if not, I'm happy to elaborate any point.


To put what I said in a different, higher-level way: roughly speaking, rust-analyzer assumes a model where one VS Code workspace corresponds to a single editable/buildable project (this is how I work in the monorepo, but it's becoming clear that I'm the exception). A lot of folks who expect monorepo-oriented tooling have a mental model of one VS Code workspace corresponding to many, many, editable/buildable projects, only one or two of which the user interested in working on. For better or worse, rust-analyzer needs a bit of help knowing which node in this very, very, wide tree it needs to analyze and build-on-save, and a rust-project.json seems like the natural place to put that information.

(I assume that everything I said about Buck also applies to Bazel, but I've never used Bazel.)

@googleson78
Copy link

@davidbarsky Yup, it's pretty much the same. We discussed this a bit and a potential solution in #13437

@Veykril
Copy link
Member

Veykril commented Oct 21, 2022

Yes you are right, I was mainly thinking of checkOnSave = diagnostics related config, but really it integrates like all the other build related configs, so having that be part of the rust-project.json makes sense still, at least for the override command.

@Veykril
Copy link
Member

Veykril commented Oct 22, 2022

@davidbarsky, Okay I got one more question regarding the rust-project.json stuff in regards to overrideCommand here. Would it not be feasible for you to generate a rust-analyzer.toml along side your rust-project.json generation?

@davidbarsky
Copy link
Contributor Author

@Veykril While it's not unreasonable to generate a rust-project.json alongside a rust-analyzer.toml Buck's equivalent of cargo metadata (buck cquery) returns absolute paths and, therefore, can't be committed since it includes people's usernames. The generated rust-project.json is also often 30-40k lines of JSON and the source control team prefers that the file wasn't committed. The build command for Buck that includes the target being built is also sensitive to the location from which the Buck is being invoked.

I think it'd be neat if the Buck integration tooling can do all of this automatically, in the background, by placing the generated rust-project.json in an Eden-managed tmpdir and communicated to rust-analyzer where to look, but that's a little bit further away.

To step back a little, I view the rust-project.json as ephemeral and per-user, whereas I see the rust-analyzer.toml as global to all rust-analyzer users in the monorepo. I'm most interested in a rust-analyzer.toml because it'd allow me to reduce how many configuration forks I need to make in vending rust-analyzer. At the moment, it's not a lot, but I'd like to be able to get it down to zero. That being said, I don't see why there can't be Cargo-style hierarchal configs with rust-project.json—I know some teams would want to set their own settings.


When trying to add support for loading the build command to rust-analyzer yesterday, I found I could make the change to load the rust-project.json in the main loop when rust-analyzer is checking for linked paths. It's not the cleanest change, but it'd avoid the need to change Arc<Config> to Arc<Mutex<Config>> or something similar.

@Veykril
Copy link
Member

Veykril commented Oct 22, 2022

While it's not unreasonable to generate a rust-project.json alongside a rust-analyzer.toml Buck's equivalent of cargo metadata (buck cquery) returns absolute paths and, therefore, can't be committed since it includes people's usernames.

Would you also need to specify absolute paths for the overrideCommands? And I was thinking of generating it without committing it (whats the point of having it generated by the user otherwise).

That being said, I don't see why there can't be Cargo-style hierarchal configs with rust-project.json—I know some teams would want to set their own settings.

I don't think I'd want any config to be specifiable in the rust-project.json, that's not the purpose of that file. It's purpose is to specify the structure of a project without using cargo. That's where rust-analyzer.toml should come into play imo.

@bjorn3
Copy link
Member

bjorn3 commented Oct 22, 2022

overrideCommand makes sense to me in rust-project.json as it defines how to build the project and if you have multiple projects each one is expected to have a different overrideCommand. Most other configs make sense to be specified at a user or vscode workspace level, but overrideCommand only really makes sense at a per-project level. Even when using cargo you may want to enable a cargo feature for one project (as it needs it for your purposes), but not for another project (as it doesn't have this feature at all).

@Veykril
Copy link
Member

Veykril commented Oct 22, 2022

Yes, in general I agree that it makes sense. I mainly wanted to confirm whether rust-analyzer.toml could make sense here as well or not, as I also associate that with project setups.

I also just realized

I don't think I'd want any config to be specifiable in the rust-project.json,

can be read differently than intended, with any I did not mean I do not want any config there at all, but not just any random config. Build specific ones do make sense (which means checkOnSave.overrideCommand and buildScripts, but anything else doesn't really seem to make too much sense (I also just realized that buildScripts having the cargo prefix seems like a bad choice, we probably want to alias that one).

@davidbarsky
Copy link
Contributor Author

Would you also need to specify absolute paths for the overrideCommands? And I was thinking of generating it without committing it (whats the point of having it generated by the user otherwise).

I'm not sure, but I'm leaning towards a "no" from necessity standpoint but "yes" for ease of supporting and debugging people's environments. And yes, agreed on the generation of the rust-project.json.

As for the second point about hierarchal rust-project.jsons—that was a typo, I meant to refer to rust-analyzer.toml instead, sorry about that!

@Veykril
Copy link
Member

Veykril commented Nov 2, 2022

I opened #13529 for a discussion regarding a rust-analyzer.toml

@Veykril Veykril added the A-rust-project rust-project.json related issues label Nov 2, 2022
@googleson78
Copy link

googleson78 commented Nov 10, 2022

Getting back to this, I think that working around the core difference between r-a's assumption ("we have many small Cargo.tomls unified by a cargo workspace") and the way rust-project.json works (one global rust-project.json for all the different packages) will not be sustainable long term. It feels like one of those situations where the secondary solution will always be left behind in terms of support and more effort will be spend thinking of ways to work around the differences. Maybe I'm just very uninformed about the internals of r-a though, so this might not be a correct conclusion.

Would it not be possible instead to

  1. Modify rust-analyzer.json structure (or more realistically create a new format) so that there can be many rust-project.jsons ala Cargo.tomls which then get handled by r-a in the same way in which multiple Cargo.tomls are handled
  2. Update generation downstream (in i.e. in the buck and bazel rulesets for rust) so that they output many rust-analyzer.jsons in the corresponding directories for the different packages.

?

@bjorn3
Copy link
Member

bjorn3 commented Nov 10, 2022

rust-project.json is equivalent to a cargo workspace. Internally rust-analyzer effectively produces an in memory equivalent to rust-project.json based on running cargo metadata in the workspace root. I think the build system you use should produce a rust-project.json file with all crates. If you are in a mono repo, it should generate a rust-project.json file with the crate you are working on and all of it's dependencies rather than all crates. What would be the benefit of splitting it into many files if rust-analyzer would merge it into a single big file again? Rust-analyzer can't just decide to not load part of it on it's own as not loading some crates would break find all references for all references in the crates that aren't loaded.

@googleson78
Copy link

Ah, I see, thanks! Then it makes sense that there's not much point in splitting it back up again.

it should generate a rust-project.json file with the crate you are working on and all of it's dependencies rather than all crates.

This doesn't sound viable to me - afaiu, it means that whenever I jump across to a different (non-dependent) crate, a file would have to be regenerated(already big overhead here), which at least now, takes a non-trivial amount of time for rules_rust 🤔 . I have not done any analysis to say how much this can be sped up, but in order for the IDE experience to be good, it would have to be something like <1s, which sounds hard to do.

In the first place, what's the thing that allows r-a to know which crate to run e.g. cargo check for when working with Cargo.tomls, that is also missing from rust-project.json?

@bjorn3
Copy link
Member

bjorn3 commented Nov 10, 2022

afaiu, it means that whenever I jump across to a different (non-dependent) crate, a file would have to be regenerated(already big overhead here)

I meant that you tell the build system which project inside the mono repo you are working on and then the build system generates a rust-project.json file containing just what you need.

In the first place, what's the thing that allows r-a to know which crate to run e.g. cargo check for when working with Cargo.tomls, that is also missing from rust-project.json?

Rust-analyzer searches for all Cargo.toml files up to 1 level deep (could be more levels, but this isn't done because of performance concerns in large projects).

pub fn discover_all(paths: &[AbsPathBuf]) -> Vec<ProjectManifest> {
let mut res = paths
.iter()
.filter_map(|it| ProjectManifest::discover(it.as_ref()).ok())
.flatten()
.collect::<FxHashSet<_>>()
.into_iter()
.collect::<Vec<_>>();
res.sort();
res
}
}
It then runs cargo metadata on each one and considers it a separate workspace. I am quite certain there is code to prevent every workspace member from being considered a separate workspace, but I can't quickly find it. In any case rust-analyzer simply runs cargo check in the root of every workspace it knows about, just like for rust-project.json. #13336 will add an option to limit cargo check to just the crate that you changed. This does not limit rust-analyzer itself to just that crate. Rust-analyzer always loads everything and any change to the set of crates requires a full reload.

@davidbarsky
Copy link
Contributor Author

Since this issue got some additional activity today, I wanted to re-raise an idea for rust-analyzer that I mentioned in the issue:

After VS Code has activated the rust-analyzer extension (for instance, after the user has opened a Rust file), rust-analyzer-the-language-server should be able to delegate to an external tool (in my case, rust-project) that can query the build system for the owning TARGETs, generate a rust-project.json, and continue with the remainder of the rust-project.json-based workspace loading. This option would occur after rust-analyzer checks for a Cargo.toml or a rust-project.json, doesn't find them, and a "discover workspace" (name to be bikeshed!) command is defined.

Does any one have any objections to such an approach? I think I'd like to tackle this sooner than I otherwise expected since one of the most common bits of feedback I've received from users is that they don't like having to manually run rust-project because:

  • they want to open a Rust file and have completions/IDE functionality just work. It's really common to open VS Code from the Phabricator review tool or the code browser and the extra step of running rust-project on the file they just opened to just get rust-analyzer working is one that people tend (and would like to!) to avoid.
  • People would prefer not to care about which directory the rust-project.json is located in. This has been the biggest support burden for me, by far! While portions of this can be solved by changing the depth at which rust-analyzer searches for the Cargo.toml or rust-project.json, it'd be nice to sidestep that issue entirely.

@Veykril
Copy link
Member

Veykril commented Nov 11, 2022

Ye, I think we can go with that and see how it goes. On paper it makes sense to have something like this I think (with the future addition of a rust-analyzer.toml being able to define this command as well, at least in my head that makes sense to have it there as well)

@googleson78
Copy link

Just to be on the same page, isn't your use case for buck with a big rust monorepo which is all under the same WORKSPACE with many packages (corresponding to a cargo workspace with many cargo packages)? And then, wouldn't you want to have only one rust-project.json (because rust.project.json == cargo workspace), avoiding needing to regenerate rust-project.json when you jump between packages which aren't in your current dependency set (e.g. you've opened X and jump to Y which X doesn't depend on).

Not that it's directly related to your use case, but for rules_rust right now, it can't quickly generate a rust-project.json for the entire WORKSPACE, some building needs to be done first, and even if all the building is done, the processing still takes quite a while more than what would be acceptable (~2mins from a fully built repo). How fast is your rust-project tool and is it publicly available somewhere?

@davidbarsky
Copy link
Contributor Author

davidbarsky commented Nov 15, 2022

Ye, I think we can go with that and see how it goes. On paper it makes sense to have something like this I think (with the future addition of a rust-analyzer.toml being able to define this command as well, at least in my head that makes sense to have it there as well)

Cool! I'll try to get to this next week.

Just to be on the same page, isn't your use case for buck with a big rust monorepo which is all under the same WORKSPACE with many packages (corresponding to a cargo workspace with many cargo packages)?

@googleson78 sorry for the delay. Buck doesn't do things quite like how Bazel does it: there's no big WORKSPACE file, just arbitrarily many TARGETS in subdirectories. Therefore, I can't really get away with just one rust-project.json for the entire monorepo, since there isn't a single WORKSPACE definition. Even if there were a single workspace definition, it'd take too long to reasonably query everything and load into rust-analyzer, and I'm not really keen on caching pre-loaded rust-analyzer instances for a variety of reasons.

Not that it's directly related to your use case, but for rules_rust right now, it can't quickly generate a rust-project.json for the entire WORKSPACE, some building needs to be done first, and even if all the building is done, the processing still takes quite a while more than what would be acceptable (~2mins from a fully built repo).

rust-project is largely two buck2 cquerys under the hood, plus some JSON transformations (from what I can tell, bazel cquery is largely equivalent). It doesn't really handle code generation or proc macros at the moment, but I'm working on adding that this week.

How fast is your rust-project tool and is it publicly available somewhere?

When it's not writing to a virtual file system (e.g., EdenFS), rust-project is able to complete in about 2-3 seconds: it's generally pretty fast. When it's writing to EdenFS, it's about 10-13 seconds, which is not ideal. It's not currently open source, but I don't have any objections to potentially open-sourcing it. The code's a bit gross, though, so I'd like to clean that up first.

Having thought about this implementation, I thought of a few questions pertaining to Rust Analyzer for @Veykril, though:

  1. How reasonable it is for rust-analyzer to allow mutating values like Crate::is_workspace_member after the workspace is loaded (in other words, does rust-analyzer's incrementality extend to crates?) I ask this because i can imagine a person opening a Rust file, having a rust-project.json be generated for them, and then editing a Rust file in a neighboring crate. It'd be great if we can avoid spinning up a new instance of rust-analyzer with all the re-indexing that entails and instead reuse the existing rust-analyzer instance.
    1. There are some crates that shouldn't be modified (e.g., crates sourced from crates.io, generated code, etc.), but I guess defining what crates really shouldn't be edited can be defined through a blocklist in rust-analyzer.toml or the UI can affirmatively prompt the user to mark the crate as being part of the workspace.
  2. Is it possible to have something equivalent to a future for some crates in a rust-project.json? The context of this question is to try and load non-proc macro, non-generated code (e.g., any code that requires invoking rustc or a build tool) into Rust Analyzer first, but as Rust Analyzer is performing analysis, have the build system generate the remaining code, and load it in after. I have no idea how to implement this, but I figured it'd be worth asking.
  3. I remember @matklad saying that they wanted to revisit the design/structure of rust-project.json, but felt it was too difficult to change with the existing users (for what it's worth, it's very easy for me to migrate!). Do any of y'all have any ideas as what those changes might be/what things y'all are unhappy about with rust-project.json?

(Sorry if I @'d anyone in appropriately: I'm not entirely sure what the norms are for rust-analyzer, and I'm certainly not close to enough to @ y'all freely. Please let me know if I should be more conservative with my usage of mentions!)

@bjorn3
Copy link
Member

bjorn3 commented Nov 15, 2022

How reasonable it is for rust-analyzer to allow mutating values like Crate::is_workspace_member after the workspace is loaded (in other words, does rust-analyzer's incrementality extend to crates?)

The project layout is an input of the query/caching system. Changing is_workspace_member necessarily requires changing this input. Reloading the project structure is implemented as setting a new value for this input too: https://github.com/rust-lang/rust-analyzer/blob/0dd0dfb7dfe723bbfa81197c23d73c9bdc4022c1/crates/base-db/src/change.rs#LL74C79-L74C79 So changing is_workspace_member after loading is almost equivalent to reloading as far as I understand. Especially for is_workspace_member as it affects a lot of things.

Is it possible to have something equivalent to a future for some crates in a rust-project.json? The context of this question is to try and load non-proc macro, non-generated code (e.g., any code that requires invoking rustc or a build tool) into Rust Analyzer first, but as Rust Analyzer is performing analysis, have the build system generate the remaining code, and load it in after. I have no idea how to implement this, but I figured it'd be worth asking.

That again is pretty much equivalent to reloading as it requires invalidating the macro expansion result and everything derived from this.

@flodiebold
Copy link
Member

flodiebold commented Nov 15, 2022

The project layout is an input of the query/caching system. Changing is_workspace_member necessarily requires changing this input. Reloading the project structure is implemented as setting a new value for this input too: https://github.com/rust-lang/rust-analyzer/blob/0dd0dfb7dfe723bbfa81197c23d73c9bdc4022c1/crates/base-db/src/change.rs#LL74C79-L74C79 So changing is_workspace_member after loading is almost equivalent to reloading as far as I understand. Especially for is_workspace_member as it affects a lot of things.

That's true, but RA's analysis is incremental across these reloads. Small changes in the crate graph should not cause a lot of recalculation as long as the crate IDs stay the same. I'm not sure changing is_workspace_member would affect the analysis much at all; it's mostly used for IDE features, I think.

(I don't really understand why you would want to change is_workspace_member anyway, though. I'd expect all crates within the workspace that are actually listed in the generated rust-project.json to be marked as workspace members, not just the one the user originally wanted to edit.)

@Veykril
Copy link
Member

Veykril commented Nov 15, 2022

2. Is it possible to have something equivalent to a future for some crates in a `rust-project.json`? The context of this question is to try and load non-proc macro, non-generated code (e.g., any code that requires invoking `rustc` or a build tool) into Rust Analyzer first, but as Rust Analyzer is performing analysis, have the build system generate the remaining code, and load it in after. I have no idea how to _implement_ this, but I figured it'd be worth asking.

R-a actually already does work while build scripts are still evaluating, it's just that once the build scripts are finished running /proc-macros are built, this again changes the workspace input (as now build script outputs are attached) retriggering some queries that have been invalidated.

@matklad
Copy link
Member

matklad commented Nov 15, 2022

Some high level comments:

checkOnSave:

architecturally, checkOnSave is a hack. “run the real build using the real build system and display errors” is an important feature to have, buts its really none of rust-analyzer’s business to arrange that. We do that for cargo just to make more of the things work out of the box for more users, but this isn’t how it all should work.

for monorepo case, what you want to do is to set checkOnSave=false in rust-analyzer, and than have a separate vscode plugin which watches open files and code modifications, schedules buck/bazel/pants/whatever builds and displays errors in the editor.

@matklad
Copy link
Member

matklad commented Nov 15, 2022

On rust-project.json file

For the monorepo case, there shouldn’t be project.json file. rust-analyzer the binary supports getting the project structure as JSON object in config sent as an LSP message. So, for monorepo, what should happen is that every time a user opens a new project, rust analyzer should receive an LSP message with a new config embedded.

I think one approach one company used here is to have an extra proxy on the wire between vscode extension and rust-analyzer binary, with the proxy injecting configs. This is a bit roundabout. We should add some explicit hook to vscode extension to allow injecting the config.

So, again, we have a monorepo-specific extension in VS Code which watches what user is doing and pokes rust-analyzer with an updated config when appropriate.

@matklad
Copy link
Member

matklad commented Nov 15, 2022

(I think an elegant way to make this happen would be for hypothetical vsbuck extension to just modify vscode workspace settings dynamically, but it looks like vscode lacks APIs for configuration providers https://stackoverflow.com/questions/72194332/need-a-better-way-to-programmatically-clear-and-set-user-settings-in-vscode)

(I think these days rust-analyzer still shows “you must reload extension when you change this setting”, but I think this is accidental? I believe on the server side all configs are dynamic these days)

@matklad
Copy link
Member

matklad commented Nov 15, 2022

I'm not sure changing is_workspace_member would affect the analysis much at all; it's mostly used for IDE features, I think.

It also controlled salsa durability. Non-workspace members are “durable”, which means ra optimizes for code there to not change often.

@matklad
Copy link
Member

matklad commented Nov 15, 2022

On dynamic world:

rust-analyzer makes a soft assumption that it knows all the crates there are. While we could incrementally discover new crates, we try to push users towards specyfing all crates up-front.

This is important for “find usages” and for all the reactors which build on top of that. It would suck if you do a rename, and then the build fails because you haven’t yet open some downstream crate, and usages there weren’t refactored.

I think this is a right assumption for basically everything hosted on GitHub, but is wrong for giant monorepos. In particular, in monorepos find usages should be handled by some map reduce index and not by ra, and one does not ide-rename identifier across the whole monorepo.

Luckily, this is a weak assumption, and everything should basically just work if you dynamically add new crates to the crate graphs. If you are careful with id assignment, this should also be very incremental, though I expect that we might be doing some needless recomputations here, as we didn’t really optimized this. (Eg, crate graph is atomic query, I think we need to add firewall q for “deps of crate X” to avoid re-resolving std once you add a crate)

@davidbarsky
Copy link
Contributor Author

Gotcha, thanks all for responding.

bjorn3/florian:

That's true, but RA's analysis is incremental across these reloads. Small changes in the crate graph should not cause a lot of recalculation as long as the crate IDs stay the same. I'm not sure changing is_workspace_member would affect the analysis much at all; it's mostly used for IDE features, I think.

iirc, is_workspace_member allows for quicker load times by creating immutable syntax trees: when I made this change when generating rust-project.jsons at work, I've noticed that rust-analyzer is noticeably faster at loading the entire crate graph.

(I don't really understand why you would want to change is_workspace_member anyway, though. I'd expect all crates within the workspace that are actually listed in the generated rust-project.json to be marked as workspace members, not just the one the user originally wanted to edit.)

I mean, there are some rough heuristics that I can take, but the rust-project.json almost aways includes crates that were vendored from crates.io from a directory like third-party/rust or first-party, internal libraries (e.g., things pertaining to config, RPC middleware, logging, etc.). It's not common for most users to edit these libraries, but it is common for them to edit a crate that would be considered to be an immediate neighbor, as if it were a sibling in a Cargo workspace. Unfortunately, the line between "uncommonly edited core library" and "the code the user actually wants to edit" is a bit blurry, since there isn't anything like a Cargo workspace enforcing clear boundaries.


Lukas:

R-a actually already does work while build scripts are still evaluating, it's just that once the build scripts are finished running /proc-macros are built, this again changes the workspace input (as now build script outputs are attached) retriggering some queries that have been invalidated.

that's good to know!


matklad:

architecturally, checkOnSave is a hack. “run the real build using the real build system and display errors” is an important feature to have, buts its really none of rust-analyzer’s business to arrange that. We do that for cargo just to make more of the things work out of the box for more users, but this isn’t how it all should work.

for monorepo case, what you want to do is to set checkOnSave=false in rust-analyzer, and than have a separate vscode plugin which watches open files and code modifications, schedules buck/bazel/pants/whatever builds and displays errors in the editor.

That's probably fair, but I'll need to look again at the Buck extension. For better or worse, the terminal-rendered output from rustc is much better than what can be rendered over JSON inside of VS Code, but I'll take a look.

For the monorepo case, there shouldn’t be project.json file. rust-analyzer the binary supports getting the project structure as JSON object in config sent as an LSP message. So, for monorepo, what should happen is that every time a user opens a new project, rust analyzer should receive an LSP message with a new config embedded.

My gut feeling is that this approach is the right one (and, indeed, is what I've setup for neovim users), but as you've mentioned in a subsequent comment, contribution points to VS Code extensions are weirdly limited. To get around this issue, my employer's forked VS Code and added shared memory between extensions to allow that sort of nudging. However, that feels gross, and I'd rather try to steer clear of that for maintenance reasons. If Microsoft added a contribution point for configuration, then that would be the ideal solution.

rust-analyzer makes a soft assumption that it knows all the crates there are. While we could incrementally discover new crates, we try to push users towards specyfing all crates up-front.

Yeah, I think that's the right call for rust-analyzer. When buck cquery is run, it knows the entire build graph of the target in question and it's able to provide a result pretty quickly (often times, it's on the order of several hundred milliseconds). The build graph incrementalism I'm curious about/interested in is not discovering new dependencies, but rather, communicating to rust-analyzer something along the lines of "I promise there will be a crate here to analyze in a few seconds"

This is important for “find usages” and for all the reactors which build on top of that. It would suck if you do a rename, and then the build fails because you haven’t yet open some downstream crate, and usages there weren’t refactored.

I absolutely agree! Thankfully, most of the crates that I'm thinking of are things like "generated RPC definitions", which shouldn't be refactored by rust-analyzer anyways: changes like that should be made in the code generator, not in the downstream consumer.

I think this is a right assumption for basically everything hosted on GitHub, but is wrong for giant monorepos. In particular, in monorepos find usages should be handled by some map reduce index and not by ra, and one does not ide-rename identifier across the whole monorepo.

I think that there are two different kinds of refactors:

  • The person doing a refactor is a library consumer, which means that the refactor is scoped to the target(s) that people are working on. I worry you're selling rust-analyzer short, since it works really well in this use-case. For instance, on the largest Rust project I could find at my employer—Hack, clocking in at 600-700k lines of Rust code—rust-analyzer is able to load the entire rust-project.json (on my admittedly beefy dev server) in about 13-14 seconds. This is the fastest language servers that is supported at my employer, by a mile.
  • The person doing a refactor is a library vendor, which means that they need to update all usage sites. I think a fancier approach like you're suggesting is probably the right call, but I'm somewhat curious how well rust-analyzer can handle this. I have a feeling that rust-analyzer can handle this more gracefully than you're selling it.

Luckily, this is a weak assumption, and everything should basically just work if you dynamically add new crates to the crate graphs. If you are careful with id assignment, this should also be very incremental, though I expect that we might be doing some needless recomputations here, as we didn’t really optimized this. (Eg, crate graph is atomic query, I think we need to add firewall q for “deps of crate X” to avoid re-resolving std once you add a crate)

Gotcha. Naively, it seems like inserting crates at the end of the collection, as opposed to in the middle, can provide the firewall that you're referring to by not allowing access to indices higher than N.

@matklad
Copy link
Member

matklad commented Nov 15, 2022

contribution points to VS Code extensions are weirdly limited.

rust-analyzer's vscode extension itself can provide a contribution point here. We already have an ad-hoc one: #5017. Though, that one is "ok as a temporary solution" :) A proper thing to do here would be to use

There's a huge design space for what we actually expose that way, and what stability guarantees we provide though. I think a reasonable start is "expose whatevere, but we provide zero guarantees" :D

@davidbarsky
Copy link
Contributor Author

rust-analyzer's vscode extension itself can provide a contribution point here

That's... extremely cool and I had no idea this was possible. I was looking through https://code.visualstudio.com/api/references/contribution-points and didn't see anything along these lines (except contributes.typescriptServerPlugins, which isn't exactly what I'm looking for 😅), but the existence of this might make a lot of seemingly really hard things a lot easier.

There's a huge design space for what we actually expose that way, and what stability guarantees we provide though. I think a reasonable start is "expose whatevere, but we provide zero guarantees" :D

Oh yeah, that's totally reasonable—I'm fine with whatever breaking changes that need to be made as this is sorted out—I'm able to control distribution, so at least for me, my users won't be impacted by breaking changes. I'm not entirely sure what should be exposed, but I can look into it. Looking at what the APIs allow exposing, it seems like it's possible for rust-analyzer's VS Code extension to try and call one of the APIs its exposed, and if it returns nothing (indicating no other extension imported it/extended it), then rust-analyzer can proceed normally.

@davidbarsky
Copy link
Contributor Author

davidbarsky commented Dec 22, 2022

Cool! I'll try to get to this next week.

It, uh, took a bit longer than a week, but I hacked together something that works 😅. I'll post a PR in a day or two.


Anyways, I'm not sure whether to post the following on this issue or on #13446, but given the discussion about extensions invoking build systems here, this issue will do. For context: I was thinking about to how add Buck1-specific flycheck support to rust-analyzer and realized that that this would also apply to Bazel.

Anyway! I didn't realize why @googleson78 asked for support for $saved_file interpolation in #13437, but after struggling getting automatically configured build commands working with Buck, I realized that $saved_file allows Buck/Bazel to determine the owning Target of the $saved_file and build it while keeping one global command (sorry, I didn't pick up on that detail!). However, I have an alternative proposal that might mesh better with some of the other goals I mentioned above: add an optional build_command to project_json::Crate. When a file is saved, rust-analyzer can determine which crate was modified and invoke the corresponding build command. This might require spawning a new flycheck for each save, but I think that will be pretty infrequent.

(The reason I'm proposing extending rust-analyzer, as opposed to doing this inside of a Buck extension, is that I think rendering rustc diagnostics inline is better handled by rust-analyzer than a Buck extension2. A Buck/Bazel extension might need to handle a potentially unbounded number of languages, whereas rust-analyzer largely needs to handle Cargo and Buck/Bazel pretending to be Cargo.)

Footnotes

  1. Buck2 has been quietly open-sourced and it might be of interest to readers here: https://github.com/facebookincubator/buck2. Please don't share too widely, but there are overlapping approaches between it and rust-analyzer!

  2. And, to be clear, I can be convinced otherwise, especially if it's an unreasonable maintenance burden! That being said, I think there's something nice about VS Code/Emacs/Vim all getting build system support from rust-analyzer, out of the box.

@googleson78
Copy link

The reason I'm proposing extending rust-analyzer, as opposed to doing this inside of a Buck extension, is that I think rendering rustc diagnostics inline is better handled by rust-analyzer than a Buck extension

Hm, but how would you even get a Buck extension to render diagnostics to a LSP client? Even with the $saved_file stuff, you still only save the output that rustc generates inside Bazel/Buck and subsequently pass that output to rust-analyzer, which is still responsible for rendering/passing it along to an LSP client. Am I missing something obvious?

@bjorn3
Copy link
Member

bjorn3 commented Dec 23, 2022

The buck extension can directly pass the diagnostics to vscode. There doesn't need to be an LSP implementation for that. In fact vscode itself doesn't have any lsp client at all, rather the extension corresponding with the lsp server has an lsp client translating between LSP and the vscode API's. You can use those same vscode API's directly in the buck extension.

@davidbarsky
Copy link
Contributor Author

@googleson78: yeah, what bjorn3 said. If there does need to extension-level coordination, then using the contribution points that matklad pointed out in #13446 (comment) can be another mechanism to get this functionality. However, for somewhat complicated reasons that are idiosyncratic to my employer, it's an approach that I'd rather stay away from (or at the very least, one that I'm a bit nervous about).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-rust-project rust-project.json related issues C-enhancement Category: enhancement
Projects
None yet
Development

No branches or pull requests

6 participants