Expand ruff.configuration to allow inline config #16296

Merged 7 commits on Feb 26, 2025

Member

@dhruvmanila dhruvmanila commented Feb 21, 2025

Summary

Internal design document

This PR expands ruff.configuration to allow inline configuration directly in the editor. For example:

{
	"ruff.configuration": {
		"line-length": 100,
		"lint": {
			"unfixable": ["F401"],
			"flake8-tidy-imports": {
				"banned-api": {
					"typing.TypedDict": {
						"msg": "Use `typing_extensions.TypedDict` instead"
					}
				}
			}
		},
		"format": {
			"quote-style": "single"
		}
	}
}

This means that ruff.configuration now accepts either a path to a configuration file or the raw config itself. It's mostly similar to --config, with one difference that's highlighted in the following section. When given a config map, ruff.configuration uses the same format as the playground (see footnote 1).

Limitations

Casing (kebab-case vs. camelCase)

The config keys need to be in kebab-case, whereas the other editor settings use camelCase.

This could be a bit confusing. For example, the line-length option can be set directly via an editor setting or can be configured via ruff.configuration:

{
	"ruff.configuration": {
		"line-length": 100
	},
	"ruff.lineLength": 120
}

Possible solution

We could use a feature flag with conditional compilation so that the Options fields are renamed to camelCase when used in ruff_server and to kebab-case in the other crates. But this might not work very easily, because it would require wrapping the Options struct and creating two structs to which we'd have to add #[cfg_attr(...)]; otherwise serde complains:

error: duplicate serde attribute `rename_all`
  --> crates/ruff_workspace/src/options.rs:43:38
   |
43 | #[cfg_attr(feature = "editor", serde(rename_all = "camelCase"))]
   |                                      ^^^^^^^^^^

Nesting (flat vs. nested keys)

This is the major difference between the --config flag on the command line and ruff.configuration, and it gives ruff.configuration the same value format as the playground (see footnote 1).

The config keys need to be split into separate keys, which results in a nested structure instead of a flat one.

So, the following won't work:

{
	"ruff.configuration": {
		"format.quote-style": "single",
		"lint.flake8-tidy-imports.banned-api.\"typing.TypedDict\".msg": "Use `typing_extensions.TypedDict` instead"
	}
}

But, instead it would need to be split up like the following:

{
	"ruff.configuration": {
		"format": {
			"quote-style": "single"
		},
		"lint": {
			"flake8-tidy-imports": {
				"banned-api": {
					"typing.TypedDict": {
						"msg": "Use `typing_extensions.TypedDict` instead"
					}
				}
			}
		}
	}
}

Possible solution (1)

The way we could solve this and make it the same as --config would be to add manual logic that converts the JSON map into an equivalent TOML string, which would then be parsed into Options.

So, the following JSON map:

{ "lint.flake8-tidy-imports": { "banned-api": {"\"typing.TypedDict\".msg": "Use typing_extensions.TypedDict instead"}}}

would need to be converted into the following TOML string:

lint.flake8-tidy-imports = { banned-api = { "typing.TypedDict".msg = "Use typing_extensions.TypedDict instead" } }

by recursively converting "key": value into key = value, that is, removing the quotes from the key and replacing : with =.
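A minimal sketch of that recursive conversion, using only the standard library. The `Value` enum is a hypothetical stand-in for a parsed JSON value (it is not a serde_json or toml type), and `render_value` and `to_toml` are illustrative names, not functions from the Ruff codebase. Keys are emitted unquoted, exactly as described above, so a key that needs quoting must already carry its own quotes.

```rust
/// Hypothetical stand-in for a parsed JSON value (not a real serde_json type).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Str(String),
    Int(i64),
    Bool(bool),
    Array(Vec<Value>),
    /// Insertion-ordered key/value pairs, so the output is deterministic.
    Table(Vec<(String, Value)>),
}

/// Render a value using TOML inline syntax.
fn render_value(value: &Value) -> String {
    match value {
        // `{:?}` yields a double-quoted, escaped string. That is only close
        // to a TOML basic string for simple printable text (an assumption).
        Value::Str(s) => format!("{s:?}"),
        Value::Int(i) => i.to_string(),
        Value::Bool(b) => b.to_string(),
        Value::Array(items) => {
            let rendered: Vec<String> = items.iter().map(render_value).collect();
            format!("[{}]", rendered.join(", "))
        }
        Value::Table(entries) => {
            let rendered: Vec<String> = entries
                .iter()
                .map(|(key, value)| format!("{key} = {}", render_value(value)))
                .collect();
            format!("{{ {} }}", rendered.join(", "))
        }
    }
}

/// Render the top-level map as one `key = value` line per entry.
fn to_toml(entries: &[(String, Value)]) -> String {
    entries
        .iter()
        .map(|(key, value)| format!("{key} = {}", render_value(value)))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Feeding the JSON map from the example above through `to_toml` produces the TOML line shown there. A real implementation would still need to escape strings according to the TOML spec and decide what to do with JSON null.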

Possible solution (2)

Another would be to strictly accept Map<String, String>, convert it into key = value, and then parse that as a TOML string. This would also match --config, but quotes might become a nuisance: JSON only allows double quotes, so any inner quotes would need to be escaped or written as single quotes.
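A rough sketch of this second option, assuming every map value is already a valid TOML value expression written by the user. `flat_map_to_toml` is a hypothetical helper name, and a `BTreeMap` is used only to keep the output order deterministic.

```rust
use std::collections::BTreeMap;

/// Join a flat string-to-string map into TOML `key = value` lines.
/// Each value is assumed to already be a TOML value expression, so a string
/// setting has to carry its own quotes, e.g. `'single'`.
fn flat_map_to_toml(settings: &BTreeMap<String, String>) -> String {
    settings
        .iter()
        .map(|(key, value)| format!("{key} = {value}"))
        .collect::<Vec<_>>()
        .join("\n")
}
```

This shows where the quoting nuisance comes from: in settings.json the user would have to write "format.quote-style": "'single'" or "format.quote-style": "\"single\"" so that the quotes survive into the TOML string.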

Test Plan

VS Code

Requires astral-sh/ruff-vscode#702

settings.json:

{
  "ruff.lint.extendSelect": ["TID"],
  "ruff.configuration": {
    "line-length": 50,
    "format": {
      "quote-style": "single"
    },
    "lint": {
      "unfixable": ["F401"],
      "flake8-tidy-imports": {
        "banned-api": {
          "typing.TypedDict": {
            "msg": "Use `typing_extensions.TypedDict` instead"
          }
        }
      }
    }
  }
}

The following video shows me:

  1. Checking that the diagnostics include TID
  2. Running Ruff: Fix all auto-fixable problems to test unfixable
  3. Running Format: Document to test line-length and quote-style

Screen.Recording.2025-02-24.at.11.08.13.AM.mov

Neovim

init.lua:

require('lspconfig').ruff.setup {
  init_options = {
    settings = {
      lint = {
        extendSelect = { 'TID' },
      },
      configuration = {
        ['line-length'] = 50,
        format = {
          ['quote-style'] = 'single',
        },
        lint = {
          unfixable = { 'F401' },
          ['flake8-tidy-imports'] = {
            ['banned-api'] = {
              ['typing.TypedDict'] = {
                msg = 'Use typing_extensions.TypedDict instead',
              },
            },
          },
        },
      },
    },
  },
}

Same steps as in the VS Code test:

Screen.Recording.2025-02-24.at.11.19.43.AM.mov

Documentation Preview

Screen.Recording.2025-02-26.at.10.15.55.AM.mov

Footnotes

  1. One advantage of this is that the value can be copy-pasted directly into the playground.

@dhruvmanila dhruvmanila added the server Related to the LSP server label Feb 21, 2025
Contributor

github-actions bot commented Feb 21, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@dhruvmanila dhruvmanila changed the title Expand ruff.configuration to allow all config Expand ruff.configuration to allow inline config Feb 21, 2025
@dhruvmanila dhruvmanila force-pushed the dhruv/server-configuration branch from e8f2911 to df9fbdc Compare February 21, 2025 09:49
@InSyncWithFoo
Contributor

Does this mean the following existing settings will be deprecated?

@dhruvmanila dhruvmanila force-pushed the dhruv/server-configuration branch from df9fbdc to 4b26a73 Compare February 24, 2025 04:48
@dhruvmanila
Member Author

dhruvmanila commented Feb 24, 2025

Does this mean the following existing settings will be deprecated?

No, I don't think we'll deprecate them. It's similar to how there's ruff format --line-length 100 . while the same can be set via ruff format --config='line-length = 100' .

Comment on lines 359 to 373
settings.configuration.as_ref().and_then(|configuration| {
    match ResolvedConfiguration::try_from(configuration) {
        Ok(configuration) => Some(configuration),
        Err(err) => {
            tracing::error!("Failed to resolve configuration: {err}");
            None
        }
    }
})
Member Author

I'm planning to open a follow-up PR to notify the user when resolving the client settings fails. We currently don't log or notify the user if it fails. For example, with lint.select = ["RUF07", "I001"], RUF07 fails to resolve, so the whole config value is ignored, which means that I001 is not selected either.
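To illustrate why one bad selector can discard the whole list, here is a small sketch. It is not Ruff's selector parsing: `parse_selector`, its hard-coded `KNOWN` list, `parse_all` and `parse_lenient` are all hypothetical. Collecting into a `Result` stops at the first error, while the lenient variant is just one possible alternative that keeps the valid entries and reports the rest.

```rust
/// Hypothetical selector parser that accepts only a tiny hard-coded set.
fn parse_selector(code: &str) -> Result<String, String> {
    const KNOWN: [&str; 3] = ["F401", "I001", "TID"];
    if KNOWN.contains(&code) {
        Ok(code.to_string())
    } else {
        Err(format!("unknown rule selector: {code}"))
    }
}

/// All-or-nothing: the first error discards the entire list.
fn parse_all(codes: &[&str]) -> Result<Vec<String>, String> {
    codes.iter().map(|code| parse_selector(code)).collect()
}

/// Lenient: keep the valid selectors and return the unknown ones separately
/// so that they can be logged or shown to the user.
fn parse_lenient(codes: &[&str]) -> (Vec<String>, Vec<String>) {
    let mut valid = Vec::new();
    let mut unknown = Vec::new();
    for code in codes {
        match parse_selector(code) {
            Ok(selector) => valid.push(selector),
            Err(_) => unknown.push(code.to_string()),
        }
    }
    (valid, unknown)
}
```

With the input from the example, `parse_all(&["RUF07", "I001"])` returns an error and I001 is lost along with RUF07, whereas `parse_lenient` keeps I001 and hands back RUF07 for reporting.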

@dhruvmanila
Member Author

I need to update the documentation, but I'll wait to first make sure that this is the approach that we want to take. I'd appreciate initial feedback on the approach. I've documented the limitations along with potential solutions that could be implemented to resolve them.

@dhruvmanila dhruvmanila marked this pull request as ready for review February 24, 2025 06:16
@MichaReiser
Member

MichaReiser commented Feb 24, 2025

I'll reply to some of the design questions before taking a look at the code itself.

This means that now ruff.configuration accepts either a path to configuration file or the raw config itself. It's mostly similar to --config with one difference that's highlighted in the following section.

One consequence of this is that it won't be possible to specify both a configuration file and individual configuration options, which is different from the CLI. I mainly want to call this out but I think this is an okay limitation because we could always introduce a configuration-file option and deprecate configuration: <path> in favor of that option.

The casing difference is interesting. I don't think we should change the casing of configuration options because they're then different from what we list in the documentation. Again, I think this limitation is fine.

Another would be to just accept Map<String, String> strictly and convert it into key = value and then parse it as a TOML string. This would also match --config but quotes might become a nuisance because JSON only allows double quotes and so it'll require escaping any inner quotes or use single quotes.

I'm not sure I fully understand this limitation. I believe all field and table identifiers have to be strings. It's only the values that can be numbers, strings, or entire tables. Could we, therefore, deserialize the settings to Map<String, Value> where Value is any acceptable TOML value? Doing so would allow writing "format": { "line-length": 100 } but also "format.line-length": 100

@dhruvmanila
Member Author

One consequence of this is that it won't be possible to specify both a configuration file and individual configuration options, which is different from the CLI. I mainly want to call this out but I think this is an okay limitation because we could always introduce a configuration-file option and deprecate configuration: <path> in favor of that option.

Yes, I'm aware of this limitation and, as you mention, it wouldn't be too much effort to add a configurationFile option.

The casing difference is interesting. I don't think we should change the casing of configuration options because they're then different from what we list in the documentation. Again, I think this limitation is fine.

👍

I'm not sure I fully understand this limitation. I believe all field and table identifiers have to be string. It's only the values that can be numbers, strings, or entire tables. Could we, therefore, deserialize the settings to Map<String, Value> where Value is any acceptable TOML value? Doing so would allow writing "format": { "line-length": 100 } but also "format.line-length": 100

That won't work. So, currently this is what happens:

InitializeParams (JSON string) 
  -> Deserialize into json::Map<String, json::Value> (ClientSettings)
    -> Deserialize into toml Table
      -> Deserialize into Options (ResolvedClientSettings)

The JSON string in your example would be:

"{\"format.quote-style\": \"single\"}"

which gets deserialized into a JSON map:

{"format.quote-style": "single"}

Remember that in JSON all keys are strings.

Then, we create a TOML table and deserialize it into Options:

toml::Table::try_from(map)?.try_into::<Options>()?;

Here, format.quote-style is still interpreted as the single key format.quote-style and not as format = {quote-style = ... }, which then fails because there's no key named format.quote-style in Options.
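For illustration only, this is roughly the kind of pre-processing that flat keys would need, which this PR does not implement. The `Node` type and the `split_dotted_key`, `insert_path` and `insert_dotted` helpers are hypothetical and use only the standard library. The key splitting is simplified: it honors double quotes but not escapes or single-quoted keys.

```rust
use std::collections::BTreeMap;

/// Hypothetical tree type: a leaf string or a nested table.
#[derive(Debug, Clone, PartialEq)]
enum Node {
    Leaf(String),
    Table(BTreeMap<String, Node>),
}

/// Split a dotted key into segments, treating a double-quoted part as one
/// segment so that `"typing.TypedDict"` is not split on its inner dot.
fn split_dotted_key(key: &str) -> Vec<String> {
    let mut segments = Vec::new();
    let mut current = String::new();
    let mut in_quotes = false;
    for ch in key.chars() {
        match ch {
            '"' => in_quotes = !in_quotes,
            '.' if !in_quotes => segments.push(std::mem::take(&mut current)),
            _ => current.push(ch),
        }
    }
    segments.push(current);
    segments
}

/// Walk (and create) nested tables along `path`, then store the value.
fn insert_path(
    table: &mut BTreeMap<String, Node>,
    path: &[String],
    value: &str,
) -> Result<(), String> {
    match path {
        [] => Err("empty key".to_string()),
        [last] => {
            table.insert(last.clone(), Node::Leaf(value.to_string()));
            Ok(())
        }
        [first, rest @ ..] => {
            let entry = table
                .entry(first.clone())
                .or_insert_with(|| Node::Table(BTreeMap::new()));
            match entry {
                Node::Table(inner) => insert_path(inner, rest, value),
                Node::Leaf(_) => Err(format!("`{first}` is not a table")),
            }
        }
    }
}

/// Insert `value` under the nested path described by a flat dotted key.
fn insert_dotted(
    root: &mut BTreeMap<String, Node>,
    key: &str,
    value: &str,
) -> Result<(), String> {
    insert_path(root, &split_dotted_key(key), value)
}
```

With this, inserting "format.quote-style" produces a `format` table containing `quote-style`, which is the nested shape that Options expects, instead of a single top-level key named `format.quote-style`.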

@MichaReiser
Member

Then, we create a TOML table and deserialize it into Options:

Yeah, I think we would have to do the same as on the CLI where we parse the setting name as well. Simply deserializing into Options won't do.

@MichaReiser
Member

I took a look at Ruff's CLI and it seems to directly deserialize into the Table. I still think that it should be possible, but it may require some post/pre-processing on our side. I'd have to look closer into how TOML deserializes tables (which are thin wrappers around HashMap<String, Value>), but it seems to me that it should be possible to automatically do the string unwrapping for the key.

@dhruvmanila
Member Author

Yeah, I think we would have to do the same as on the CLI where we parse the setting name as well. Simply deserializing into Options wont do

There's a small difference between the CLI and what we have here. The command-line value is already a TOML string, which means it can just use toml::Table::from_str, but here we have a JSON string.

What you're describing is what I've laid out in the "Nesting > Possible solution (1)" section, where we'd have to create this TOML string ourselves from the JSON value. Does that make sense? I know it's a bit confusing 😅

@dhruvmanila
Member Author

(Oops, I didn't see your comment)

I still think that it should be possible, but it may require some post/pre-processing on our side. I'd have to look closer into how TOML deserializes tables (which are thin wrappers around HashMap<String, Value>), but it seems to me that it should be possible to automatically do the string unwrapping for the key.

Fair enough. I tried a few things but they didn't work. I can time-box it and look into how the library deserializes tables.

@MichaReiser
Member

Okay. I played a bit with toml and the toml-edit crates and I now understand the problem. The issue is that format.line-length needs to deserialize to a nested hash map. I feel like it should be possible somehow to drive the toml Table deserializer to make this work but it's not obvious how. I don't think it's worth investing more time into this and using JSON is fine then.

Do you want me to review the code as well or did you mainly want to get feedback on the config format?

@dhruvmanila
Member Author

Okay. I played a bit with toml and the toml-edit crates and I now understand the problem. The issue is that format.line-length needs to deserialize to a nested hash map. I feel like it should be possible somehow to drive the toml Table deserializer to make this work but it's not obvious how. I don't think it's worth investing more time into this and using JSON is fine then.

Thank you for spending time looking into it.

I also think this could possibly be done in the future if we get user feedback on this specific format issue because it would still be backwards compatible in that both nested and flat keys would be supported but right now it's only nested keys.

Do you want me to review the code as well or did you mainly want to get feedback on the config format?

I mainly wanted to get feedback on the format. I'm planning to work on the documentation part tomorrow morning. I think it would be best to defer the review until tomorrow to save a review cycle.

@dhruvmanila dhruvmanila removed the request for review from MichaReiser February 25, 2025 04:28
@dhruvmanila dhruvmanila force-pushed the dhruv/server-configuration branch from 4b26a73 to db5a5e4 Compare February 25, 2025 06:52
@dhruvmanila dhruvmanila force-pushed the dhruv/server-configuration branch from db5a5e4 to cda5a52 Compare February 25, 2025 06:52
@MichaReiser
Member

I hope you don't mind if I put this back into draft. It makes it easier for me to see when I'm supposed to review it.

@MichaReiser MichaReiser marked this pull request as draft February 25, 2025 07:39
@dhruvmanila dhruvmanila marked this pull request as ready for review February 25, 2025 08:00
Member

@MichaReiser MichaReiser left a comment

Thanks. This looks great. I mainly have some documentation and error handling suggestions.

pub(crate) enum ResolvedConfigurationError {
    #[error(transparent)]
    EnvVarLookupError(#[from] shellexpand::LookupError<std::env::VarError>),
    #[error("error serializing configuration to TOML: {0}")]
Member

It's unclear to me where we serialize the value to TOML. I understand deserialization but I don't see where we perform any serialization.

I worry that this error message is also not very helpful for users because it's not clear what's going wrong. Can we use a more specific error message? Failed to load configuration from ruff.configurations:

Member Author

The serialization of map to TOML happens here (try_from):

let options = toml::Table::try_from(map)?.try_into::<Options>()?;

I'll update the error message

Member Author

I'm going to keep this message for now. The "Failed to load ..." message that you've suggested is already going to be appended before this. The reason I want to keep this is that for something like:

  "ruff.configuration": {
    "line-length": null
  },

We'll get:

Failed to load settings from `configuration`: unsupported unit type

But, with the above message:

Failed to load settings from `configuration`: error serializing configuration to TOML: unsupported unit type

We'll at least know that the issue is with TOML deserialization

Member

Failed to load settings from configuration: error serializing configuration to TOML: unsupported unit type

I don't think I would understand this message or even know what I have to do.

Member Author

I think one solution would be to handle these null cases because that's the only case where I'm able to get this serialization error. I checked the spec as well (https://toml.io/en/v1.0.0#keys) but can't get the InvalidToml error
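One way such null handling could look, as a sketch under the assumption that silently dropping null entries is acceptable (the thread does not settle this). The `Json` enum and `strip_nulls` are hypothetical names rather than serde_json types or Ruff code.

```rust
/// Hypothetical stand-in for a parsed JSON value (not a real serde_json type).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Null,
    Number(i64),
    Text(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Return `None` for null, otherwise the value with nested nulls removed.
/// TOML has no null, so this runs before any conversion to a TOML table.
fn strip_nulls(value: Json) -> Option<Json> {
    match value {
        Json::Null => None,
        Json::Array(items) => Some(Json::Array(
            items.into_iter().filter_map(strip_nulls).collect(),
        )),
        Json::Object(entries) => Some(Json::Object(
            entries
                .into_iter()
                .filter_map(|(key, value)| strip_nulls(value).map(|value| (key, value)))
                .collect(),
        )),
        other => Some(other),
    }
}
```

Applied to the "line-length": null example above, the key would simply disappear before the TOML step, so the "unsupported unit type" error would not be reached. Whether to drop the key or report it to the user instead is a design choice left open here.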

Member Author

We could skip the intermediate JSON deserialization, but that would lead to an even worse experience:

error: Failed to deserialize initialization options: data did not match any variant of untagged enum InitializationOptions. Falling back to default client settings...

@MichaReiser
Member

One noteworthy drawback of having a single option for configuration is that setting the path is no longer possible from the Settings UI.

@dhruvmanila
Member Author

One noteworthy drawback of having a single option for configuration is that setting the path is no longer possible from the Settings UI.

Yeah, that's a good point. Maybe it might be worth adding the configurationFile option along with this change. Let me think about it.

I was also wondering if something like ruff.experimental is worth using, so that we can first add ruff.experimental.configuration to allow initial feedback and then later merge it. It does raise questions about when we can promote a feature out of the experimental phase, because it's going to be tied to the Ruff version. It might just be too complicated and not worth it.

@MichaReiser
Member

MichaReiser commented Feb 25, 2025

Yeah, that's a good point. Maybe it might be worth adding the configurationFile option along with this change. Let me think about it.

I'm fine deferring it for now. I just thought it worth calling out. I expect that many users will want to use the inline config anyway

@dhruvmanila dhruvmanila merged commit be03cb0 into main Feb 26, 2025
21 checks passed
@dhruvmanila dhruvmanila deleted the dhruv/server-configuration branch February 26, 2025 04:47
dhruvmanila added a commit to astral-sh/ruff-vscode that referenced this pull request Feb 26, 2025
## Summary

Closes: #6 

This PR adds support for astral-sh/ruff#16296
for the VS Code extension by allowing any arbitrary object in
`ruff.configuration`

Additionally, it will provide a warning message to the user if they're
using inline configuration but the Ruff version does not support it.

It has one disadvantage that the user will see two popups where the
error one is coming from the server because it cannot deserialize the
options. We could not show the warning popup if this is too much. I'm
worried that it might go unnoticed because the "Show Logs" in the below
popup will open the server logs while the message would be in the client
logs.

<img width="491" alt="Screenshot 2025-02-25 at 1 24 15 PM"
src="https://github.com/user-attachments/assets/8bafbd69-f8fa-4604-8ab3-1f6efa745045"
/>

We'll still log it on the console (client log channel):
```
2025-02-25 13:24:10.067 [warning] Inline configuration support was added in Ruff 0.9.8 (current version is 0.9.7). Please update your Ruff version to use this feature.
```

I haven't provided more details in the warning message based on the
assumption that if the user is using inline configuration then they're
aware of it but it's just that the Ruff version is too old.

## Test Plan

Refer to the test plan in astral-sh/ruff#16296
dcreager added a commit that referenced this pull request Feb 27, 2025
* main:
  [red-knot] unify LoopState and saved_break_states (#16406)
  [`pylint`] Also reports `case np.nan`/`case math.nan` (`PLW0177`) (#16378)
  [FURB156] Do not consider docstring(s) (#16391)
  Use `is_none_or` in `stdlib-module-shadowing` (#16402)
  [red-knot] Upgrade salsa to include `AtomicPtr` perf improvement (#16398)
  [red-knot] Fix file watching for new non-project files (#16395)
  document MSRV policy (#16384)
  [red-knot] fix non-callable reporting for unions (#16387)
  bump MSRV to 1.83 (#16294)
  Avoid unnecessary info at non-trace server log level (#16389)
  Expand `ruff.configuration` to allow inline config (#16296)
  Start detecting version-related syntax errors in the parser (#16090)
dcreager added a commit that referenced this pull request Feb 27, 2025
* dcreager/dont-have-a-cow:
  [red-knot] unify LoopState and saved_break_states (#16406)
  [`pylint`] Also reports `case np.nan`/`case math.nan` (`PLW0177`) (#16378)
  [FURB156] Do not consider docstring(s) (#16391)
  Use `is_none_or` in `stdlib-module-shadowing` (#16402)
  [red-knot] Upgrade salsa to include `AtomicPtr` perf improvement (#16398)
  [red-knot] Fix file watching for new non-project files (#16395)
  document MSRV policy (#16384)
  [red-knot] fix non-callable reporting for unions (#16387)
  bump MSRV to 1.83 (#16294)
  Avoid unnecessary info at non-trace server log level (#16389)
  Expand `ruff.configuration` to allow inline config (#16296)
  Start detecting version-related syntax errors in the parser (#16090)
dhruvmanila added a commit that referenced this pull request Feb 27, 2025
## Summary

As mentioned in
#16296 (comment)

This PR updates the client settings resolver to notify the user if there
are any errors in the config using a very basic approach. In addition,
each error related to specific settings are logged.

This isn't the best approach because it can log the same message
multiple times when both workspace and global settings are provided and
they both are the same. This is the case for a single workspace VS Code
instance.

I do have some ideas on how to improve this and will explore them during
my free time (low priority):
* Avoid resolving the global settings multiple times as they're static
* Include the source of the setting (workspace or global?)
* Maybe use a struct (`ResolvedClientSettings` +
`Vec<ClientSettingsResolverError>`) instead to make unit testing easier

## Test Plan

Using:
```jsonc
{
  "ruff.logLevel": "debug",
	
  // Invalid settings
  "ruff.configuration": "$RANDOM",
  "ruff.lint.select": ["RUF000", "I001"],
  "ruff.lint.extendSelect": ["B001", "B002"],
  "ruff.lint.ignore": ["I999", "F401"]
}
```

The error logs:
```
2025-02-27 12:30:04.318736000 ERROR Failed to load settings from `configuration`: error looking key 'RANDOM' up: environment variable not found
2025-02-27 12:30:04.319196000 ERROR Failed to load settings from `configuration`: error looking key 'RANDOM' up: environment variable not found
2025-02-27 12:30:04.320549000 ERROR Unknown rule selectors found in `lint.select`: ["RUF000"]
2025-02-27 12:30:04.320669000 ERROR Unknown rule selectors found in `lint.extendSelect`: ["B001"]
2025-02-27 12:30:04.320764000 ERROR Unknown rule selectors found in `lint.ignore`: ["I999"]
```

Notification preview:

<img width="470" alt="Screenshot 2025-02-27 at 12 29 06 PM"
src="https://github.com/user-attachments/assets/61f41d5c-2558-46b3-a1ed-82114fd8ec22"
/>
Labels
server Related to the LSP server