Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Global analyzer config precedence #48634

Closed
chsienki opened this issue Oct 15, 2020 · 9 comments · Fixed by #49834
Closed

Global analyzer config precedence #48634

chsienki opened this issue Oct 15, 2020 · 9 comments · Fixed by #49834

Comments

@chsienki
Copy link
Member

chsienki commented Oct 15, 2020

Global analyzer config precedence

We recently introduced a new feature, known as 'global analyzer config'. This was designed to solve two, somewhat orthogonal, problems:

  1. The ability to participate in the .editorconfig hierarchy, without the file itself being within the source hierarchy (such as from a NuGet package)
  2. The ability to provide diagnostic settings to diagnostics or files that do not have a source path

The design generally satisfies both problems, but runs into issue when you combine both problems together and still wish to have hierarchical config settings.

Issue

This is easiest explained by way of example. Consider the following scenarios:

Scenario 1

  • A NuGet package ships a global analyzer config that sets the severity of <ID> to warning.
  • The user of the package wishes to downgrade this warning to none.

Solution: The user adds a regular .editorconfig that changes the severity of <ID> to none within their source tree.

Scenario 2

  • A user wants to set the severity of <ID> to none in source generated files.

Solution: The user adds a global analyzer config to their source directory, which sets the severity of <ID> to none.

Scenario 3

  • A NuGet package ships a global analyzer config that sets the severity of <ID> to warning.
  • The user of the package wishes to set the severity of <ID> to none in source generated files.

Problem: The user cannot create a global analyzer config with <ID> set to none. Because global configs are combined into a single virtual config, we unset the value and issue a warning. The user has no way to override the value from the NuGet package.

Proposed Solution

The proposed solution is to introduce an optional notion of precedence to global analyzer configs.

A straw-man syntax looks something like

is_global = true
global_level = 1

Any colliding keys within the same level will be unset and a warning issued. When keys from a higher level collide with those of a lower level, the higher level wins, and if there is only a single winning key, no warning is issued. By default, a global editor config without an explicit level would have a default level of 0.

Thus in scenario 3, a user could create a global config with a higher precedence than the one specified in the NuGet package, allowing them to override the severity of <ID> in generated files.

Multiple keys with the same value

The existing behavior today is to issue a warning whenever we see duplicate keys. However, it seems intuitive that if all the keys map to the same value, we should be able to treat it as unambiguous and use the value. As part of this proposal we should change the behavior seen today to allow unambiguous duplicates. This won't be a breaking change as we'll be removing a subset of warnings.

Why isn't this a problem for .editorconfig

Standard .editorconfigs can resolve conflicts via their natural filesystem hierarchy. Anything 'further from the root' is higher in precedence than the closer ones. Global analyzer configs can exist in completely different hierarchies, so a NuGet provided config in c:\users\user\.nuget\cache\MyPackage\1.0.0\.globalconfig could be 'further away' than c:\myproject\.globalconfig but should clearly not take precedence.

Further, because global configs are designed to provide values for files without a source path, even if we could resolve the 'logical depth' of the config, there is no correct way to determine which one should apply to the file itself. One option could be to use the project file path as a proxy, but the compiler itself doesn't currently have any notion of project file. Therefore not only would it be an only somewhat inaccurate heuristic, but we would have to do work anyway to enable it.

Notes

Should we allow negative levels?: Ultimately the level just needs to be an opaque sortable key. Limiting it to integers makes sense as the ordering is obvious, but having negative levels wouldn't necessarily be a problem. A user could create a config with a level of -1 which would mean 'Apply this config key, unless a NuGet package has already provided it'. This seems like kind of a niche case, but falls out for free from the above design, so seems worth including.

Could a 'malicious' NuGet package provide an ID of Int32.Max?: Yes, but this is no worse a situation than exists currently.

Can it get complicated figuring out where a value is set: Yes, with multiple hierarchical configs and levels it could be non-obvious exactly which rule is being applied and why. One option could be to issue a hidden diagnostic when we select an override, that would allow e.g. an analyzer to surface the information to the user.

Is this just a problem for NuGet: No, for sufficiently advanced solutions (such as Roslyn itself) it seems likely that different parts of the source tree (such as a Compilers and Workspaces) might want to have differing global values. This proposal enables that too.

@chsienki chsienki self-assigned this Oct 15, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged Issues and PRs which have not yet been triaged by a lead label Oct 15, 2020
@chsienki
Copy link
Member Author

chsienki commented Oct 15, 2020

//cc @mavasani @jasonmalinowski @jaredpar for review / feedback.

@mavasani
Copy link
Contributor

mavasani commented Oct 15, 2020

Thanks @chsienki - the problem is indeed real and I agree it needs a solution. The proposal of using a global_level to enable conflict detection seems reasonable. However, I am wondering if we can make this slightly simpler for end users and avoid them having any special keys in their hierarchical global config files named .globalconfig. Let me elaborate.

We currently recommend end users to name their hierarchical global config files as .globalconfig, so they are implicitly detected by project system, very similar to .editorconfig files. We can make use of this and go a step further by defaulting global_level values for global config files as follows:

  1. Default global_level = int.MaxValue for global config files named .globalconfig and default it to global_level = 0 for rest of the global config files.
  2. Resolve conflicts between entries with same key in two global config files as follows:
    1. If conflicting files have different global_level, prefer the entry from file with higher global_level
    2. Otherwise, if both conflicting files are named .globalconfig and have global_level = int.MaxValue, and one global config is in a child directory of the other, then prefer the .globalconfig which is deeper in file system (i.e. has a longer path). This will add support to allow hierarchical overriding in hierarchical global configs, similar to editorconfigs.
    3. Otherwise, consider it as an unresolvable conflict and report a warning, just like we today

As an aside, I think we should also implicitly assume is_global = true for global config files named .globalconfig to further reduce clutter in end-user provided hierarchical global config files.

We should also consider updating the documentation about global config naming to recommend following naming for global config files:

  1. End users should name their global config files as .globalconfig
  2. NuGet packages should name their global config files as <%Package_Name%>.globalconfig
  3. MSBuild tooling generated global config files should be named as <%Target_Name%>_Generated.globalconfig or some such.

@jaredpar jaredpar removed the untriaged Issues and PRs which have not yet been triaged by a lead label Oct 15, 2020
@chsienki
Copy link
Member Author

Interesting. I'm not against the idea of making more decisions about levels implicitly, but I think we have to be very careful in designing the rules around how we do it. In the original design, we went for is_global as we didn't want to make assumptions in the compiler based on filename. I'd be willing to relax this, but would want @jaredpar to chime in on his thoughts around this.

In terms of assuming int.MaxValue that basically only gives us two levels. At that point, I would lean more towards simply saying .globalconfig wins out over *.globalconfig as it would essentially be the same. I would instead prefer to give .globalconfig a default level of something like 100 instead. That way there is still plenty of room to override via this mechanism, and leaves 100 or so levels for NuGet packages to play around with too (like if one child package wanted to override the value from a parent that it knew was going to be there).

I'm also wary of the hierarchy based decision making, just from a user understanding point of view. Imagine the following layout:

\project.csproj
\.globalconfig
\src\subfolder\.globalconfig
\src\subfolder\file2.cs

Any generated files would get the values set in \src\subfolder\.globalconfig by default. While that would still be true if you went in and set the global_level to a higher value, it at least requires the user to actively opt-in to it, especially the first time they add a conflicting value: they'd get a warning, and have to edit the file, as opposed to it just silently overriding it.

Like I said I'm not against making it more implicit, but there is an obvious tension between ease of use and accidental opt-in that I'd like to be very careful with.

I guess I wonder if any of the implicit suggestions become breaking changes. If not I wonder if it makes sense to start with everything being explicit, then add implicit options as users request them / encounter issues?

@mavasani
Copy link
Contributor

I'm also wary of the hierarchy based decision making, just from a user understanding point of view

Sounds good. We can leave that part out for now, it is more than likely that users will have a single .globalconfig at repo root, so this feature might not be that essential after all.

I wonder if any of the implicit suggestions become breaking changes. If not I wonder if it makes sense to start with everything being explicit, then add implicit options as users request them / encounter issues?

My core concern is users will not really understand the concept of global_level, and would wonder why they have to specify an entry with an opaque integral value to override settings coming from external files. It is fine if they require to explicitly specify it for some subtle cases, but requiring it to be always specified might end up being more of a hindrance and confusion point rather than an aid.

In the original design, we went for is_global as we didn't want to make assumptions in the compiler based on filename.

Note that this was a feedback from our doc expert, @gewarren, when we were writing up the documentation for global config files and I agree with her. The fact that we require is_global = true entry despite the fact that the file is named .globalconfig, which is implicitly detected by project system and passed to the compiler as a global config seems completely redundant. Given now we require another entry (global_level) to distinguish between user provided and NuGet package provided global config, makes me feel that we should give the name .globalconfig special status similar to the name .editorconfig and make lives of end users easier.

@mavasani mavasani changed the title Global editor config precedence Global analyzer config precedence Oct 16, 2020
@chsienki
Copy link
Member Author

implicitly detected by project system and passed to the compiler as a global config

So we do detect it via the project system, but note that the compiler doesn't have any notion of what is or isn't a global config until we actually parse it. We look for the is_global flag specifically because we cant distinguish between regular and global configs because they go through the same commandline argument.

In an ideal world we'd just use a different argument, but as discussed before this isn't practical. @jaredpar to review this, because we're really creating workaround's here for this specific problem. If we absolutely can't revisit this decision, then I would argue in favor of using the filename as a signal so its more obvious for a user.

I guess I'm still a little concerned that we'd require NuGet authors to call their config file something different (and include the is_global = true) but at least that's a much smaller subset of users.

@mavasani
Copy link
Contributor

FYI @chsienki - We have now received 3 separate customer requests for this feature. The .NET SDK analyzers ship with global configs which are conflicting with user's global configs. We need this feature to resolve the precedence.

@chsienki
Copy link
Member Author

@mavasani yup, let's get this prioritized. I'll get a PR up with the above design and we can make any iterations there as needed.

mavasani added a commit to mavasani/roslyn-analyzers that referenced this issue Dec 11, 2020
Implements category specific AnalysisLevel properties as per dotnet/roslyn#49250 (comment).

Users would now be able to specify the core AnalysisLevel and override category specific AnalysisLevel for each category.

```xml
<AnalysisLevel>5-minimum</AnalysisLevel>
<AnalysisLevelSecurity>latest-all</ AnalysisLevelSecurity >
<AnalysisLevelPerformance>preview-recommended</AnalysisLevelPerformance>
<AnalysisLevelGlobalization>5-none</AnalysisLevelGlobalization>
```

1. The post-build tool now generates category-specific global configs for each supported analysis mode
2. The tool also generates category-specific `AddGlobalAnalyzerConfigForPackage` targets so that if user overrides a category-specific AnalysisLevel property

NOTE: This PR depends on feature dotnet/roslyn#48634, which is currently in code review: dotnet/roslyn#49834
channeladam added a commit to channeladam/ChannelAdam.CSharp.Snippets that referenced this issue Jan 24, 2021
@reduckted
Copy link

It looks like PR #49834 will solve my specific scenario, but since my scenario is a little bit different to the other scenarios mentioned here, I thought I'd describe it.

At my company we use an internal NuGet package to keep our build settings consistent across the many repositories and solutions that we have. The NuGet package contains a .props file that sets certain MSBuild properties to achieve this.

We would like to start with all analyzer rules enabled, and only disable a few that we don't want to use. To do this, we set AnalysisMode to AllEnabledByDefault in the .props file and use a .globalconfig file that's included in the NuGet package to disable the few rules that we don't to use. So we have a .props file that looks like this:

<Project>
    <PropertyGroup>
        <AnalysisMode>AllEnabledByDefault</AnalysisMode>
    </PropertyGroup>

   <ItemGroup>
        <GlobalAnalyzerConfigFiles Include="$(MSBuildThisFileDirectory).globalconfig" />
    </ItemGroup>
</Project>

Unfortunately, that results in conflicts because using AllEnabledByDefault will add a config to GlobalAnalyzerConfigFiles that enables all rules, and then the custom .globalconfig file that is included in the NuGet package disables some of those rules.

Once a config level can be specified, then we can set the level of the .globalconfig file in the NuGet package to be higher and that will prevent the conflicts. 🎉

@mavasani
Copy link
Contributor

mavasani commented Apr 9, 2021

This was fixed with #49834

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants