Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ordered evaluation in FileSystemGlobbing Matcher #109408

Open
kasperk81 opened this issue Oct 31, 2024 · 6 comments
Open

Ordered evaluation in FileSystemGlobbing Matcher #109408

kasperk81 opened this issue Oct 31, 2024 · 6 comments
Labels
api-ready-for-review API is ready for review, it is NOT ready for implementation area-Extensions-FileSystem help wanted [up-for-grabs] Good issue for external contributors
Milestone

Comments

@kasperk81
Copy link
Contributor

kasperk81 commented Oct 31, 2024

Background and Motivation

The current Microsoft.Extensions.FileSystemGlobbing.Matcher does not evaluate include and exclude patterns in the order they are added, which leads to limitations when trying to achieve more complex file-matching behavior. Specifically, includes added after an exclude do not override the exclude, resulting in unexpected or undesired results.

In scenarios where finer control over the inclusion and exclusion of files is needed, the inability to specify an ordered evaluation can prevent the desired matches. Adding an option to allow ordered evaluation of patterns would offer more intuitive control, empowering developers to achieve more precise and flexible file selection.

Originally posted by @Benjin #21362 (comment)

Current Behavior

Consider the following setup in Matcher:

Matcher m = new();
m.AddInclude("**/*");
m.AddExclude("ExcludeMe/**/*");
m.AddInclude("ExcludeMe/ButActuallyIncludeMe/**/*");

Given this directory structure:

root/
  - helloWorld.txt
  - ExcludeMe/
      - notIncluded.txt
      - ButActuallyIncludeMe/
          - hiEarth.txt

The output is currently:

  • root/helloWorld.txt
  • (Does NOT include root/ExcludeMe/ButActuallyIncludeMe/hiEarth.txt)

Expected Behavior

With ordered evaluation, the output would instead be:

  • root/helloWorld.txt
  • root/ExcludeMe/ButActuallyIncludeMe/hiEarth.txt

Revised Proposal

The matcher is just a builder that calls into MatcherContext internally to do the work. That class takes in the include and exclude patterns separately so change should be made at that level as well.

The ordering is a configuration of the Matcher, so logically it should be passed into the Matcher (either in the constructor or as a property/field) instead of as an argument to Execute. This is the API I would suggest:

Proposed API

namespace Microsoft.Extensions.FileSystemGlobbing;

public class Matcher
{
    public Matcher() { }
    public Matcher(StringComparison comparisonType) { }
+    public Matcher(bool ordered) { }
+    public Matcher(StringComparison comparisonType, bool ordered) { }
}
namespace Microsoft.Extensions.FileSystemGlobbing.Internal;

/// <summary>
/// This API supports infrastructure and is not intended to be used
/// directly from your code. This API may change or be removed in future releases.
/// </summary>
public class MatcherContext
+    : MatcherContextBase
{
    public MatcherContext(
        IEnumerable<IPattern> includePatterns,
        IEnumerable<IPattern> excludePatterns,
        DirectoryInfoBase directoryInfo,
        StringComparison comparison) { }
}
namespace Microsoft.Extensions.FileSystemGlobbing.Internal;

public class OrderedMatcherContext
    : MatcherContextBase
{
    public OrderedMatcherContext(
        IEnumerable<SelectionPattern> patterns,
        DirectoryInfoBase directoryInfo,
        StringComparison comparison) { }
}

public abstract class MatcherContextBase
{
    protected abstract PatternTestResult MatchPatternContexts<TFileInfoBase>(
        TFileInfoBase fileinfo, Func<IPatternContext, TFileInfoBase, PatternTestResult> test)
}

public struct SelectionPattern
{
    public IPattern pattern;
    public SelectionType selectionType;
}

public enum SelectionType
{
    Include,
    Exclude
}

Usage

var m = new Matcher(true);
m.AddInclude("**/*");
m.AddExclude("ExcludeMe/**/*");
m.AddInclude("ExcludeMe/ButActuallyIncludeMe/**/*");

var result = m.Execute(new DirectoryInfoWrapper(new DirectoryInfo("root")));
foreach (var file in result.Files)
{
    Console.WriteLine(file.Path);
}
Original Proposal

API Proposal

To introduce ordered evaluation without altering existing behavior by default, we propose adding an optional ordered parameter to the Execute method. This would allow users to explicitly enable ordered evaluation when required.

Proposed API Changes

public virtual Microsoft.Extensions.FileSystemGlobbing.PatternMatchingResult Execute(
-  DirectoryInfoBase directoryInfo);
+  DirectoryInfoBase directoryInfo, bool ordered = false);
  • Execute Method Parameter: Adds an ordered parameter to the Execute method, allowing ordered evaluation to be enabled per execution while keeping current behavior as the default.

API Usage

// Execute with ordered evaluation
Matcher m = new();
m.AddInclude("**/*");
m.AddExclude("ExcludeMe/**/*");
m.AddInclude("ExcludeMe/ButActuallyIncludeMe/**/*");

var result = m.Execute(new DirectoryInfoWrapper(new DirectoryInfo("root")), ordered: true);
foreach (var file in result.Files)
{
    Console.WriteLine(file.Path);
}

Expected output:

  • root/helloWorld.txt
  • root/ExcludeMe/ButActuallyIncludeMe/hiEarth.txt

Alternative Designs

  1. Matcher Constructor with Ordered Parameter:
    Another approach could be adding an ordered parameter to the Matcher constructor to set the evaluation order for the entire lifecycle of the Matcher instance, rather than on a per-execution basis:

    public Matcher(bool ordered = false);
    public Matcher(StringComparison comparisonType, bool ordered = false);

    This approach makes ordered a property of the Matcher instance, reducing the need to specify it with each execution. However, this would make it less flexible for scenarios where a developer may want to toggle ordered evaluation for different executions.

  2. Separate ExecuteInOrder Method:
    Another alternative is to create a separate method, ExecuteInOrder, which explicitly indicates ordered evaluation:

    public virtual PatternMatchingResult ExecuteInOrder(DirectoryInfoBase directoryInfo);

    This approach would keep Execute unchanged and provide a clear distinction in the API. However, it would increase the API surface.

Risks

Enabling ordered evaluation may introduce a minor performance cost due to tracking evaluation order. However, this impact is expected to be minimal and only relevant when ordered is set to true.

@dotnet-issue-labeler dotnet-issue-labeler bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Oct 31, 2024
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Oct 31, 2024
@kasperk81
Copy link
Contributor Author

repro https://dotnetfiddle.net/qawrpG

@vcsjones vcsjones added area-Extensions-FileSystem and removed needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners labels Oct 31, 2024
Copy link
Contributor

Tagging subscribers to this area: @dotnet/area-extensions-filesystem
See info in area-owners.md if you want to be subscribed.

@jeffhandley jeffhandley added this to the Future milestone Nov 3, 2024
@jeffhandley jeffhandley added needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration and removed untriaged New issue has not been triaged by the area owner labels Nov 3, 2024
@jeffhandley
Copy link
Member

Changing the existing functionality would be too risky as it would result in including or excluding files differently from before, which could cause data loss and other serious issues for users. If we wanted to entertain this, it would need to be presented as an api-suggestion Early API idea and discussion, it is NOT ready for implementation .

@PranavSenthilnathan Can you see if there's a way to achieve the desired result with the existing APIs and current functionality? It looks like AddInclude, AddExclude, and Execute are all virtual and MatcherExtensions.Match could be used as the workhorse.

@kasperk81
Copy link
Contributor Author

Changing the existing functionality would be too risky as it would result in including or excluding files differently from before, which could cause data loss and other serious issues for users. If we wanted to entertain this, it would need to be presented as an api-suggestion Early API idea and discussion, it is NOT ready for implementation .

@jeffhandley yI have converted first post to api proposal. please add that label

@PranavSenthilnathan
Copy link
Member

The matcher is just a builder that calls into MatcherContext internally to do the work. That class takes in the include and exclude patterns separately so change should be made at that level as well.

The ordering is a configuration of the Matcher, so logically it should be passed into the Matcher (either in the constructor or as a property/field) instead of as an argument to Execute. This is the API I would suggest:

Proposed API

namespace Microsoft.Extensions.FileSystemGlobbing;

public class Matcher
{
    public Matcher() { }
+    public Matcher(bool ordered) { }
}
namespace Microsoft.Extensions.FileSystemGlobbing.Internal;

/// <summary>
/// This API supports infrastructure and is not intended to be used
/// directly from your code. This API may change or be removed in future releases.
/// </summary>
public class MatcherContext
+    : MatcherContextBase
{
    public MatcherContext(
        IEnumerable<IPattern> includePatterns,
        IEnumerable<IPattern> excludePatterns,
        DirectoryInfoBase directoryInfo,
        StringComparison comparison) { }
}
namespace Microsoft.Extensions.FileSystemGlobbing.Internal;

public class OrderedMatcherContext
    : MatcherContextBase
{
    public OrderedMatcherContext(
        IEnumerable<SelectionPattern> patterns,
        DirectoryInfoBase directoryInfo,
        StringComparison comparison) { }
}

public abstract class MatcherContextBase
{
    protected abstract PatternTestResult MatchPatternContexts<TFileInfoBase>(
        TFileInfoBase fileinfo, Func<IPatternContext, TFileInfoBase, PatternTestResult> test)
}

public struct SelectionPattern
{
    public IPattern pattern;
    public SelectionType selectionType;
}

public enum SelectionType
{
    Include,
    Exclude
}

Usage

var m = new Matcher(true);
m.AddInclude("**/*");
m.AddExclude("ExcludeMe/**/*");
m.AddInclude("ExcludeMe/ButActuallyIncludeMe/**/*");

var result = m.Execute(new DirectoryInfoWrapper(new DirectoryInfo("root")));
foreach (var file in result.Files)
{
    Console.WriteLine(file.Path);
}

@jeffhandley
Copy link
Member

@PranavSenthilnathan I updated the issue description to reflect your revised proposal, and I'm marking this as api-ready-for-review API is ready for review, it is NOT ready for implementation and help wanted [up-for-grabs] Good issue for external contributors for the Future milestone.

@jeffhandley jeffhandley added help wanted [up-for-grabs] Good issue for external contributors api-ready-for-review API is ready for review, it is NOT ready for implementation and removed needs-further-triage Issue has been initially triaged, but needs deeper consideration or reconsideration labels Feb 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-ready-for-review API is ready for review, it is NOT ready for implementation area-Extensions-FileSystem help wanted [up-for-grabs] Good issue for external contributors
Projects
None yet
Development

No branches or pull requests

4 participants