Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New: Processor improvements #3

Merged
merged 10 commits into from
Jan 2, 2019
Merged

New: Processor improvements #3

merged 10 commits into from
Jan 2, 2019

Conversation

nzakas
Copy link
Member

@nzakas nzakas commented Nov 20, 2018

Summary

This proposal provides a way to explicitly define which processor(s) to use for different files inside of configuration. It also allows the chaining of multiple processors to fully process a file.

Related Issues

@nzakas nzakas added the enhancement New feature or request label Nov 20, 2018
Copy link
Member

@platinumazure platinumazure left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One typo, and raised a few questions.

@mysticatea
Copy link
Member

Thank you for this proposal :)

Sounds nice to me.
I think that plugins for other kinds than JavaScript may not work without processors. For example, eslint-plugin-vue uses the postprocess API to provide <!-- eslint-disable -->-like directive comment functionality. If the processor is missing, the users will see cryptic error messages. But plugins can contain configuration presets, so it will be no problem even if the end-users need additional setting.

But I have a concern.

  • how do users configure processors to extract a specific kind of blocks?

Some kind of files can contain various kinds of code blocks. such as markdown.
eslint-plugin-markdown wants to extract only JS blocks. But in the original issue, users want to configure to extract other blocks than JS. This proposal doesn't look to include the way to configure it.

For example:

// .eslintrc.js
module.exports = {
    overrides: [
        {
            files: "*.md",
            processors: ["markdown/markdown"], // want markdown processor to extract JS blocks
        },
        {
            files: "*.md",
            processors: ["markdown/markdown"], // want markdown processor to extract TS blocks
            parser: "typescript-eslint-parser", 
        },
        {
            files: "*.md",
            processors: ["markdown/markdown", "html/html"], // want markdown processor to extract HTML blocks
        },
        {
            files: "*.md",
            processors: ["markdown/markdown", "vue/vue"], // want markdown processor to extract Vue blocks
            parser: "vue-eslint-parser", 
        },
    ]
}

May the eslint-plugin-markdown author has to define processors by the number of block kinds?

@mysticatea
Copy link
Member

mysticatea commented Nov 20, 2018

After I re-looked my comment, I'm not sure that setting works well because those processors conflict. But I think, that is the issue we want to solve.

@mysticatea
Copy link
Member

Ah, mmm, #1 also cannot solve that problem properly, because #1 will implement recursive calls in Linter#verify, but resolving overrides setting does in CLIEngine. I'm thinking...

Copy link
Member

@ilyavolodin ilyavolodin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proposal sounds good in general, but I would like an additional paragraph explaining how the new configuration with overrides will affect having to pass -ext flag to the command line. Based on the context, it looks like the status quo remains. We've had number of requests to change that behavior, but due to the design limitation of file traversal - we can't.

@nzakas
Copy link
Member Author

nzakas commented Nov 21, 2018

@mysticatea

how do users configure processors to extract a specific kind of blocks?

It's not clear from the proposal, but this intentionally does not solve that problem because I don't believe that's what eslint/eslint#11035 is asking for or needs. (I will update the proposal to make it this clear.)

So far, the only plausible use case would be Markdown with different embedded code fragments inside, maybe some that are JS and some that are TypeScript, etc. I think that this is a corner case to the larger issue of making processors end-user configurable. I would suggest we don't focus on solving that case right now.

That said, I don't believe that this proposal prevents a solution like #1 from being implemented at some point in the future. As you stated, the processing would have to move to outside of Linter into CLIEngine so that patterns could be match for the virtual filenames. If we find that we really do need this capability, then we can always go down that route later (and use overrides to match the virtual filenames to appropriate configs).

@nzakas
Copy link
Member Author

nzakas commented Nov 21, 2018

@ilyavolodin you are correct, this does not change how --ext works. I'll add a note to make that clear.

@nzakas
Copy link
Member Author

nzakas commented Nov 21, 2018

@mysticatea I'm sorry, I completely misunderstood our earlier discussion. You can pretty much ignore everything in my previous comment. Let me get in the latest updates and I'll revisit.

@nzakas
Copy link
Member Author

nzakas commented Nov 21, 2018

Okay, after looking this over again, I realize I completely missed the point that we need to be able to derive virtual filenames for extracted code blocks in this proposal. My apologies for creating any confusion.

I'm going to update this proposal to include @mysticatea's approach of providing a virtual filename for extracted code blocks as I think that's clearly the best way to solve that part of the problem. I think I can add that feature into this proposal without any significant changes, and if @mysticatea likes the updated proposal, I'd suggest we work together on a single proposal moving forward.

@nzakas
Copy link
Member Author

nzakas commented Nov 21, 2018

Okay, I've updated the proposal to take everyone's feedback into account. I added a bit about virtual filenames from @mysticatea's proposal and how the current implementation of processors would have to change. I think I now properly understand what we're going for, thanks for your patience and help.

@mysticatea
Copy link
Member

@nzakas Merging is a good idea. I have been thinking #1 should use overrides[].processor instead of processor API's extensions.

@nzakas
Copy link
Member Author

nzakas commented Nov 22, 2018

I've incorporated the latest round of feedback including:

  • Change from processors to processor, as multiple processors will now be applied by matching the file extension of code blocks.
  • Clarified that Linter functionality will not be removed, only worked around in the CLI.
  • Fixed typos and clarified some points.

Outstanding issues:

  • It seems like we don't want to be able to specify a full file path for code blocks. The question is, what do we want in its place? Should we only allow processors to specify extensions? Should we allow processors to specify full filenames and disallow \ characters?

Copy link
Member

@btmills btmills left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated to this specific RFC, I’m really liking the process. This is so much more efficient than iterating on an issue - nice work @nzakas!


This processor returns two code blocks containing JavaScript code. Each code block is given a virtual filename ending with `.js`.

When a `preprocess()` method returns an object with a `filePath` property, `CLIEngine` will call `getConfigForFile()` on the `filePath` property to determine the correct configuration for the code block, which includes whether another processor should be run on the code block (matched by file extension).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least in the case of the Markdown processor, since it can’t know which extensions have processors configured, I understand it will want to return every code block it encounters? For example, if original.md contains js, html, and vue code blocks, it would return original.md.js, original.md.html, and original.md.vue. Do we then need to run those through the --ext filter as well? Perhaps I invoke with --ext .html, so I’d expect the html code block to be linted, but not the vue code block.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At initial, I had considered the same thing: we should filter extracted blocks with the --ext or glob patterns. However, I think currently, if a specific file specified (E.g., eslint docs/README.md), I'm not sure what the proper behavior is. I will be confused if ESLint changes the target code blocks by globs. So I want options for processors in config files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think we want every processor to have to implement its own version of --ext. Could we encourage processors to return all code blocks it thinks can be linted with ESLint, and then filter the results in CLIEngine based on the value of --ext before applying glob matching?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that makes the most sense. The Markdown processor would return every fenced code block, and the HTML processor would return every <script> tag, and ESLint core would be responsible for deciding which of those get linted.

How that interacts with --ext I’m not sure. I suppose you could argue that if I configure an override for *.ts, then I shouldn’t also have to specify --ext .ts. When plugins provided their own extensions, the --ext flag was necessary, but flipping the relationship around so that the extensions are configured in overrides might make it redundant.

Is appending an extension to the existing filename the best way? I think so, but I wanted to point out that it’s trivial for the Markdown processor to pull the tag from the fenced code block, but an HTML processor would need to map from the <script>’s MIME type to the appropriate extension.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't oppose the way.

My small concern is, some users use glob patterns rather than --ext option to specify other files than JS (E.g., src/**/*.{js,ts,tsx}). But we can say that use --ext option to specify embedded codes other than JS.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, it seems like we are in agreement that:

  1. Processors can return all file types it thinks ESLint can handle.
  2. ESLint will filter out any file types that aren't specified by --ext to avoid errors.
  3. Users must use --ext even if they have overrides in their config; overrides is not a replacement for --ext (in the context of this proposal).

@btmills I think appending the file extension makes sense. My hunch is that most processors will only be returning .js files any way. As @mysticatea has pointed out a couple of times, we are really talking about an edge case with non-.js code blocks being returned from processors.

@nzakas
Copy link
Member Author

nzakas commented Nov 23, 2018

I've incorporated the latest round of feedback including:

  • Change filePath to filename for code blocks.
  • filename in code blocks cannot contain slash characters
  • Clarified how we will treat code blocks without a filename specified
  • Fixed typos.

Outstanding issues:

  • How does this proposal interact with the --ext flag?
  • Where should the processing take place in order to expose this functionality to browsers?

@mysticatea
Copy link
Member

How does this proposal handle autofix?
If some of the processors don't support autofix, broken fix properties will mix in the result. #1 mentioned it and removing the broken fix properties.

@nzakas
Copy link
Member Author

nzakas commented Nov 26, 2018

@mysticatea oh, I guess I thought autofixing would "just work." :) Let me read back over that part of your proposal and see what I missed.

@nzakas
Copy link
Member Author

nzakas commented Nov 26, 2018

Okay, I don't think I understand what's going on with processor autofixing. It's the processor that's responsible for applying fixes? What would change about that with this proposal?

(Sorry, processor autofixing happened when I was away so I just have no context for how that interacts with anything.)

@mysticatea
Copy link
Member

@nzakas The processor is the pair of preprocess and postprocess, so it will be:

function verify(filename, code) {
    //....(resolve config)....
    let shouldAutofix = true
    const messagesList = preprocess(code, filename).map(item => {
        if (typeof item === "string") {
            return linter.verify(code, config, { ...options, filePath: filename })
        }

        // Recursive call.
        const retv = verify(item.filename, item.code)

        // Take the flag.
        if (!retv.shouldAutofix) {
            shouldAutofix = false
        }
        return retv.messages
    })

    return {
        shouldAutofix,
        messages: postprocess(messagesList, filename)
    }
}

As a reference, stopping creating fix:

function verify(filename, code) {
    //....(resolve config)....
    return postprocess(
        preprocess(code, filename).map(item => {
            if (typeof item === "string") {
                return linter.verify(code, config, {
                    ...options,
                    filePath: filename,
                    disableConstructFix: !supportsAutofix
                })
            }
            // Recursive call.
            return verify(item.filename, item.code)
        }),
        filename,
    )
}

Actually, the disableConstructFix flag is unnecessary if the overrides resolving logic exists in Linter because Linter knows when it should not construct the fix property.

@nzakas
Copy link
Member Author

nzakas commented Dec 4, 2018

Ah ok, I think I see what you're getting at. I (reluctantly) agree that adding the flag to Linter#verify() to disable creating of fix makes sense.

So that verify() function in your example would end up in CLIEngine?

@mysticatea
Copy link
Member

So that verify() function in your example would end up in CLIEngine?

Yes 😃

@nzakas
Copy link
Member Author

nzakas commented Dec 5, 2018 via email

@nzakas
Copy link
Member Author

nzakas commented Dec 6, 2018

I've updated the proposal with the latest implementation changes that @mysticatea and I discussed.

I think the last open issue is whether the vue-eslint-parser case (#3 (comment)) is breaking or not. @mysticatea can you clarify your thinking on that?

@nzakas
Copy link
Member Author

nzakas commented Dec 11, 2018

I've updated the design to address the edge cases @mysticatea pointed out. I believe that this design is now ready for a final review.


When a `preprocess()` method returns an object with a `filename` property, a new filename is constructed in this format:

> parentFilename + path.sep + index + filenameFromPreprocess
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on this, if I wanted to configure globals or rules just for code found in Markdown files, I would be able to do that by using the *.md/* pattern?

module.exports = {
    plugins: ["markdown"],
    overrides: [
        {
            files: ["*.md"],
            processor: "markdown/markdown"
        },
        {
            files: ["*.md/*"],
            globals: {
                foo: true
            },
            rules: {
                strict: "off"
            }
        }
    ]
}

To further confirm my understanding, any configuration defined in the *.md override would not cascade down to the code blocks because it lacks a trailing *? I think that's the correct behavior, though it might seem unintuitive at first if someone is used to .eslintrc configurations cascading. overrides just work differently than config cascades.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That lets us give the ability to configure rules for each kind of code blocks.


## Backwards Compatibility Analysis

This proposal is 100% backwards compatible until we remove the old way of defining processors. Both named and extension-based processors can be defined in the same plugin, such as:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've spent a while trying to figure this out, so please correct me if I've missed something. I think that this will require a major version bump when plugins start to support this API.

Currently, as well as after this change in the case of extension-named processors, code blocks inherit their parent file's filename. For example, project/README.md would contain blocks named project/README.md. In the Markdown plugin's readme right now, we recommend configuring rules etc. using *.md in overrides.

When a processor is upgraded to support the named processor API, those code blocks would become project/README.md/0.js, and any rules configured for *.md files would no longer be applied to the code blocks inside of them.

Individually, the eslint adding support for this API won't break the existing (now called extension-named) processors, and eslint-plugin-markdown can start exporting both extension-named and named processors without breaking versions of ESLint that don't support the new API. But taken together, a version of ESLint that supports extension-named processors and an extension-named processor could require me to modify my configuration if I'm specifically targeting code blocks returned by a processor.

Am I lost in the trees here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, but only if the processor is returning filenames with the code blocks. If the processor is still returning strings, then the code blocks get the same name as the parent file. And when ESLint is given string code blocks, it would stop matching against the code block filenames because that would be an infinite loop.

So, it would be a breaking change for the plugin if it changes from returning strings to returning filenames, but not a breaking change for ESLint.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I’m happy to use the Markdown plugin to test the recommended upgrade path for processors once this is available in ESLint.

Co-Authored-By: nzakas <nicholas@nczconsulting.com>
@nzakas
Copy link
Member Author

nzakas commented Dec 27, 2018

@ilyavolodin do you have any further concerns? At this point, you're the only one that has suggested changes outstanding.

@ilyavolodin
Copy link
Member

Ah, sorry about that. I lost track of this proposal and didn't have much time to recheck it. Changes looks good to me. Thanks.

@nzakas
Copy link
Member Author

nzakas commented Jan 2, 2019

It looks like we have our first merged RFC! 🎉 Thanks everyone!

@nzakas nzakas merged commit 0f6b17a into master Jan 2, 2019
@nzakas nzakas deleted the 2018-processors branch January 2, 2019 15:37
@not-an-aardvark
Copy link
Member

For future reference, do we need TSC approval to merge RFCs like this? I have no problem with it being merged either way (and I see that a majority of TSC members approved the RFC anyway), but I was a bit confused because I thought it would need to be approved in a TSC meeting.

@nzakas
Copy link
Member Author

nzakas commented Jan 3, 2019 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request implemented This RFC has been implemented
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants