Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(lint): Implement front matter validator and linter #26410

Merged
merged 4 commits into from
Apr 28, 2023

Conversation

OnkarRuikar
Copy link
Contributor

Moving mdn/yari#8295 to content repo.

@OnkarRuikar OnkarRuikar requested review from schalkneethling and a team as code owners April 25, 2023 10:34
@OnkarRuikar OnkarRuikar requested review from bsmth and removed request for a team April 25, 2023 10:34
Copy link
Collaborator

@wbamberg wbamberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for putting this together, it will be great to have validation for front matter.

I've read through the code and done some basic testing, and will try to make another pass through the code tomorrow. Since this is quite a complicated change it would be good to have another pair of eyes on it as well as mine.

A few general comments as well as the specific ones inline.

Unlike the version of this in mdn/yari#8295, this PR doesn't integrate the linter into CI. I think we should have that. In fact, I can imagine the linter being used in a couple of ways:

  1. initially, over all the files, to get us into a good state (which it looks like we are already, thanks to your previous PRs)
  2. in CI, over changed files only, to keep us in a good state. I think it's fine (from a contributor experience point of view) for front matter validation failures to be CI errors.

I'm a bit worried that it won't be practical to have this in CI until we have fixed Yari so it doesn't break front matter with commands like yarn content move. I know you have mdn/yari#8623 for part of this, and I think mdn/yari#7825 (comment) is also needed. Should we consider those things blockers for this PR, or just go ahead?

scripts/front-matter-linter.js Outdated Show resolved Hide resolved
scripts/front-matter-linter.js Outdated Show resolved Hide resolved
scripts/front-matter-linter.js Outdated Show resolved Hide resolved
scripts/front-matter-linter.js Outdated Show resolved Hide resolved
scripts/front-matter-linter.js Outdated Show resolved Hide resolved
scripts/front-matter-linter.js Outdated Show resolved Hide resolved
scripts/front-matter-linter.js Outdated Show resolved Hide resolved
scripts/front-matter-linter.js Outdated Show resolved Hide resolved
scripts/front-matter-linter.js Outdated Show resolved Hide resolved
@teoli2003
Copy link
Contributor

teoli2003 commented Apr 26, 2023

I wonder if we also want to add reordering of YAML entries in Husky. But it shouldn't block this. (It would also work around any problem yarn tools would have)

@OnkarRuikar OnkarRuikar marked this pull request as draft April 26, 2023 05:41
@OnkarRuikar
Copy link
Contributor Author

I wonder if we also want to add reordering of YAML entries in Husky. But it shouldn't block this.

Unlike the version of this in mdn/yari#8295, this PR doesn't integrate the linter into CI.

Following CI cases have been handled:

  • Husky git pre-commit (see .lintstagedrc.json
    • Front-matter's of staged files will be linted
    • Unit tests will be run if any file under /tests/* directory modified.
  • As part of PR workflow, front-matter of only modified files will be linted after markdownlint (see pr-check_markdownlint.yml)
  • Along with daily cron job for markdownlint, front-matters of all the files will be validated. (see markdown-lint-fix.yml)

I'm a bit worried that it won't be practical to have this in CI until we have fixed Yari so it doesn't break front matter with commands like yarn content move. I know you have mdn/yari#8623 for part of this, and I think mdn/yari#7825 (comment) is also needed. Should we consider those things blockers for this PR, or just go ahead?

These are not blockers. During move when yari changes to single quote, husky will run the front-matter linter before the git commit to revert quote changes.
Also, the linter won't complain when Yari removes short-title during move command because the attribute is optional. Anyway, till the bug gets fixed, PR authors have to add the attribute back after running move command.


Since this is quite a complicated change it would be good to have another pair of eyes on it as well as mine.

I completely lost the context after two months since the yari PR, all the research done forgotten. If we can't get the other eyes on this PR sooner we'll be back to square one. 😞
To get the ball rolling we only have to write and pass the unit tests. If anyone wants they can always enhance the code later.

@github-actions github-actions bot added Content:CSS Cascading Style Sheets docs Content:Glossary Glossary entries Content:WebAPI Web API docs labels Apr 27, 2023
@github-actions
Copy link
Contributor

Preview URLs

Flaws (1)

Note! 2 documents with no flaws that don't need to be listed. 🎉

URL: /en-US/docs/Web/CSS/CSS_Scrollbars
Title: CSS scrollbars styling
Flaw count: 1

  • macros:
    • /en-US/docs/Glossary/scrollport redirects to /en-US/docs/Glossary/Scroll_container

@OnkarRuikar OnkarRuikar marked this pull request as ready for review April 27, 2023 07:09
@OnkarRuikar OnkarRuikar requested review from a team as code owners April 27, 2023 07:09
@OnkarRuikar OnkarRuikar requested review from wbamberg and estelle and removed request for a team April 27, 2023 07:09
Copy link
Collaborator

@wbamberg wbamberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have a couple of concerns about this PR which I will list here. But I'm also aware that you've been messed around with this PR for long enough and I don't want to draw things out more than is needed. So I'm approving, but I'm very happy for other MDN maintainers (@teoli2003 and @bsmth among others) to chime in on these issues, and if other maintainers don't think these concerns should stop us merging this PR then I'm happy to go along with that.

Dependencies

Some of these are essential for validation: better-ajv-errors, ajv, ajv-formats, gray-matter.
js-yaml is essential for auto-fix, which I think we need. jest is needed for testing, which I think we need too given that the validation is quite complicated.

The rest are a bit more arguable: caporal, cli-progress, fdir, async seem to be either nice-to-have or performance enhancements, which I'm not that convinced we need if we are just running this on changed files. But if noone else thinks this I'm happy to accept them.

"Deep validation" of page types

As said in #26410 (comment), I think I am -1 on validating page types inside the top-level categories, like https://github.com/mdn/content/pull/26410/files#diff-598136f55d381642ea8527a0da53ec7177903514d7ce499a242ce89cbe359173R243-R307. The reason is just that it makes the config much more complicated and much less useful as a kind of document of the page type rules, and it also seems arbitrary to do it for some areas (HTTP, MathML, Accessibility) and not for the others.

That said, we could double-down on this, and so use page type validation as a way to enforce IA within content areas, and that might be pretty powerful. But I'd like that to be a deliberate choice.

@hamishwillee
Copy link
Collaborator

@OnkarRuikar Thank you - this looks good. I don't have particularly strong opinions other than:

  1. What's the rollout plan? I'd feel more confident of this after it's been run across our current doc set for all files.
  2. I disagree with @wbamberg on the deep page-type validation. Yes it is complicated. Yes it is annoying that it doesn't cover everything. But I think it is important to get data right.

(Uninformed) thoughts:

  • Can schema be broken up to make it more palatable? I.e. perhaps break it into files to make it easier to digest and be clear what is not covered?
  • If it is a blocker, we could split validation into a different PR - for this one just validate that if it is defined it is one of the agreed set for the area. That would catch typos.
  1. No opinion on the selection of modules etc.

Thank you again.

@OnkarRuikar
Copy link
Contributor Author

OnkarRuikar commented Apr 28, 2023

The rest are a bit more arguable: caporal, cli-progress, fdir, async seem to be either nice-to-have or performance enhancements, which I'm not that convinced we need if we are just running this on changed files.

The daily cron job lints all the files to catch any escaped indentations/bugs. It's better to keep the option of linting all the files.

The mdn/content repo is not packaged and published to any npm repositories. All the dependancies in the package.json at the moment are actually dev dependancies. There is no harm in keeping them.


"Deep validation" of page types

This shouldn't block the PR. We can have vote or discussion about this on GitHub later. From the comment above, if a code owner wants to have finer validation in their code then they should be allowed to have it.

That said, we could double-down on this, and so use page type validation as a way to enforce IA within content areas, and that might be pretty powerful. But I'd like that to be a deliberate choice.

We could write a similar utility to enforce markdown templates based on the page-type value. It will make content more uniform.


What's the rollout plan? I'd feel more confident of this after it's been run across our current doc set for all files.

We've already linted the mdn/content using the tool:

It would be great if we merge this tool before this weekend. So if anything comes up we can fix it without blocking others.


Can schema be broken up to make it more palatable? I.e. perhaps break it into files to make it easier to digest and be clear what is not covered?

Unfortunately, prettier is inflating the if-then-else clauses too much. Even if we split it, the chunks will look the same. We can simply add the file to prettier's ignore list and do manual indentation to keep it more readable.

If it is a blocker, we could split validation into a different PR - for this one just validate that if it is defined it is one of the agreed set for the area. That would catch typos.

No need. All the incorrect page-types values and typos have been fixed already.

Copy link
Collaborator

@hamishwillee hamishwillee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what my approval is worth, but I think you have done due diligence.

@hamishwillee
Copy link
Collaborator

Unfortunately, prettier is inflating the if-then-else clauses too much. Even if we split it, the chunks will look the same. We can simply add the file to prettier's ignore list and do manual indentation to keep it more readable.

I was actually thinking along the lines of physically splitting schema files - so all the stuff around page validation would be in separate pages for each area. And yes, having it more compact would be nice.

Copy link
Contributor

@teoli2003 teoli2003 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us go ahead with this. All changes are improvements–none are irreversible– and we can already get experience about how it goes; then, we can discuss each improvement separately in individual PRs. I think the better is the enemy of the good here (the French say)

@teoli2003
Copy link
Contributor

I'm going to merge this. This is the beginning of the day in Europe, so we have 15 hours before the week-end to check there is no major problem with it.

@teoli2003 teoli2003 merged commit 307cd1c into mdn:main Apr 28, 2023
@OnkarRuikar OnkarRuikar mentioned this pull request Apr 28, 2023
@wbamberg
Copy link
Collaborator

So great to have this. Thank you @OnkarRuikar !

@teoli2003
Copy link
Contributor

Yes, I'm really excited about this. Even we were so busy that we forgot about it. THank you, @OnkarRuikar and sorry for the delay.

@OnkarRuikar OnkarRuikar deleted the fmlinter branch April 30, 2023 10:04
@bsmth
Copy link
Member

bsmth commented May 3, 2023

Thank you @OnkarRuikar for your work! 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Content:CSS Cascading Style Sheets docs Content:Glossary Glossary entries Content:WebAPI Web API docs
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants