Skip to content
This repository was archived by the owner on Sep 7, 2021. It is now read-only.

Refactor ingredient handler error logging #385

Closed
wbamberg opened this issue Mar 31, 2020 · 4 comments
Closed

Refactor ingredient handler error logging #385

wbamberg opened this issue Mar 31, 2020 · 4 comments

Comments

@wbamberg
Copy link

Currently we have one utility function for ingredient handlers to log errors, logMissingIngredient. This takes three arguments:

  • source: which in the ingredient handlers is always "html-require-ingredient". This should have a more obvious name.
  • recipeName: the name of the recipe
  • ingredient: the name of the ingredient

It logs a message like "Missing from ${recipeName}: ${ingredient}", and an identifier that combines source, recipeName, and ingredient.

This is called from ingredient handlers when a mandatory ingredient can't be found. Often, this call is preceded by another logging call like this:

const message = file.message(
  `Expected h2#${id}`,
  body,
  `${context.source}:${context.recipeName}/${context.ingredient}/expected-heading`
);
message.fatal = true;
utils.logMissingIngredient(file, context);

So I guess the idea here is that the first message provides details of exactly why we couldn't find the ingredient, and the second message provides an overview of what the problem was (i.e. which ingredient was missing).

Issues

There are a few issues here.

Duplicated "first message" code

There's a very repetitive pattern in the many places where we log two messages:

const message = file.message(
  detailedMessage,
  node,
  `${context.source}:${context.recipeName}/${context.ingredient}/${details}`
);
message.fatal = true;
utils.logMissingIngredient(file, context);

Apart from just making code longer and harder to read, this is going to make it very hard for us to log messages consistently. So we should have a utility function for this.

"first message" is not consistently called

As we saw, usually there are two messages: one that gives specifics about why the linting failed (like, a particular macro could not be found), and one that gives a general messages about the ingredient that was missing.

Sometimes the first message is not logged, and only logMissingIngredient is called. For example: https://github.com/mdn/stumptown-content/blob/master/scripts/scraper-ng/rules/html-require-recipe-ingredients/ingredient-handlers/index.js#L61-L63 or https://github.com/mdn/stumptown-content/blob/master/scripts/scraper-ng/rules/html-require-recipe-ingredients/ingredient-handlers/data-browser-compatibility.js#L34-L36.

Is this by design, or should we always log two messages? I don't see why we log a specific message when the specName macro is missing but not when the Compat macro is missing.

Maybe we should only log one message, which always gives the extra detail?

Passing state around

Because the logging functions want lots of arguments, that's a lot of state we have to pass around. The more handlers we write, and the more complex they get, the more this is going to be an issue.

logMissingIngredient doesn't work for badly-formed ingredients

logMissingIngredient works for mandatory ingredients. But now we want to lint optional ones, like data.static_methods. There, we probably want to log something like "Incorrect ingredient" rather than "Missing". Should we add an extra parameter to logMissingIngredient, so we can do that, and rename it?

@ddbeck
Copy link
Contributor

ddbeck commented Apr 1, 2020

Thanks for writing this up, Will. I think you've captured all the big issues with messages here.

On the topic of inconsistently calling "first message":

Is this by design, or should we always log two messages?

This is by design and we should probably always log two messages (i.e., the missing Compat message call is a bug), when we find evidence of a malformed ingredient. We discussed this a bit in the PR where this pattern was introduced. Your comment here captures a bit of the thinking behind it: #320 (review). But to get into this in more detail…

I'd argue we should have two messages for malformed ingredients, for two reasons:

  1. I think the two messages are saying distinct things. The general "ingredient missing" message says, "this page doesn't satisfy this ingredient." The specific "expected" message says, "we saw something that looked ingredient-ish (e.g., an ingredient heading) but it ultimately did not satisfy the ingredient".

  2. Having a general "this ingredient is missing" message makes it easier to learn about problems in aggregate.

    For example, if we emit two errors, I can easily count up all the pages with missing "data.specifications" ingredients by counting all the messages for the missing ingredient. If we only emit one error, then I have to sum up all the errors relating to missing ingredients (e.g., no heading whatsoever, missing spec macro, and, perhaps later, errors about extraneous content in the specification section).

Of course, you might have already addressed this in the refactoring PR. I'll start reviewing that now.

@wbamberg
Copy link
Author

wbamberg commented Apr 1, 2020

This is by design and we should probably always log two messages (i.e., the missing Compat message call is a bug), when we find evidence of a malformed ingredient.

"when we find evidence of a malformed ingredient" - is the suggestion:

(1) that we should log two messages except when the ingredient is "just missing", in which case we should only log one?

Or

(2) that we should always log two messages?

If (1) , I worry that "just missing" is a slippery concept. Generally it seems to mean "doesn't even have an H2" but really that seems a little artificial. If it has h2#Specifications but no {{ SpecName }}, is it missing or malformed, and is this a helpful distinction anyway? As a maintainer, what I most care about is: what is the specific problem I need to fix? And in this case, it's the missing {{ SpecName }}. Similarly, if a page has h2#Browser_Compatibility is the ingredient missing or malformed?

If (2), I'm more happy with that, but I'm not sure the generic "Missing..." message adds a lot. For example, with the current linter, the output for https://wiki.developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/entries is:

           1:1  error    Missing from javascript-method: prose.description                             javascript-method/prose.description                                      html-require-ingredient
           1:6  warning  Macro: JSRef                                                                  jsref                                                                    html-no-macros
         15:10  warning  Macro: jsxref                                                                 jsxref                                                                   html-no-macros
(more jsxref warnings snipped)

The error is because the recipe lists prose.description as a mandatory ingredient, but doesn't have h2#Description. We just log this as:

Missing from javascript-method: prose.description javascript-method/prose.description.

As a maintainer of this doc, I'd prefer a more specific message, telling me that we looked for h2#Description and couldn't find it - then I know what was wrong and how to fix it. So should we be logging two errors here or just the one? If we log something like:

Expected h2#Description javascript-method/prose.description/missing-heading

... does the extra Missing from... message add anything?

Especially if the ID is consistently formed like recipe/ingredient/problem, then we can always find failed ingredients by looking at the first two pieces of the ID.

We discussed this a bit in the PR where this pattern was introduced. Your comment here captures a bit of the thinking behind it: #320 (review).

Yes, and I'm sorry to relegislate this. I guess I have just thought about this a bit more and have more opinions.

I think the two messages are saying distinct things. The general "ingredient missing" message says, "this page doesn't satisfy this ingredient." The specific "expected" message says, "we saw something that looked ingredient-ish (e.g., an ingredient heading) but it ultimately did not satisfy the ingredient".

I think there's something in this, but I"m not sure how would this looks for optional ingredients. Here, a "Missing..." error message is not applicable of course (although actually, I would be more likely to want an info message for these, like "optional ingredient XYZ omitted", than an error message for missing mandatory ingredients). So if a page attempts to include an ingredient, according to our heuristic (e.g. it includes an h2#Static_methods) but the content is malformed, would we emit 2 messages, like:

Malformed ingredient: javascript-class:data.static_methods    javascript-class/data.static_methods 
dt items must be links:  javascript-class:data.static_methods    javascript-class/data.static_methods/dl-items-links

...or something like that?

So I guess, with these questions, it feels simpler to log just one message, try to make it as actionable as possible, and make sure it always refers to a recipe/ingredient combination.

@ddbeck
Copy link
Contributor

ddbeck commented Apr 2, 2020

So I guess, with these questions, it feels simpler to log just one message, try to make it as actionable as possible, and make sure it always refers to a recipe/ingredient combination.

OK, I'm convinced. I'd still like a quick way to aggregate recipe problems, but I guess this is something I should fix with a shell script instead of the rules themselves. 👍

@wbamberg
Copy link
Author

Fixed by #386.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants