This repository has been archived by the owner on Sep 7, 2021. It is now read-only.

Support spec tables in HTML pages #126

Closed
wbamberg opened this issue Sep 10, 2019 · 21 comments

Comments

@wbamberg

HTML reference pages in MDN (and most other reference pages) contain a table linking to relevant specification(s).

We didn't add this to the HTML element recipe when we first wrote them because it wasn't very clear which specifications we should include. But we need to have spec tables in Stumptown, so this issue is to work out how to specify spec tables in the recipes and add it to the data.

Acceptance criteria:

  • all HTML recipes that need it have a spec table structure
  • all HTML data that needs it has spec table data

@Elchi3 self-assigned this Sep 12, 2019
@Elchi3 added this to the Peppermint (S3 Q3 2019) milestone Sep 12, 2019

@wbamberg
Author

For reference, here's the discussion doc on what spec tables should look like: https://docs.google.com/document/d/1eL8YtslVZAnIAGb7rcZGbvXndkcR7lb9rhDpvGpGaWs/edit?ts=5c763cb9#.

@Elchi3
Member

Elchi3 commented Sep 27, 2019

Thanks for linking the discussion, Will! There are many ideas and issues in that document.

In BCD, we have spec_url along with mdn_url. This has landed for ECMAScript features, and Mike Smith has a fork that added more spec_urls here: https://github.com/w3c/browser-compat-data. That fork also added spec_urls to HTML features. I think I would like to try to upstream Mike's work into BCD, so that we can get spec_urls for HTML features through this route. Would you agree with that?
At a minimum, we can then take these URLs and simply reference the spec as the absolute minimum spec "table". I'd call this version 0. I think this would serve your acceptance criteria, do you think so, too? Also, "it wasn't very clear which specifications we should include", I agree, and I would double-check Mike's work as part of this. What he's done for ECMAScript features made a lot of sense (only link to the latest updated spec). So that question could be solved with this. As a plus point, all spec_urls become mass-editable.

In a follow-up, we could then experiment, return to the ideas in the document you've linked, and enrich the simple spec reference, for example with spec status or other relevant bits for the future "spec table" (ideally we would user test these ideas). Some enrichments will need more data points than just the spec_url, maybe something like a modern version of the "specdata" that we currently have in KumaScript. That requires more work, and I would rather think about it once we know what the ideal spec table is. I believe only a dedicated project with user tests and input from some spec people would tell us that. It would probably mean creating 4-5 different spec tables from the ideas, testing them with real users on the current MDN wiki pages for a few weeks, and then deciding. I wouldn't want to get into this now, but please tell me if you think we should.

@wbamberg
Author

wbamberg commented Sep 27, 2019

Thanks Florian!

I don't believe BCD is the right place for spec data. I think BCD should be specifically about browser support for web features, and I think spec links are just not in scope for that. But it seems like I'm in a minority here, so I'm not going to keep fighting on that.

If spec data lives in BCD, then that's the authoring interface for the data itself. There's still questions about, what does this look like in stumptown-content, and in the built JSON?

For example:

  1. We could have specification(s?) as a recipe element.
  2. In an item's front matter, an author could perhaps specify it using the BCD query, like: specification(s?): css.properties.margin
  3. build-json could resolve that by loading the BCD data and writing the spec link(s) into the JSON?
  4. The renderer could work out how to render that in a nice table, or whatever presentation it likes.
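
Steps 2 and 3 could be sketched roughly as follows. This is only an illustration, not the actual build-json code: `SAMPLE_BCD` is a tiny stand-in for the real BCD package, `resolve_spec_url` is a hypothetical helper name, and the sketch assumes the spec link sits under the `__compat` key of the feature, as it does in BCD.

```python
from typing import Any

# Tiny stand-in for the BCD data set (assumption: the real data is loaded
# from the browser-compat-data package and is far larger).
SAMPLE_BCD = {
    "css": {
        "properties": {
            "grid-column": {
                "__compat": {
                    "spec_url": "https://drafts.csswg.org/css-grid/#propdef-grid-column"
                }
            }
        }
    }
}


def resolve_spec_url(bcd: dict, query: str) -> Any:
    """Walk a dotted BCD query and return the feature's spec_url, if any.

    The result may be a string, a list of strings, or None.
    """
    node: Any = bcd
    for key in query.split("."):
        node = node[key]  # raises KeyError when the query matches nothing
    return node.get("__compat", {}).get("spec_url")


spec_url = resolve_spec_url(SAMPLE_BCD, "css.properties.grid-column")
```

The author only writes the query string; the lookup happens at build time, so the renderer never needs to load BCD itself.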

The only thing I can see here is that it feels weird asking authors to supply the same BCD query twice, once for BCD and once for spec data. But I like explicitly representing the existence of spec data in stumptown-content, even if we don't represent the data itself.

I think this would serve your acceptance criteria, do you think so, too?

I'm not sure about this, and I'm not sure we are at version 0. The doc I linked is as I understand it the end of a process to define a spec table, not the beginning. Well not the end because nothing's ever the end, but it is the output of some work and consultation. It makes a specific recommendation:

  • there may be more than one spec listed
  • specs must have a URL and may also have a link to provide feedback
  • specs should not include status (I'm heartily in favour of this!)

I would (and did) argue against including the feedback link, and we could choose to revisit it, but I do think we should engage with this document in building spec data into stumptown.

Also cc @chrisdavidmills , who wrote this doc.

@Elchi3
Member

Elchi3 commented Sep 28, 2019

Thanks for your detailed feedback, Will!

Fwiw, I also think the spec_urls have a better home in stumptown. I think BCD has been preferred by me and others because it provides an accessible place for them now as opposed to when stumptown is ready. But maybe there is no rush needed. I think Mike has already forked to make it work for his current use case, but my hope is that we would one day offer a solution that doesn't require people to do forking of our offerings.

There's still questions about, what does this look like in stumptown-content, and in the built JSON? For example [...]

I need to dive into these points some more. This is exactly the kind of territory I'm not so much familiar with anymore not having looked into stumptown in a while. I will get into this once we've settled on the general direction of this. Super helpful guidance for next steps, though! Thanks!

But I like explicitly representing the existence of spec data in stumptown-content, even if we don't represent the data itself.

I haven't thought about this, but it is a good point. I thought it would be fine if it would come in somehow via BCD (which is referenced once), but maybe you're right and it is better to have it explicitly in stumptown. I tend to agree actually and this is totally a sign the data itself should live in stumptown rather than in BCD.

The doc I linked is as I understand it the end of a process to define a spec table, not the beginning.

Okay, I will let Chris comment on the state of the doc, but to me it didn't look so decisive, but maybe I am overly pessimistic. :) The recommendations you summarized make sense to me and maybe our users would agree, too.

I do think we should engage with this document in building spec data into stumptown.

Okay, cool. So, I've presented a "v0" which has prior art in BCD's spec_url, and that's one way to go forward. Let me come back with a comment here when I have thought through an alternative plan that puts spec_urls in stumptown structures (and removes them from BCD). I agree to omit the feedback thing for now, and as statuses are another rather unmaintainable thing with (I believe) little use for web developers, I also agree not to add them.

After that, I think I need to dive into the questions you raised about the recipe, build-json, and the renderer. This will probably be a lot more straightforward if we have spec_urls directly in stumptown but I will attempt to think it through for the alternative solution with BCD spec_urls, too.

I'd appreciate some hand-holding as I go forward because this ticket at first seemed to me like "hey Florian, solve spec tables on MDN all and forever!" and that is totally overwhelming. So, I thought the BCD v0 work could be an easy escape path for the moment. I'm glad we're talking through options now before we dive into implementing something, though :)

@chrisdavidmills

OK, so chiming in here.

I wrote the spec tables doc after many discussions with various people about their inherent problems. Mainly

  • Confusing status, which seems different to what the status is on the actual spec linked to
  • Their manual nature being unmaintainable.
  • The notes column being largely useless
  • The need for a feedback link to allow people to give feedback on nascent specs directly from their MDN pages.

However, my earlier work and this doc did not take two things into account:

  1. Stumptown
  2. How often the spec tables are actually used, and how useful the feedback link would be.

Since I did this work, we've had no mentions or complaints of feedback links not being included (even from the folks who proposed it).

I was also an advocate of putting the spec links in the BCD, but probably because there didn't seem anywhere else for them to go.

But I really like the idea of putting them in the stumptown data. I think that we probably just need the name and URL for each spec, possibly with an optional notes value, because there are sometimes useful notes like "this spec version introduced the blah blah property".

I think both status and feedback links are red herrings. Anyone desperate enough to want those can find them by going to the spec itself. And browser implementation/stability is a better indicator of status than status, if you get what I mean.

@chrisdavidmills

The spec tables as they currently stand are a bit pants, and unmaintainable, but they are not inherently broken and a problem that needs fixing right now. Another thing that can go on the "wait till Stumptown" list.

@Elchi3
Member

Elchi3 commented Oct 2, 2019

Thanks Chris!

To me it sounds like the most non-controversial thing to do would be to just get spec_url in, so that we can link to the most recent spec and leave out all other red herrings for now. We can then later come back and enrich it at will with notes, comments, feedback, status, or anything else.

I think Mike has done exactly this with spec_url in BCD. It would probably be similarly easy to adapt his work to add spec_urls to stumptown if it were stable enough, or if we just had a similar data point for him to update programmatically. So, I guess, it could be as simple as adding the (BCD-defined) spec_url as a property in the front matter in stumptown docs.

Or, we stick to spec_url in BCD for now and populate it to the main stumptown packaged json object by resolving it from BCD. I guess it also comes down to authoring convenience. Do we think it is more straightforward for us to maintain it in BCD (with Mike's tooling) or do we want to maintain spec_url in the front matter of the docs in stumptown (and get Mike's tooling on there later)?

@wbamberg
Author

wbamberg commented Oct 2, 2019

Thanks for the updates @chrisdavidmills . That makes a lot of sense.

I think that we probably just need the name and URL for each spec

The spec_url in BCD doesn't include a name, do we think this is important? I think it makes the links more readable: CSS Grid Layout versus https://drafts.csswg.org/css-grid. In other bits of stumptown we call this title.

One question is: do we include fragments in the URLs? MDN and BCD both do, currently.

So in the front matter it might look like:

---
title: "grid-column"
mdn_url: https://developer.mozilla.org/en-US/docs/Web/CSS/grid-column
specification:
    title: "CSS Grid Layout"
    spec_url: https://drafts.csswg.org/css-grid/#propdef-grid-column
---

I quite like this: it seems quite compact and readable. One thing to consider is: any time there's a lot of duplication, we might consider whether we should do what MDN currently does and have a separate database of specs, that stumptown indexes into? For instance here, title is going to be duplicated for all the Grid properties. So you could separate out all the spec data and point into it from the front matter:

---
title: "grid-column"
mdn_url: https://developer.mozilla.org/en-US/docs/Web/CSS/grid-column
specification:
    spec_id: css_grid_spec
    fragment: propdef-grid-column
---

...and spec_id can be used to retrieve title, url and potentially other stuff from a separate specdata thing. This seems less usable for humans though.
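
A minimal sketch of that indirection, assuming a hypothetical `SPECDATA` table keyed by spec_id (the table layout and the `resolve_spec` helper are made up for illustration; only the `css_grid_spec` id, the title and the URL come from the examples above):

```python
# Hypothetical separate spec database, indexed by spec_id.
SPECDATA = {
    "css_grid_spec": {
        "title": "CSS Grid Layout",
        "url": "https://drafts.csswg.org/css-grid/",
    }
}


def resolve_spec(spec_id: str, fragment: str) -> dict:
    """Combine the shared spec entry with a page-specific fragment."""
    spec = SPECDATA[spec_id]  # raises KeyError for an unknown spec_id
    return {"title": spec["title"], "url": f"{spec['url']}#{fragment}"}


grid_column_spec = resolve_spec("css_grid_spec", "propdef-grid-column")
```

The title is stored once, but a reader of the front matter can no longer see where the link points without consulting the table.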

possibly with an optional notes value, because there are sometimes useful notes like "this spec version introduced the blah blah property".

Notes are tricky. So far I have resisted including any markup in the front matter. There is a little localizable text (notably title) but nothing really like prose. So if we wanted to do this I think we might look at a separate MD file for notes about specifications. Or something.

Do we think it is more straightforward for us to maintain it in BCD (with Mike's tooling) or do we want to maintain spec_url in the front matter of the docs in stumptown (and get Mike's tooling on there later)?

It doesn't seem very different either way.

(Obviously), stumptown is not ready for people to use yet. So if people are using this data from BCD, or are intending to start using it soon, then we have to continue to support it in BCD whether or not we also have it in stumptown. Just like with other content. But I feel that if we think it should live in stumptown (and it seems we do), then we should have it in stumptown for the experiment.

@Elchi3
Member

Elchi3 commented Oct 7, 2019

Thanks for your feedback, Will!

The spec_url in BCD doesn't include a name, do we think this is important? I think it makes the links more readable: CSS Grid Layout versus https://drafts.csswg.org/css-grid. In other bits of stumptown we call this title.

I think a title is nice to render and I think we should have it, but I'm not sure if we explicitly and repeatedly need to write it down in the front-matters of all pages. Do you think this is needed?

...and spec_id can be used to retrieve title, url and potentially other stuff from a separate specdata thing. This seems less usable for humans though.

I think spec_id is similar duplication in the front-matters. Wouldn't it be possible to have a spec mapping from the domains? Like, in all the front-matters you provide the spec_url much like you do in BCD now:

---
title: "grid-column"
mdn_url: https://developer.mozilla.org/en-US/docs/Web/CSS/grid-column
spec_url: https://drafts.csswg.org/css-grid/#propdef-grid-column
---

And then there is a "specdata thing" with a mapping domain<->title.

{
  "drafts.csswg.org/css-grid": "CSS Grid",
  "tc39.es/ecma262": "ECMAScript 262 Language Specification",
  "tc39.es/ecma402": "ECMAScript 402 Internationalization API"
}
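
One way such a lookup could work is sketched below. The `spec_title` helper and the longest-prefix rule are assumptions for illustration, not something agreed in this thread; the table entries are the ones from the mapping above.

```python
from typing import Optional
from urllib.parse import urlsplit

# Mapping from "host/path" of a spec to its human-readable title.
SPEC_TITLES = {
    "drafts.csswg.org/css-grid": "CSS Grid",
    "tc39.es/ecma262": "ECMAScript 262 Language Specification",
    "tc39.es/ecma402": "ECMAScript 402 Internationalization API",
}


def spec_title(spec_url: str) -> Optional[str]:
    """Return the title of the spec that a spec_url belongs to, if known."""
    parts = urlsplit(spec_url)
    # Drop the scheme and the fragment, keep host plus path.
    location = (parts.netloc + parts.path).rstrip("/")
    matches = [
        key
        for key in SPEC_TITLES
        if location == key or location.startswith(key + "/")
    ]
    if not matches:
        return None
    # Prefer the most specific (longest) matching key.
    return SPEC_TITLES[max(matches, key=len)]
```

With this approach the front matter carries only the URL, and the title comes from a single place.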

One question is: do we include fragments in the URLs? MDN and BCD both do, currently.

In BCD, fragments are even required to make it a valid spec_url. Linting fails if you don't provide it.
In fact, the exact definition is: "An optional URL or array of URLs, each of which is for a specific part of a specification in which this feature is defined. Each URL must contain a fragment identifier."
(the pattern is "^http(s)?:\/\/[^#]+#.+").
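
To make the effect of that pattern concrete, here is a small check written with Python's `re` module. The `has_fragment` helper name is made up; BCD itself enforces the pattern through its schema linting, not through this code.

```python
import re

# Same pattern as quoted above, without the JSON-style escaping of slashes.
SPEC_URL_PATTERN = re.compile(r"^http(s)?://[^#]+#.+")


def has_fragment(spec_url: str) -> bool:
    """True if the URL is http(s) and carries a non-empty fragment."""
    return SPEC_URL_PATTERN.match(spec_url) is not None
```

A URL that ends in a bare `#` is rejected as well, because the pattern requires at least one character after it.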

It can appear as a single spec_url

"spec_url": "https://tc39.es/ecma262/#sec-date.prototype.toisostring"

or as an array (this is for when a feature is defined in multiple specs, but it is still always pointing to the latest specs. An array is not intended to link to historical specs. Historical specs are completely banned from this):

"spec_url": [
  "https://tc39.es/ecma262/#sec-date.prototype.tolocalestring",
  "https://tc39.es/ecma402/#sec-Date.prototype.toLocaleTimeString"
],
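
Because the value can be either a string or an array, a consumer would probably want to normalize it first. A minimal sketch (the `as_url_list` name is hypothetical):

```python
from typing import List, Union


def as_url_list(spec_url: Union[str, List[str], None]) -> List[str]:
    """Normalize a spec_url value to a list of URLs."""
    if spec_url is None:
        return []  # the property is optional
    if isinstance(spec_url, str):
        return [spec_url]
    return list(spec_url)
```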

As I've designed this with Mike for BCD, I would be in favor of doing exactly the same in stumptown, but I'm open to hearing how to make this even better. However, right now, I'm not convinced there are advantages to explicitly stating title, spec_id and/or fragment over just providing a spec_url (or an array of them).

@chrisdavidmills

The discussion here is going really well. I've seen nothing I disagree with, and don't have any other burning thoughts for now, but let me know if there's anything specific you want my input on.

@Elchi3
Member

Elchi3 commented Oct 7, 2019

Another reason I don't like spec_id, is maintenance. In KumaScript, I always found it messy to have this spec_id in the macro calls. See mdn/kumascript#220 for how it was quite an adventure to get things updated.

tl;dr: In our specdata in KumaScript we have the spec urls in a specdata macro and in all the pages we have calls like {{SpecName('Web Storage', '#dom-storage-key', 'key()')}}. Now, when you want to update the URL you do that in the specdata macro, but then you also have to double check all the callers and make sure the fragment identifiers still point to the correct things. I'm afraid this would be similar with:

---
specification:
    spec_id: css_grid_spec
    fragment: propdef-grid-column
---

where you've now introduced two places as well that need updating upon spec re-arrangements.

@wbamberg
Author

wbamberg commented Oct 7, 2019

This sounds good to me.

I think readability in the source is very important. When I said spec_id "seems less usable for humans" I think that's a big mark against it. I was just trying out alternatives :).

If you don't want to duplicate things like title everywhere (and I agree you don't) then you need something to be a key into a table of specs. In your proposal you're using the URL directly, and that looks better to me.

when you want to update the URL you do that in the specdata macro, but then you also have to double check all the callers and make sure the fragment identifiers still point to the correct things

This seems to be still the case though. If the URL changes, and you're using the URL as the key, then you have to update everything (all the spec_urls and the entry in specdata).

But still the comprehensibility of using URLs directly makes them much nicer, ISTM.

@Elchi3
Member

Elchi3 commented Oct 8, 2019

Last updated: 21st Oct, 2019 (spec_url is no more, it is now called specifications)

Thanks Will!

It sounds like we've decided then. I'll try to summarize the proposal and outline next steps. Let me know if this plan sounds accurate to you all.

Specification links in stumptown reference docs

Specifications are added to stumptown MDN docs by using specifications in the front matter of reference pages. Examples:

Feature defined in a single spec:

---
title: "grid-column"
mdn_url: https://developer.mozilla.org/en-US/docs/Web/CSS/grid-column
specifications: https://drafts.csswg.org/css-grid/#propdef-grid-column
---

Feature defined in multiple specs:

---
title: "Date.prototype.toLocaleString()"
mdn_url: Web/JavaScript/Reference/Global_Objects/Date/toLocaleString
specifications: 
  - https://tc39.es/ecma262/#sec-date.prototype.tolocalestring
  - https://tc39.es/ecma402/#sup-date.prototype.tolocalestring
---

specifications is defined as "An optional URL or array of URLs, each of which is for a specific part of a specification in which this feature is defined. Each URL must contain a fragment identifier."
(the pattern is "^http(s)?://[^#]+#.+").

Historical specs aren't added. To control which specification domains are allowed (those that qualify as relevant and non-historical) and to add human readable titles for the specs, a spec data definition file is added. It looks like this:

drafts.csswg.org/css-grid: 'CSS Grid'
tc39.es/ecma262: 'ECMAScript 262 Language Specification'
tc39.es/ecma402: 'ECMAScript 402 Internationalization API'
html.spec.whatwg.org: 'WHATWG HTML Living Standard'
w3c.github.io/webappsec-cspee: 'Content Security Policy: Embedded Enforcement'
w3c.github.io/webappsec-feature-policy: 'Feature Policy'
wicg.github.io/priority-hints: 'Priority Hints'

Non-standard features

If a feature has no spec, you explicitly state so like this:

---
specifications: non-standard
---
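
A check along these lines could treat `non-standard` as a sentinel and test everything else against the allowed specs. This is only a sketch: `check_specifications` and `ALLOWED_SPECS` are hypothetical names, and the set below holds just a few of the entries from the spec data file above.

```python
from typing import List, Union
from urllib.parse import urlsplit

NON_STANDARD = "non-standard"

# A few of the allowed specs from the spec data definition file.
ALLOWED_SPECS = {
    "drafts.csswg.org/css-grid",
    "tc39.es/ecma262",
    "tc39.es/ecma402",
    "html.spec.whatwg.org",
}


def check_specifications(value: Union[str, List[str]]) -> List[str]:
    """Return a list of problems; an empty list means the value is fine."""
    if value == NON_STANDARD:
        return []  # explicitly marked as having no spec
    urls = [value] if isinstance(value, str) else list(value)
    problems = []
    for url in urls:
        parts = urlsplit(url)
        location = parts.netloc + parts.path
        allowed = any(
            location == spec or location.startswith(spec + "/")
            for spec in ALLOWED_SPECS
        )
        if not allowed:
            problems.append(f"{url}: specification is not in the allow-list")
    return problems
```

Keeping the sentinel explicit lets a linter tell "no spec exists" apart from "nobody has added the spec yet".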

To-dos

Things we won't do

  • Have historical specs from current MDN added
  • Have spec comments added
  • Have spec status listed
  • Have a spec feedback mechanism added

@wbamberg
Author

wbamberg commented Oct 8, 2019

Yes, this looks great to me.

Add "spec data" json (where? what to call it? how to expose it?)

The way we've handled this kind of thing so far is for stumptown-content to take care of this, so stumptown-renderer doesn't have to care. So build-json could see spec_url and use the value as a key into specdata.json, look up the spec name from there, and write URL and name into the built JSON.
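
As a rough sketch of that build step (not the real build-json code): the front matter is shown as an already-parsed dict, the property uses its later name `specifications`, `SPECDATA` stands in for the spec data file, and the shape of the output entries (`url` plus `title`) is an assumption.

```python
import json
from urllib.parse import urlsplit

# Stand-in for the spec data file, keyed by "host/path" of each spec.
SPECDATA = {
    "tc39.es/ecma262": "ECMAScript 262 Language Specification",
    "tc39.es/ecma402": "ECMAScript 402 Internationalization API",
}


def build_specifications(front_matter: dict) -> list:
    """Turn front-matter spec URLs into self-contained {url, title} entries."""
    raw = front_matter.get("specifications", [])
    urls = [raw] if isinstance(raw, str) else list(raw)
    entries = []
    for url in urls:
        parts = urlsplit(url)
        # Exact-key lookup; specs with deeper paths would need prefix matching.
        key = (parts.netloc + parts.path).rstrip("/")
        entries.append({"url": url, "title": SPECDATA.get(key)})
    return entries


front_matter = {
    "title": "Date.prototype.toLocaleString()",
    "specifications": [
        "https://tc39.es/ecma262/#sec-date.prototype.tolocalestring",
        "https://tc39.es/ecma402/#sup-date.prototype.tolocalestring",
    ],
}

page_json = {
    "title": front_matter["title"],
    "specifications": build_specifications(front_matter),
}
print(json.dumps(page_json, indent=2))
```

The renderer then receives both the link and its title in the page JSON and has no need to read the spec data file.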

I think it would be OK to keep specdata.json in stumptown-content. It's (presumably) primarily edited by MDN editors, so it seems helpful to keep it in the same place(?).

@Elchi3
Member

Elchi3 commented Oct 17, 2019

The way we've handled this kind of thing so far is for stumptown-content to take care of this, so stumptown-renderer doesn't have to care. So build-json could see spec_url and use the value as a key into specdata.json, look up the spec name from there, and write URL and name into the built JSON.

@wbamberg I think I've only understood just now what you mean here. I think you mean that the front-matter spec_url from abbr.md should not go as is into the packaged abbr.json but instead is enhanced with the information provided in specifications.json and is added with that to abbr.json then.

Before realizing this, I thought about specifications.json as data we want to expose to any renderer or stumptown consumer and they would use it then as they wish. So, I thought, it would make sense to create content/data/specifications.json which is then bundled to packaged/data/specifications.json. In this data folder (and thus the packaged data folder), we would then add any data that is relevant and it would be directly packaged. I've followed this idea in this branch: https://github.com/Elchi3/stumptown-content/commit/6df61e57eb873a214f83138880767eb5d7152162

But given what you said in the comment above and that I just realized now, this might not make sense as you say we would rather prepare data and add it to the relevant spots in our page jsons as opposed to exposing it raw in a data directory. (Exposing this raw data in addition to having it added to the page json would be an option, too, I guess, but then if nothing is using it, it won't age well.)

I would appreciate some thoughts, so that I know if I'm on the correct way of doing this.

@wbamberg
Author

I think you mean that the front-matter spec_url from abbr.md should not go as is into the packaged abbr.json but instead is enhanced with the information provided in specifications.json and is added with that to abbr.json then.

Yes, that is what I mean. So far we've tried to present something to the renderer that it can use without having to do a bunch of cross-referencing with other data sources. For example, take BCD. The content as authored specifies BCD as just the query string "html.elements.input". We could just write that directly into the JSON, but we don't: we fetch the actual BCD JSON and write that into the page JSON, so the renderer doesn't have to do that dance, and so each page JSON is self-contained.

Of course it doesn't have to be this way, but that's been our approach so far.

Similarly, when you said:

I think a title is nice to render and I think we should have it, but I'm not sure if we explicitly and repeatedly need to write it down in the front-matters of all pages. Do you think this is needed?

...I agree, but it seems like the cost of this repetition is mostly borne by authors, so it's for the benefit of authors that we could abstract it in a separate specs.json file. As far as the built JSON is concerned, it's not meant for humans, just for machines, and they (it is assumed) would prefer the repetition, since it makes their code simpler and more direct (the only drawback is the size of the files, which might turn out to be a problem, or not).

So there are really two interfaces: stumptown-content presents an authoring interface for humans, and build-json converts that into a JSON interface for machines.

@Elchi3
Member

Elchi3 commented Oct 18, 2019

So far we've tried to present something to the renderer that it can use without having to do a bunch of cross-referencing with other data sources.
We could just write that directly into the JSON, but we don't: we fetch the actual BCD JSON and write that into the page JSON, so the renderer doesn't have to do that dance, and so each page JSON is self-contained.

Cool! I can see how this is very useful for the renderer and actually for anyone who consumes the page jsons as it is all there. As we know from experience with VS Code, it doesn't matter too much how we expose things, so I think if we're consistent and make the renderer's life easy, then the initial packaged JSON or the "stumptown content API" is good enough.

So there are really two interfaces: stumptown-content presents an authoring interface for humans, and build-json converts that into a JSON interface for machines.

Thanks for the conceptual explanation! It all makes sense to me. I guess I'm living too much in the BCD world where authoring interface is at the same time the JSON interface for machines.
But I think this is a nice design principle behind stumptown that we should write down somewhere if it isn't already. I guess BCD would have benefited from that as well, although most people seem to be okay with editing JSON.

I will try to implement spec_url for the machines then :) Sorry to be slow here, there is still a bit of new stumptown ground that I need to familiarize myself with.

@jmswisher

Moving this to Sprint 2. Now that the discussion has clarified things, does this need to be broken into smaller stories for implementation?

@wbamberg
Author

Thanks Janet.

This is very close to done, so no (I expect it will be closed this week actually).

@wbamberg
Author

@Elchi3
Member

Elchi3 commented Oct 24, 2019

Potential follow-ups:

  • Validate specifications according to "^http(s)?://[^#]+#.+" when the linter has landed.
  • Document in stumptown writer's guide how specs work in the new world.


4 participants