CLDR-16836 kbd: add EBNF to spec for transforms #4261
121aab0
to
afa4af5
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot
fyi @stasm and @aphillips
a461f03
to
308448c
Compare
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
- add keyboard abnf and sample files and automated tests
Temporarily skip non-BMP chars, see hildjj/node-abnf#25 which is being fixed.
308448c
to
d38f212
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
Looks good!
- The EBNF links should point to https://unicode.org/reports/tr35/#ebnf. (Since the LDML EBNF is a superset of the vanilla EBNF, that should work.)
- The validity / well-formedness constraints (e.g. no more than 9 capture groups) should use the W3C syntax.
Thanks!
I will add this: The following is the [LDML EBNF](./tr35.md#ebnf) format for the grammar:
The limit of 9 capture groups counts inner (nested) capture groups too. So for example, valid: /(first)(second)(third)(fourth)(fifth)(sixth)(seventh)(eighth)(?:And possibly, (ninth))?/ but invalid: /(first)(second)(third)(fourth)(fifth)(sixth)(seventh)(eighth)(?:And possibly, (ninth))?(?:But not, (tenth)!)?/ Because the groups can be nested, the limit may be challenging to express in the grammar; it's more like a resource (slot) limit.
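Since the slot limit is hard to state in the grammar, it could be checked as a separate step. A minimal sketch, assuming Python's `re` dialect is close enough to the transform syntax for the two examples above (it is not the LDML keyboard regex dialect); `MAX_CAPTURE_GROUPS`, `count_capture_groups`, and `within_capture_limit` are hypothetical names, not part of the spec or its tooling:

```python
import re

# Assumption: the limit of 9 is taken from the discussion above.
MAX_CAPTURE_GROUPS = 9


def count_capture_groups(pattern: str) -> int:
    """Count all capturing groups, including those nested inside (?:...)."""
    return re.compile(pattern).groups


def within_capture_limit(pattern: str) -> bool:
    """Return True if the pattern uses no more than MAX_CAPTURE_GROUPS slots."""
    return count_capture_groups(pattern) <= MAX_CAPTURE_GROUPS


VALID = (
    r"(first)(second)(third)(fourth)(fifth)(sixth)(seventh)(eighth)"
    r"(?:And possibly, (ninth))?"
)
INVALID = VALID + r"(?:But not, (tenth)!)?"
```

Here the non-capturing `(?:...)` wrappers do not consume a slot, but the groups nested inside them do, which is why the second pattern exceeds the limit.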
199f93e
to
c0e32b4
Compare
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
I'd like to get some Java based tooling, but that will be in a separate PR. |
docs/ldml/tr35-keyboards.md
Outdated
| DIGIT
| '_'
ASCII-CTRLS
::= [#x1-#x8#xB-#xC#xE-#x1F]
these are not allowed in XML 1.0
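For reference, this can be verified against the `Char` production of XML 1.0, which permits only #x9, #xA, #xD, and the ranges #x20-#xD7FF, #xE000-#xFFFD, and #x10000-#x10FFFF. A small sketch; `is_xml10_char` and `ASCII_CTRLS` are illustrative names, and the code point list simply mirrors the `ASCII-CTRLS` rule quoted above:

```python
def is_xml10_char(cp: int) -> bool:
    """Return True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Code points covered by [#x1-#x8#xB-#xC#xE-#x1F] in the quoted rule.
ASCII_CTRLS = [*range(0x1, 0x9), *range(0xB, 0xD), *range(0xE, 0x20)]

# Every one of them falls outside the XML 1.0 Char production.
rejected = [cp for cp in ASCII_CTRLS if not is_xml10_char(cp)]
```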
Yes, my comment about surrogates is redundant because they are already
covered in the BNF. As to errors, what I was trying to express is that an
implementation would reject a keyboard layout with an error… But I can
refer to that as a constraint for consistency
I requested changes for the constraints.
- remove illegal ctrls - reorder members - cleanup
docs/ldml/tr35-keyboards.md
Outdated
```ebnf
[ wfc: No more than 9 capture groups may be present. ]
[ vc: all variables referenced must be defined in the <variables> element ]
[ vc: The CLDR repository may define additional constraints on the repertoire, such as requiring all characters to be in a published Unicode version and disallowing private-use characters. ]
```
[ vc: The CLDR repository may define additional constraints on the repertoire, such as requiring all characters to be in a published Unicode version and disallowing private-use characters. ]
[ vc: If a keyboard definition is submitted to the CLDR repository, it must satisfy additional constraints on the character repertoire. For more information, see [CLDR keyboard repertoire constraints](#repertoire-constraints). ]
We have to point to where those constraints are documented. So add a little section header for that and point to it. Also make a change to a similar line below.
I suggest that the contents of that section be:
- No characters can have any of the following General_Category values in the latest version of the Unicode Standard:
- Private Use (Co)
- Surrogate (Cs)
- Unassigned (Cn)
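The proposed repertoire check could be sketched as follows. This is only an illustration under stated assumptions: `disallowed_chars` is a hypothetical helper, and `unicodedata` reports categories for the Unicode version bundled with the running Python interpreter, which is not necessarily the latest version of the Unicode Standard:

```python
import unicodedata

# General_Category values proposed above as disallowed:
# Private Use (Co), Surrogate (Cs), Unassigned (Cn).
DISALLOWED_CATEGORIES = {"Co", "Cs", "Cn"}


def disallowed_chars(text: str) -> list[str]:
    """Return the characters of text whose General_Category is disallowed."""
    return [
        ch for ch in text
        if unicodedata.category(ch) in DISALLOWED_CATEGORIES
    ]
```

For example, `disallowed_chars("a\ue000b")` flags only the private-use character U+E000.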
Not sure we can link from ebnf to spec.
Also, the repository requirements are a separate activity. I don't see a reason to specify them here in detail (though I'll take the list back to kbd-wg). Perhaps it's better to just say: see a future version of the spec.
What about this
[ vc: If a keyboard definition is submitted to the CLDR repository, it must satisfy additional constraints on the character repertoire. ]
The problem with that is there is no way for the reader to find out what those requirements are — we can't have undefined vc requirements.
Then I'd propose to drop it from vc and just add a note. Or not even a note. We don't have the requirements yet.
Yes, there's no way to find out the requirements because they don't exist yet.
@miloush how does it look? |
CLDR-16836
ALLOW_MANY_COMMITS=true