Make the "content*" keywords annotations only #767

Merged
merged 2 commits into from
Aug 9, 2019
74 changes: 42 additions & 32 deletions jsonschema-validation.xml
@@ -776,18 +776,19 @@

<section title="Foreword">
<t>
Properties defined in this section indicate that an instance contains
Annotations defined in this section indicate that an instance contains
non-JSON data encoded in a JSON string.
They describe the type of content and how it is encoded.
</t>
<t>
These properties provide additional information required to interpret JSON data
as rich multimedia documents.
as rich multimedia documents. They describe the type of content, how it is encoded,
and/or how it may be validated. They do not function as validation assertions;
a malformed string-encoded document MUST NOT cause the containing instance
to be considered invalid.
</t>
<t>
Meta-schemas that do not use "$vocabulary" SHOULD be considered to
require this vocabulary as if its URI were present with a value of true,
although see the Implementation Requirements below for details.
require this vocabulary as if its URI were present with a value of true.
</t>
<t>
The current URI for this vocabulary, known as the Content vocabulary, is:
@@ -801,16 +802,35 @@

<section title="Implementation Requirements">
<t>
The content keywords function as both annotations and as assertions.
While no special effort is required to implement them as annotations conveying
how applications can interpret the data in the string, implementing
validation of conformance to the media type and encoding is non-trivial.
Due to security and performance concerns, as well as the open-ended nature of
possible content types, implementations MUST NOT automatically decode, parse,
and/or validate the string contents by default. This additionally supports
the use case of embedded documents intended for processing by a different
consumer than that which processed the containing document.
</t>
<t>
All keywords in this section apply only to strings, and have no
effect on other data types.
</t>
<t>
Implementations MAY support the "contentMediaType" and "contentEncoding"
keywords as validation assertions.
Should they choose to do so, they SHOULD offer an option to disable validation
for these keywords.
Implementations MAY offer the ability to decode, parse, and/or validate
the string contents automatically. However, it MUST NOT perform these
operations by default, and MUST provide the validation result of each
string-encoded document separately from the enclosing document. This
process SHOULD be equivalent to fully evaluating the instance against
the original schema, followed by using the annotations to decode, parse,
and/or validate each string-encoded document.
<cref>
For now, the exact mechanism of performing and returning parsed
data and/or validation results from such an automatic decoding, parsing,
and validating feature is left unspecified. Should such a feature
prove popular, it may be specified more thoroughly in a future draft.
</cref>
</t>
<t>
See also the <xref target="security">Security Considerations</xref>
sections for possible vulnerabilities introduced by automatically
processing the instance string according to these keywords.
</t>
</section>
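
Reviewer note: a minimal Python sketch of the behavior this hunk describes, offered as an illustration rather than as the behavior of any real implementation. The names `evaluate_content`, `ContentResult`, and the `process_content` flag are hypothetical, and only `base64` and `application/json` are handled. The point is that the content keywords never change the validity of the containing instance, processing is off by default, and the outcome for the string-encoded document is reported separately.

```python
import base64
import binascii
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ContentResult:
    """Outcome of optionally processing one string-encoded document (hypothetical type)."""
    processed: bool               # False when processing was not requested or not applicable
    ok: Optional[bool] = None     # None when not processed
    value: Any = None             # decoded and/or parsed document, if any
    error: Optional[str] = None   # reason the embedded document was rejected, if any


def evaluate_content(instance, annotations, process_content=False):
    """Return (instance_valid, content_result).

    The content keywords act as annotations only, so instance_valid is never
    changed here; a malformed embedded document shows up only in content_result.
    """
    instance_valid = True
    # Keywords apply only to strings, and processing is opt-in (off by default).
    if not process_content or not isinstance(instance, str):
        return instance_valid, ContentResult(processed=False)
    try:
        data = instance.encode("utf-8")
        if annotations.get("contentEncoding") == "base64":
            data = base64.b64decode(data, validate=True)
        value = data
        if annotations.get("contentMediaType") == "application/json":
            value = json.loads(data.decode("utf-8"))
        return instance_valid, ContentResult(True, True, value)
    except ValueError as exc:  # covers binascii.Error, UnicodeDecodeError, JSONDecodeError
        return instance_valid, ContentResult(True, False, error=str(exc))
```

Returning a pair keeps the two results apart, which is one way to satisfy the requirement that the validation result of each string-encoded document be provided separately from the enclosing document. The diff itself leaves the exact mechanism unspecified.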

@@ -841,29 +861,18 @@
<t>
The value of this property MUST be a string.
</t>

<t>
The value of this property SHOULD be ignored if the instance described is not a
string.
</t>

</section>

<section title="contentMediaType">
<t>
If the instance is a string, this property defines the media type
If the instance is a string, this property indicates the media type
of the contents of the string. If "contentEncoding" is present,
this property describes the decoded string.
</t>
<t>
The value of this property MUST be a string, which MUST be a media type,
as defined by <xref target="RFC2046">RFC 2046</xref>.
</t>

<t>
The value of this property SHOULD be ignored if the instance described is not a
string.
</t>
</section>

<section title="contentSchema">
@@ -876,8 +885,7 @@
JSON Schema's data model.
</t>
<t>
The value of this property SHOULD be ignored if the instance described is not a
string, or if "contentMediaType" is not present.
The value of this property SHOULD be ignored if "contentMediaType" is not present.
</t>
</section>

@@ -897,8 +905,8 @@
]]>
</artwork>
<postamble>
Instances described by this schema should be strings, and their values
should be interpretable as base64-encoded PNG images.
Instances described by this schema are expected to be strings,
and their values should be interpretable as base64-encoded PNG images.
</postamble>
</figure>
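
Reviewer note: an application that opts in to checking the first example might do something like the following Python sketch. The function name `looks_like_base64_png` is hypothetical, and the check covers only the base64 alphabet and the 8-byte PNG signature, not a full PNG parse. A `False` result is information for the application and does not make the instance invalid.

```python
import base64
import binascii

# The fixed 8-byte signature that starts every PNG datastream.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_base64_png(instance):
    """Return True/False for strings, or None when the keywords do not apply."""
    if not isinstance(instance, str):
        return None  # content keywords have no effect on non-string instances
    try:
        raw = base64.b64decode(instance, validate=True)
    except (binascii.Error, ValueError):
        return False  # not valid base64, so it cannot be a base64-encoded PNG
    return raw.startswith(PNG_SIGNATURE)
```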

@@ -915,8 +923,9 @@
]]>
</artwork>
<postamble>
Instances described by this schema should be strings containing HTML, using
whatever character set the JSON string was decoded into. Per section 8.1 of
Instances described by this schema are expected to be strings containing HTML,
using whatever character set the JSON string was decoded into.
Per section 8.1 of
<xref target="RFC8259">RFC 8259</xref>, outside of an entirely closed
system, this MUST be UTF-8.
</postamble>
@@ -1100,7 +1109,7 @@
</section>
</section>

<section title="Security Considerations">
<section title="Security Considerations" anchor="security">
<t>
JSON Schema validation defines a vocabulary for JSON Schema core and concerns all
the security considerations listed there.
@@ -1276,6 +1285,7 @@
<t>Moved "definitions" to the core spec as "$defs"</t>
<t>Moved applicator keywords to the core spec</t>
<t>Renamed the array form of "dependencies" to "dependentRequired", moved the schema form to the core spec</t>
<t>Specified all "content*" keywords as annotations, not assertions</t>
<t>Added "contentSchema" to allow applying a schema to a string-encoded document</t>
<t>Also allow RFC 4648 encodings in "contentEncoding"</t>
<t>Added "minContains" and "maxContains"</t>