Fix Context Processing Algorithm 5.2.4 #150

Merged
merged 1 commit into from
Apr 21, 2021


skodapetr
Contributor

Step 5.2.4 of the Context Processing Algorithm in the specification (https://www.w3.org/TR/json-ld11-api/#algorithm) states:

If context was previously dereferenced, then the processor MUST NOT do a further dereference, and context is set to the previously established internal representation: set context document to the previously dereferenced document, and set loaded context to the value of the @context entry from the document in context document.

The current implementation does not follow the specification: caching is done at the level of the processed context, not the context document as the specification requires. This becomes a problem when the same remote context document is loaded multiple times.

With the current implementation, where the whole processed context is cached, I get an error on the following example:

{
  "@context": [
    {
      "@version": 1.1,
      "Pracovní místo": {
        "@id": "https://slovník.gov.cz/datový/pracovní-místa/pojem/pracovní-místo",
        "@context": {
          "kvalifikace": {
            "@id": "https://slovník.gov.cz/datový/pracovní-místa/pojem/požadovaná-kvalifikace",
            "@context": "http://example.com/práce.json"
          }
        }
      }
    },
    {
      "@version": 1.1,
      "věda": "https://slovník.gov.cz/generický/věda-a-výzkum/pojem/",
      "Výzkumné pracoviště": {
        "@id": "věda:ResearchOrganization",
        "@context": [
          "http://example.com/práce.json",
          {
            "orjk": {
              "@id": "věda:orjk"
            }
          }
        ]
      }
    }
  ],
  "typ": "Pracoviště",
  "iri": "https://data.mff.cuni.cz/zdroj/číselník/organizační-struktura/oddělení/204"
}

where it fails to resolve věda:orjk.

The reason is that, when the context for the class Výzkumné pracoviště is loaded, the remote context from http://example.com/práce.json is taken from the cache. The problem is that the cached context was resolved against the first context, the one with Pracovní místo, and thus contains no definition of věda.
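
To make the failure mode concrete, here is a minimal sketch in Java. It is a toy model, not the library's code: a context is reduced to a plain term-to-IRI map, "processing" is just merging the parent's terms with the remote document's terms, and the class name, the fetch stub, the URL and all terms are made up for illustration. It only shows why a cache of processed results keyed by URL alone hands back a result tied to the first parent.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model, not the library's code: a "context" is a plain term -> IRI map.
final class StaleContextCacheDemo {

    // Processed results keyed by URL only; the parent context is not part of the key.
    private final Map<String, Map<String, String>> processedByUrl = new HashMap<>();

    // Stand-in for dereferencing a remote context; the returned term is made up.
    static Map<String, String> fetch(String url) {
        return Map.of("remoteTerm", "http://example.com/remoteTerm");
    }

    // "Processing" here is simply: parent terms plus the remote document's terms.
    Map<String, String> load(Map<String, String> parent, String url) {
        return processedByUrl.computeIfAbsent(url, u -> {
            Map<String, String> merged = new HashMap<>(parent);
            merged.putAll(fetch(u));
            return merged;
        });
    }

    public static void main(String[] args) {
        StaleContextCacheDemo cache = new StaleContextCacheDemo();
        String url = "http://example.com/remote.json"; // hypothetical URL

        // First use: the parent knows nothing about the "veda" prefix.
        cache.load(Map.of("job", "http://example.com/job"), url);

        // Second use: the parent defines "veda", but the cached result wins.
        Map<String, String> second =
                cache.load(Map.of("veda", "http://example.com/veda/"), url);

        System.out.println(second.containsKey("veda")); // prints false
    }
}
```

In this toy, the second caller receives the map built for the first parent, so a compact IRI using the veda prefix could not be expanded, which mirrors the věda:orjk failure in the example.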

The solution is to simply cache the JSON documents, as I believe the specification states. Then, when the remote context from http://example.com/práce.json is loaded from the cache, it is properly resolved against the current parent. As a result, the term věda is available for the resolution of věda:orjk.
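
The document-level caching described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code changed in this PR: a context is again a plain term-to-IRI map, processing is a simple merge with the parent, and the class name, the injected fetcher, the URL and all terms are hypothetical. The point is that only the dereferenced document is cached, while processing is repeated against whichever parent is current.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model, not the library's code: caches raw documents, never processed results.
final class ContextDocumentCacheSketch {

    private final Map<String, Map<String, String>> documentsByUrl = new HashMap<>();
    private final Function<String, Map<String, String>> fetcher; // stand-in for dereferencing
    int fetchCount = 0; // counts real dereferences, for demonstration only

    ContextDocumentCacheSketch(Function<String, Map<String, String>> fetcher) {
        this.fetcher = fetcher;
    }

    Map<String, String> load(Map<String, String> parent, String url) {
        // Dereference at most once per URL, in the spirit of step 5.2.4.
        Map<String, String> document = documentsByUrl.get(url);
        if (document == null) {
            fetchCount++;
            document = fetcher.apply(url);
            documentsByUrl.put(url, document);
        }
        // Process the cached document against the *current* parent every time.
        Map<String, String> active = new HashMap<>(parent);
        active.putAll(document);
        return active;
    }

    public static void main(String[] args) {
        ContextDocumentCacheSketch cache = new ContextDocumentCacheSketch(
                u -> Map.of("remoteTerm", "http://example.com/remoteTerm"));
        String url = "http://example.com/remote.json"; // hypothetical URL

        cache.load(Map.of("job", "http://example.com/job"), url);
        Map<String, String> second =
                cache.load(Map.of("veda", "http://example.com/veda/"), url);

        System.out.println(second.containsKey("veda")); // prints true
        System.out.println(cache.fetchCount);           // prints 1
    }
}
```

The trade-off is that the processing work is repeated on every use; only the network fetch is saved. In exchange, the result always reflects the parent context that is active at the point of use.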

@filip26
Owner

filip26 commented Apr 21, 2021

Hi,
thank you for the PR. The motivation for caching processed contexts is to save time and avoid repeating the same computation multiple times. As far as I understand the issue you ran into, the current caching implementation introduces an unwanted side effect: a bug.

Feel free to open issues, or other PRs.

@filip26 filip26 merged commit 9abc55a into filip26:main Apr 21, 2021
@jakubklimek

@filip26 thanks for merging this. Would you consider releasing this bugfix also to Maven?

@filip26
Owner

filip26 commented Apr 21, 2021

Good question, there have been more changes since 1.0.0. Let's plan the 1.0.3 release for Friday 4/23.

@filip26 filip26 added this to the 1.0.3 milestone Apr 22, 2021
@filip26 filip26 linked an issue Apr 22, 2021 that may be closed by this pull request
Successfully merging this pull request may close these issues.

INVALID_IRI_MAPPING when using specific context in form of an array
3 participants