ML Model schema definition #1

Closed

devisperessutti

@rbavery

Here is my attempt at creating the schema.json definition for the ML Model extension.

This is a WIP version, but I'd like to get some early feedback on it (hopefully it is somewhat useful). You might see that I'm not that familiar with the STAC specs, and I'm learning as I go along.

There are open questions marked with comments, and some have associated questions in the specification file.

The main issues I'd need help with are:

  • Parameters Object
  • Statistics Object
  • Asset Object
  • how to properly include and reference other STAC extensions it relies on.

I put this file out of the json-schema folder for now, as it might be easier to review without comparison to the existing dlm. Let me know if you'd rather check the diffs.

@devisperessutti devisperessutti marked this pull request as draft March 6, 2024 16:05
@devisperessutti devisperessutti changed the title WIP: first commit of MLM schema ML Model schema definition Mar 6, 2024
@fmigneault
Collaborator

It would be preferable to keep the schema structure proposed by STAC (see other extensions or the one under current DLM) with the top-level oneOf/allOf combination to validate and reuse the fields between the STAC Item and Collection cases.
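
As a rough illustration of that layout, a minimal sketch of the top-level oneOf/allOf combination could look like the following. It is modeled on the general pattern used by STAC extension schemas, not taken from this PR: the `$id`, the `mlm:name` field, and the definition names `stac_extensions` and `fields` are assumptions made for the example.

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "https://example.com/mlm/v0.0.0/schema.json",
  "$comment": "Sketch only: the $id, the 'mlm:name' field and the definition names are hypothetical.",
  "oneOf": [
    {
      "$comment": "STAC Item case: extension fields live under 'properties'.",
      "allOf": [
        { "$ref": "#/definitions/stac_extensions" },
        {
          "type": "object",
          "required": ["type", "properties"],
          "properties": {
            "type": { "const": "Feature" },
            "properties": {
              "allOf": [
                { "required": ["mlm:name"] },
                { "$ref": "#/definitions/fields" }
              ]
            }
          }
        }
      ]
    },
    {
      "$comment": "STAC Collection case: the same field definitions are reused, here for 'item_assets'.",
      "allOf": [
        { "$ref": "#/definitions/stac_extensions" },
        {
          "type": "object",
          "required": ["type"],
          "properties": {
            "type": { "const": "Collection" },
            "item_assets": {
              "type": "object",
              "additionalProperties": { "$ref": "#/definitions/fields" }
            }
          }
        }
      ]
    }
  ],
  "definitions": {
    "stac_extensions": {
      "type": "object",
      "required": ["stac_extensions"],
      "properties": {
        "stac_extensions": {
          "type": "array",
          "contains": { "const": "https://example.com/mlm/v0.0.0/schema.json" }
        }
      }
    },
    "fields": {
      "$comment": "Shared field definitions; requirements are imposed per case above, not here.",
      "type": "object",
      "properties": {
        "mlm:name": { "type": "string" }
      }
    }
  }
}
```

The design point is that the field definitions are written once under `definitions` and referenced from both branches, while each branch decides which fields are required and where they may appear.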

how to properly include and reference other STAC extensions it relies on.

For this, it should use the official URI of these extensions.
For example, the Model Output Object that needs classification:classes should do something like:

{
  "classification:classes": {
    "$ref": "https://stac-extensions.github.io/classification/v1.1.0/schema.json#fields/properties/classification:classes"
  }
}

Comment on lines +446 to +468
"mlm:properties": { // TODO: update/change these
  "type": "object",
  "required": [
    "properties"
  ],
  "properties": {
    "properties": {
      "$comment": "Optional metadata that provides more details about provenance.",
      "": [
        {
          "$ref": "https://schemas.stacspec.org/v1.0.0-beta.2/item-spec/json-schema/instrument.json"
        },
        {
          "$ref": "https://schemas.stacspec.org/v1.0.0-beta.2/item-spec/json-schema/licensing.json"
        },
        {
          "$ref": "https://schemas.stacspec.org/v1.0.0-beta.2/item-spec/json-schema/provider.json"
        },
        {
          "$ref": "https://schemas.stacspec.org/v1.0.0-beta.2/item-spec/json-schema/datetime.json"
        }
      ]
    }
Collaborator

These should be at the root of the STAC Item and validated by the core schema. There is no need to duplicate them as mlm:properties.
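
To make this concrete, here is a hedged sketch of a STAC Item in which the provenance-related fields sit directly in the Item's `properties` next to the extension's own prefixed fields, rather than being nested under `mlm:properties`. The extension URI, the `mlm:name` field, and all values are hypothetical and only serve the illustration.

```json
{
  "type": "Feature",
  "stac_version": "1.0.0",
  "stac_extensions": [
    "https://example.com/mlm/v0.0.0/schema.json"
  ],
  "id": "example-model-item",
  "geometry": null,
  "properties": {
    "datetime": "2024-03-06T16:05:00Z",
    "license": "Apache-2.0",
    "providers": [
      { "name": "Example Organization", "roles": ["producer"] }
    ],
    "mlm:name": "example-model"
  },
  "links": [],
  "assets": {}
}
```

With this layout, `datetime`, `license`, and `providers` are checked by the core STAC Item schema, and the extension schema only needs to describe the `mlm:`-prefixed fields.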

Author

Thank you @fmigneault for the feedback.

I'd need your guidance here as some things are not clear to me.

This schema follows the current DLM schema, but I see what you mean when looking at the recommended STAC structure (for instance, in the ml-model extension), which uses oneOf/allOf for the STAC Item and STAC Collection schemas, with required_fields/common_fields.

Does this mean that the definitions here also need to be reshuffled into common_fields?

If I understand correctly I should:

  • use oneOf/allOf to define the schema for STAC Items. Should this also include the official URIs of the other required extensions?
  • use oneOf/allOf to define the schema for STAC Collections
  • reshuffle the definitions into common_fields?

I'd like to take time to tackle this, but don't want to hinder progress due to my inexperience, so let me know how to best proceed.

In the meantime, I will add the examples and validate them as per crim-ca#7 (comment).

Author

I see now what you mean, looking at the current DLM extension schema: https://github.com/rbavery/dlm-extension/blob/4eb30dab98617dfee4cc6b1e5dbc1d50d4770779/json-schema/schema.json

Great, it's already done, so I will close this PR.

@rbavery
Owner

rbavery commented Mar 19, 2024

Hi @devisperessutti sorry I missed this PR, it somehow slipped past me in the notifications. I'll review this week!

3 participants