Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add migrated version metadata endpoint #565

Open
dchiquito opened this issue Oct 15, 2021 · 9 comments
Open

Add migrated version metadata endpoint #565

dchiquito opened this issue Oct 15, 2021 · 9 comments
Assignees
Labels
enhancement New feature or request metadata Issues of dandiset/asset metadata handling

Comments

@dchiquito
Copy link
Contributor

Implement https://github.com/dandi/dandi-api/blob/master/doc/design/draft-metadata-migration.md

Add an endpoint GET api/dandisets/{dandiset_pk}/versions/{version}/migrated/ that attempts to migrate the version metadata to the current schemaVersion. If the metadata is already at the current schemaVersion, nothing is done. If the migration fails for whatever reason, simply return the current metadata.

The meditor will use this new endpoint to get the metadata for a version, ensuring that whenever a user saves the meditor content, the schemaVersion is as up to date as possible.

@waxlamp waxlamp added the enhancement New feature or request label Jan 1, 2022
@waxlamp waxlamp added the metadata Issues of dandiset/asset metadata handling label Mar 8, 2023
@waxlamp waxlamp added this to the Dandiset lifecycle milestone Mar 8, 2023
@jjnesbitt
Copy link
Member

jjnesbitt commented Mar 27, 2023

@satra @yarikoptic, regarding the proposed implementation, I think trying to automagically migrate the user's metadata to the new schema is a mistake. Since this is a publishing concern (i.e. we block publish if the version metadata is not one of the allowed schema versions), I think it makes since to raise the issue at publish, and provide ways for the user to migrate the data. The situation I'm trying to avoid is silently migrating the version metadata out from under the user, they don't notice, and the metadata is now incorrect (not invalid) because of it.

My thought was that if their version metadata is unpublishable, we would have a button on the DLP that says something like "Upgrade dandiset metadata", and that would call this new metadata migration endpoint. It would update the metadata and open the meditor for the user to view the new migrated data. They can then look at everything to ensure it makes sense, and save it. Now their dandiset is publishable again, and they've manually approved the metadata.

@satra
Copy link
Member

satra commented Mar 27, 2023

@AlmightyYakob - just to clarify that this particular issue is explicitly about the DLP/dandiset metadata, which is only editable through the meditor (or if someone uses the API directly). It was also there to provide a route where the DLP does not need to consider metadata versions. it is possible that migration fails, but that should be something we should handle as part of some CI.

while i would want to ensure that our current schemas are always applicable, this particular scenario is about dandiset metadata, while the other publishing issue was more related to asset metadata, which is a different issue, not this one.

@jjnesbitt
Copy link
Member

@AlmightyYakob - just to clarify that this particular issue is explicitly about the DLP/dandiset metadata, which is only editable through the meditor (or if someone uses the API directly). It was also there to provide a route where the DLP does not need to consider metadata versions. it is possible that migration fails, but that should be something we should handle as part of some CI.

while i would want to ensure that our current schemas are always applicable, this particular scenario is about dandiset metadata, while the other publishing issue was more related to asset metadata, which is a different issue, not this one.

I don't understand your point, I was not referring to asset metadata. I was referring to this design doc, which exclusively refers to dandiset/version metadata.

@yarikoptic
Copy link
Member

In general I like the idea of explicit auto-metadata migration @AlmightyYakob upon user request! It could visualize some kind of a diff to the user etc. Also it should be then a record in the perspective audit DB (#525). We have the design doc you refer to and I am not entirely clear on what is the discussion exactly about here ;)

@satra
Copy link
Member

satra commented Mar 28, 2023

setting aside direct api calls, most users will use the meditor to make changes to the metadata. thus auto upgrade of the schema version is completely within our control, the user doesn't even see it. and i don't think we want to create a version of the meditor that is responsive to multiple schemas.

@yarikoptic - i don't know what it would mean to ask the user to update metadata according to changes in schema version. why would it be relevant to an end user when they only see options in a visual interface and have no concept a schema let alone a version?

@yarikoptic
Copy link
Member

@yarikoptic - i don't know what it would mean to ask the user to update metadata according to changes in schema version. why would it be relevant to an end user when they only see options in a visual interface and have no concept a schema let alone a version?

Some thoughts

  • In typical use cases indeed it should of no interest to the user! But user might appreciate knowing that "there is a process which now changes the record which I previously entered", so if user discovers some oddity after metadata upgrade, the user would not question his/her sanity on why she entered some oddly looking data which we managed to misformat or anyhow modify/or misassign in our upgrade.
  • If e.g. schema defines new mandatory fields that interaction with the user would be the only way to fulfill it and would make sense in "oh, now it is required".
  • It is end-user who entered metadata before and some might be interested to see what new metadata supports now, e.g. may be some metadata got dedicated new fields for which they used some more loosely defined before and they would prefer to tune it up.

@satra
Copy link
Member

satra commented Mar 29, 2023

i've not heard any convincing argument thus far of why this should be made into a user clickable experience. please do consider the UX of a scientist (not a software developer who understands schemas). also consider that the meditor itself has to support multiple versions of the schema.

my vote would be to auto upgrade, but i leave this now to you and @AlmightyYakob to decide. whatever you decide, i will insist that we don't want a dandiset publishable with an older schema version of the metadata than the current one.

@yarikoptic
Copy link
Member

@kabilar let's plan to have this addressed

@yarikoptic
Copy link
Member

yarikoptic commented Jan 22, 2025

Echoing @satra I think we might want to just add a flag migrate=False to /dandisets/{dandiset__pk}/versions/{version}/ (dandiset_versions_read), by default, efficiently return metadata without migration, but whenever desired - e.g. by meditor - that client would state migrate=True and where possible migrated record would be returned.

Without this, we kept pretending to manipulate current version of metadata while keeping schemaVersion in the records as old as when they were initially created:

dandi@drogon:/mnt/backup/dandi/dandisets$ grep -h schemaVersion */dandiset.yaml  | sort | uniq -c
      8 schemaVersion: 0.4.4
    139 schemaVersion: 0.6.0
     26 schemaVersion: 0.6.2
     85 schemaVersion: 0.6.3
    311 schemaVersion: 0.6.4
     12 schemaVersion: 0.6.6
    111 schemaVersion: 0.6.7
    109 schemaVersion: 0.6.8

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request metadata Issues of dandiset/asset metadata handling
Projects
None yet
Development

No branches or pull requests

5 participants