DM-36803: Initial support for setting metadata through technote.toml #3

Merged: jonathansick merged 10 commits into main from tickets/DM-36803 on Nov 4, 2022

@jonathansick jonathansick commented Nov 2, 2022

This PR adds Pydantic models that define the schema for the technote.toml file, which technotes use to set bibliographic metadata and technical build configuration.

The schema also includes Pydantic types that validate ORCiD and ROR IDs, as well as support for validating licenses by SPDX identifier. This code is adapted from Lander (https://github.com/lsst-sqre/lander), and there could be a case for creating a core metadata library that both Lander and Technote share.

Here's a sample technote.toml:

[technote]
id = "SQR-000"
series = "SQR"
date_created = "2022-10-31"
date_updated = "2022-10-31"
canonical_url = "https://sqr-000.lsst.io/"
github_url = "https://github.com/lsst-sqre/sqr-000"
license = {id = "CC-BY-4.0"}
version = "1.0.0" # or omit to leave the technote unversioned

[[technote.authors]]
orcid = "https://orcid.org/0000-0003-3001-676X"
name = {family_names = "Sick", given_names = "Jonathan"}
internal_id = "sickj"

[[technote.contributors]]
name = {name = "Frossie"}
role = "Editor"

This code is largely ported from Lander,
https://github.com/lsst-sqre/lander (in fact, we might want to look at
creating a bibliographic metadata package, like a modern version of
lsst-projectmeta-kit, but generalized for community standards).

This code interfaces to the SPDX license list and in fact vendors the
3.18 version of the license list data from
https://github.com/spdx/license-list-data/tree/v3.18/json

This will allow us to canonically identify content licenses for
technotes.
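
As a rough illustration of how a vendored license list can back this kind of validation, the sketch below looks up an identifier in JSON shaped like SPDX's licenses.json. The inline sample data and the helper names (`load_license_index`, `validate_license_id`) are assumptions made for this example, not the code added in this PR.

```python
import json

# A tiny stand-in for the vendored SPDX licenses.json (illustrative subset only).
SAMPLE_LICENSES_JSON = """
{
  "licenseListVersion": "3.18",
  "licenses": [
    {"licenseId": "CC-BY-4.0",
     "name": "Creative Commons Attribution 4.0 International",
     "isDeprecatedLicenseId": false},
    {"licenseId": "MIT",
     "name": "MIT License",
     "isDeprecatedLicenseId": false}
  ]
}
"""


def load_license_index(raw_json):
    """Map each SPDX license ID to its entry in the license list."""
    data = json.loads(raw_json)
    return {entry["licenseId"]: entry for entry in data["licenses"]}


def validate_license_id(license_id, index):
    """Return the ID unchanged if it is in the list; raise ValueError otherwise."""
    if license_id not in index:
        raise ValueError(f"{license_id!r} is not a known SPDX license identifier")
    return license_id


index = load_license_index(SAMPLE_LICENSES_JSON)
print(index[validate_license_id("CC-BY-4.0", index)]["name"])
```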

The ORCiD and ROR identifier types are largely ported from Lander,
https://github.com/lsst-sqre/lander, and enable us to validate ORCiD and
ROR identifiers in bibliographic metadata.
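
For context, ORCiD validation usually means checking both the format and the final check character, which ORCID computes with the ISO 7064 MOD 11-2 algorithm. The following is a minimal standard-library sketch of that check. The function names and the choice to normalize to the URL form are assumptions for illustration and may differ from the Pydantic types in this PR.

```python
import re

# Accept either the bare identifier or the https://orcid.org/ URL form.
ORCID_PATTERN = re.compile(
    r"^(?:https://orcid\.org/)?([0-9]{4})-([0-9]{4})-([0-9]{4})-([0-9]{3}[0-9X])$"
)


def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def validate_orcid(value):
    """Return the ORCiD as a URL, or raise ValueError if it is malformed."""
    match = ORCID_PATTERN.match(value)
    if match is None:
        raise ValueError(f"{value!r} does not look like an ORCiD")
    groups = match.groups()
    digits = "".join(groups)
    if orcid_check_character(digits[:15]) != digits[15]:
        raise ValueError(f"{value!r} has an invalid ORCiD checksum")
    return "https://orcid.org/" + "-".join(groups)


print(validate_orcid("https://orcid.org/0000-0003-3001-676X"))
```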

Note I had to disable the error-on-warnings for the doc build because I
couldn't figure out how to avoid the type analysis error on the unicode
type in these pydantic types.

The controlled vocabulary for contributor roles comes from the Zenodo
docs: https://developers.zenodo.org/#representation

It enables us to use a consistent vocabulary for referring to non-author
contributions to a technote.
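
One way to picture a controlled vocabulary like this is as an enumeration that rejects unknown values. The sketch below uses a small, illustrative subset of Zenodo's contributor types. The `ContributorRole` and `parse_role` names are hypothetical, and the full list in this PR may differ.

```python
from enum import Enum


class ContributorRole(str, Enum):
    """Illustrative subset of Zenodo contributor types (not the full list)."""

    contact_person = "ContactPerson"
    data_curator = "DataCurator"
    editor = "Editor"
    project_leader = "ProjectLeader"
    researcher = "Researcher"
    other = "Other"


def parse_role(value):
    """Convert a string from technote.toml into a role, or raise ValueError."""
    try:
        return ContributorRole(value)
    except ValueError:
        allowed = ", ".join(role.value for role in ContributorRole)
        raise ValueError(
            f"Unknown role {value!r}; expected one of: {allowed}"
        ) from None


print(parse_role("Editor").name)
```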

Technotes are configured with a technote.toml file, in keeping with our
adoption of TOML for user-friendly configuration (and consistent with
the direction Python is going). Right now this TOML configuration mostly
covers the basic bibliographic metadata needs for technotes, though
we'll also need to expand the configuration scheme to handle technical
configuration for the Sphinx project.

Overall, technote.toml treats bibliographic metadata as optional. This
is in keeping with the desire to make rich metadata something that you
can opt into, so that the document then expresses that data to
end-users. If you just want to start writing, all you need to do is
start a document, write, and push to GitHub.
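
To make the opt-in idea concrete, here is a minimal sketch in which every metadata field has a default, so an empty configuration is still valid. It uses plain dataclasses and an already-parsed dictionary rather than Pydantic and a TOML parser. The `TechnoteMetadata` class and `metadata_from_config` function are hypothetical names, not the models in this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class TechnoteMetadata:
    """Hypothetical metadata container where nothing is required."""

    id: Optional[str] = None
    series: Optional[str] = None
    canonical_url: Optional[str] = None
    authors: List[Dict[str, Any]] = field(default_factory=list)


def metadata_from_config(config):
    """Build metadata from a parsed technote.toml dictionary."""
    table = config.get("technote", {})
    return TechnoteMetadata(
        id=table.get("id"),
        series=table.get("series"),
        canonical_url=table.get("canonical_url"),
        authors=list(table.get("authors", [])),
    )


# An empty configuration is valid: all metadata is simply absent.
print(metadata_from_config({}))
# Metadata that the author opts into is carried through.
print(metadata_from_config({"technote": {"id": "SQR-000", "series": "SQR"}}))
```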

https://autodoc-pydantic.readthedocs.io/en/stable/index.html

This extends autodoc by specifically documenting Pydantic types. It
seems to be working well, so I think the next step is to move this into
documenteer[guide].

I looked into validating the DOI via a regular expression but decided
the regex may not be reliable enough given the complexity of the DOI
spec, so we'll rely on users to copy and paste DOIs accurately. We can
revisit validation strategies in the future as needed.

This status is intended for communicating the overall state of a
document, such as whether it is in planning, being actively written,
stable, or deprecated. The status isn't intended for fine-grained
states like "in review".
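
A coarse status like this can be sketched as a small enumeration that deliberately has no entry for workflow states such as "in review". The `TechnoteState` name and its string values below are assumptions for illustration. The identifiers used in this PR may differ.

```python
from enum import Enum


class TechnoteState(str, Enum):
    """Hypothetical coarse document states, mirroring the list above."""

    planning = "planning"
    active = "active"
    stable = "stable"
    deprecated = "deprecated"


def parse_state(value):
    """Return the matching state, or raise ValueError for anything finer-grained."""
    try:
        return TechnoteState(value)
    except ValueError:
        allowed = ", ".join(state.value for state in TechnoteState)
        raise ValueError(
            f"Unsupported state {value!r}; expected one of: {allowed}"
        ) from None


print(parse_state("deprecated").value)
```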

The names and docstrings should make the API docs more helpful when
writing technote.toml configuration.
@jonathansick

Includes support for describing a technote's state, including deprecating a technote.

[Two screenshots attached: CleanShot 2022-11-04 at 13 07 31, CleanShot 2022-11-04 at 13 08 26]

@jonathansick jonathansick marked this pull request as ready for review November 4, 2022 17:09
@jonathansick jonathansick merged commit 35894ab into main Nov 4, 2022
@jonathansick jonathansick deleted the tickets/DM-36803 branch November 4, 2022 17:27