This repository has been archived by the owner on Oct 28, 2024. It is now read-only.

Add a categorical field type [native values version] #62

Closed
wants to merge 13 commits into from

Conversation

khusmann
Contributor

@khusmann khusmann commented Apr 23, 2024

Here's a native values version of the categorical proposal; see #48 for a version based on physical/lexical values instead.

Key differences:

  • If the native format supports categorical data types, the field MUST be represented in that native format.

  • The categories property is now optional: it's not strictly necessary when the native format already supports categorical variables. (But perhaps we want to require it anyway, since it defines the logical type?)

  • When the native format doesn't support categorical types, the field can use any native type instead (as suggested by @pschumm). I'm not quite sure about this; I'd be more comfortable if it were limited to numeric / string types, but I can't think of a specific counterexample at the moment.

  • With native types, the logical representation of levels (for constraints) becomes much harder to pin down in the spec, because categorical no longer has a "base" data type (as @roll observed): the "logical" type may be integer in some implementations and closer to string in others. I've implemented @pschumm's solution here, in which the constraint definition simply matches the full level definition.
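
To make the points above a bit more concrete, here is a rough Python sketch of how a consumer might treat an optional categories property over native values. It is only an illustration, not the wording of the PR: the descriptor layout, the value/label keys, and the helper names native_values and is_valid are all assumptions made for this example.

```python
# Hypothetical field descriptor. Only the idea of an optional "categories"
# property comes from the discussion; the "value"/"label" layout is assumed.
field = {
    "name": "fruit",
    "type": "categorical",
    "categories": [
        {"value": 1, "label": "apple"},
        {"value": 2, "label": "orange"},
    ],
}


def native_values(field):
    """Return the declared native values, or None if categories is omitted."""
    categories = field.get("categories")
    if categories is None:
        return None
    # Accept either bare values or {"value": ..., "label": ...} objects.
    return [c["value"] if isinstance(c, dict) else c for c in categories]


def is_valid(field, value):
    """Check a native value against the declared categories, if any."""
    allowed = native_values(field)
    if allowed is None:
        # No categories declared: defer to the native format's own
        # categorical type (assumed to be enforced elsewhere).
        return True
    return value in allowed
```

Because the comparison is done on native values, the integer 1 is accepted for this field while the string "1" is not, which is the kind of implementation-dependent "logical" type the last bullet describes.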

I think between this and my updates to #48, I've addressed everyone's comments so far... but please let me know if I've overlooked anything!

@roll
Member

roll commented Apr 25, 2024

Great work @khusmann !
