Commit
Create Attribute Type Rules for Interval Join, Scatter Plot, Sort Partitions, Type Casting (#2005)

This is the second PR for the attribute type checking feature. The first one is #1924.

## Description of Attribute Type Rules

### Interval Join

`leftAttributeName` and `rightAttributeName` must each be `integer`, `long`, `double`, or `timestamp`. In addition, the `rightAttributeName` attribute must have the same type as the `leftAttributeName` attribute.

```JSON
{
  "attributeTypeRules": {
    "leftAttributeName": {
      "enum": ["integer", "long", "double", "timestamp"]
    },
    "rightAttributeName": {
      "const": {
        "$data": "leftAttributeName"
      }
    }
  }
}
```

Note: We intentionally put the `enum` test before the `const` test so that the attribute type itself is validated first. If the `const` test (i.e. the `rightAttributeName` rule) came first and `leftAttributeName` had an invalid type such as `string`, the user would be told that `rightAttributeName` should have the same attribute type as `leftAttributeName` (`string`), which is misleading because neither attribute may be a `string`.

### Scatter Plot

The `xColumn` and `yColumn` attributes must be of type `integer` or `double`.

```JSON
{
  "attributeTypeRules": {
    "xColumn": {
      "enum": ["integer", "double"]
    },
    "yColumn": {
      "enum": ["integer", "double"]
    }
  }
}
```

Note: It may support `long` in the future. See #1954.

### Sort Partitions

The `sortAttributeName` attribute type must be `integer`, `long`, or `double`.

```JSON
{
  "attributeTypeRules": {
    "sortAttributeName": {
      "enum": ["integer", "long", "double"]
    }
  }
}
```

Note: It may support `timestamp` in the future. See #1954.

### Type Casting

The allowed source types depend on the target type. For example, to convert an attribute to `integer`, the attribute must have type `string`, `long`, `double`, or `boolean`. A type cannot be cast to itself. See the schema below for details.

```JSON
{
  "attributeTypeRules": {
    "attribute": {
      "allOf": [
        {
          "if": { "resultType": { "valEnum": ["integer"] } },
          "then": { "enum": ["string", "long", "double", "boolean"] }
        },
        {
          "if": { "resultType": { "valEnum": ["double"] } },
          "then": { "enum": ["string", "integer", "long", "boolean"] }
        },
        {
          "if": { "resultType": { "valEnum": ["boolean"] } },
          "then": { "enum": ["string", "integer", "long", "double"] }
        },
        {
          "if": { "resultType": { "valEnum": ["long"] } },
          "then": { "enum": ["string", "integer", "double", "boolean", "timestamp"] }
        },
        {
          "if": { "resultType": { "valEnum": ["timestamp"] } },
          "then": { "enum": ["string", "long"] }
        }
      ]
    }
  }
}
```

Note: The type constraint is enforced in `core/amber/src/main/scala/edu/uci/ics/texera/workflow/common/tuple/schema/AttributeTypeUtils.scala`.

---------

Co-authored-by: Yicong Huang <17627829+Yicong-Huang@users.noreply.github.com>
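To illustrate how the rules in this description might be evaluated, here is a minimal Python sketch. It is not the project's validator: the function names `check_rule` and `validate`, and the two input dictionaries (`prop_values` for operator property values such as `resultType`, `attr_types` for the type of the attribute selected by each property), are hypothetical. It also assumes that `$data` names a sibling property whose attribute type must match, and that an `if` branch fires when every listed property value is in its `valEnum`.

```python
from typing import Any, Dict, Optional


def check_rule(rule: Dict[str, Any], attr_type: str,
               prop_values: Dict[str, Any],
               attr_types: Dict[str, str]) -> Optional[str]:
    """Return an error message, or None if attr_type satisfies the rule."""
    # enum: the attribute type must be one of the listed types.
    if "enum" in rule and attr_type not in rule["enum"]:
        return f"type must be one of {rule['enum']}, got {attr_type}"
    # const with $data: the type must equal that of another property's attribute.
    if "const" in rule:
        other = rule["const"]["$data"]
        expected = attr_types[other]
        if attr_type != expected:
            return f"type must match {other} ({expected}), got {attr_type}"
    # allOf: every if/then branch whose condition holds must be satisfied.
    for branch in rule.get("allOf", []):
        condition = branch["if"]
        fires = all(prop_values.get(prop) in test["valEnum"]
                    for prop, test in condition.items())
        if fires:
            error = check_rule(branch["then"], attr_type, prop_values, attr_types)
            if error is not None:
                return error
    return None


def validate(rules: Dict[str, Any], prop_values: Dict[str, Any],
             attr_types: Dict[str, str]) -> Dict[str, str]:
    """Check each property in rule order; return {property: error message}."""
    errors: Dict[str, str] = {}
    for prop, rule in rules.items():
        error = check_rule(rule, attr_types[prop], prop_values, attr_types)
        if error is not None:
            errors[prop] = error
    return errors


# Rules copied from the Interval Join section.
interval_join_rules = {
    "leftAttributeName": {"enum": ["integer", "long", "double", "timestamp"]},
    "rightAttributeName": {"const": {"$data": "leftAttributeName"}},
}

# Two of the Type Casting branches, copied from the schema.
type_casting_rules = {
    "attribute": {
        "allOf": [
            {"if": {"resultType": {"valEnum": ["integer"]}},
             "then": {"enum": ["string", "long", "double", "boolean"]}},
            {"if": {"resultType": {"valEnum": ["timestamp"]}},
             "then": {"enum": ["string", "long"]}},
        ]
    }
}

if __name__ == "__main__":
    # Both attributes are strings: only the enum rule on the left side reports.
    print(validate(interval_join_rules, {},
                   {"leftAttributeName": "string", "rightAttributeName": "string"}))
    # Casting an integer attribute to integer is rejected by the first branch.
    print(validate(type_casting_rules, {"resultType": "integer"},
                   {"attribute": "integer"}))
```

Because the sketch walks the rules in declaration order, the `enum` failure on `leftAttributeName` is reported before any `const` mismatch on `rightAttributeName`, which mirrors the ordering rationale in the Interval Join note.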