Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Spark] Fix dependent constraints/generated columns checker for type widening #3912

Conversation

Alexvsalexvsalex
Copy link
Contributor

Which Delta project/connector is this regarding?

  • Spark
  • Standalone
  • Flink
  • Kernel
  • Other (fill in here)

Description

The current checker of dependent expressions doesn't validate changes for array and map types. For example, usage of type widening could lead to constraints breaks:

scala> sql("CREATE TABLE table (a array<byte>) USING DELTA")
scala> sql("INSERT INTO table VALUES (array(1, -2, 3))")
scala> sql("SELECT hash(a[1]) FROM table").show()
+-----------+
| hash(a[1])|
+-----------+
|-1160545675|
+-----------+

scala> sql("ALTER TABLE table ADD CONSTRAINT ch1 CHECK (hash(a[1]) = -1160545675)")
scala> sql("ALTER TABLE table SET TBLPROPERTIES('delta.enableTypeWidening' = true)")
scala> sql("ALTER TABLE table CHANGE COLUMN a.element TYPE BIGINT")
scala> sql("SELECT hash(a[1]) FROM table").show()
+----------+
|hash(a[1])|
+----------+
|-981642528|
+----------+

scala> sql("INSERT INTO table VALUES (array(1, -2, 3))")
24/11/15 12:53:23 ERROR Utils: Aborting task
com.databricks.sql.transaction.tahoe.schema.DeltaInvariantViolationException: [DELTA_VIOLATE_CONSTRAINT_WITH_VALUES] CHECK constraint ch1 (hash(a[1]) = -1160545675) violated by row with values:

The proposed algorithm is more strict and regards maps, arrays and structs during constraints/generated columns dependencies.

How was this patch tested?

Added new tests for constraints and generated columns used with type widening feature.

Does this PR introduce any user-facing changes?

Due to strictness of the algorithm new potential dangerous type changes will be prohibited. An exception will be thrown in the example above.
But such changes are called in the schema evolution feature mostly that was introduced recently, so it should not affect many users.

Copy link
Collaborator

@johanl-db johanl-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this!

It should be fairly safe to ship with the fallback path to the old behavior in case anything goes wrong.

Copy link
Collaborator

@bart-samwel bart-samwel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG, I'll merge it.

@bart-samwel bart-samwel merged commit 2937bc8 into delta-io:master Dec 3, 2024
16 of 19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants