[SPARK-29680][SQL] Remove ALTER TABLE CHANGE COLUMN syntax #26338
@@ -369,6 +369,13 @@ class ResolveSessionCatalog(
      AlterTableRecoverPartitionsCommand(
        v1TableName.asTableIdentifier,
        "ALTER TABLE RECOVER PARTITIONS")

    case AlterTableChangeColumnStatement(tableName, columnName, newColumn) =>
      val v1TableName = parseV1Table(tableName, "ALTER TABLE CHANGE COLUMN")
It seems this command can also work on v2 if the old column name and the new column name are the same. cc @cloud-fan
@@ -151,7 +151,7 @@ statement
     | ALTER TABLE multipartIdentifier
         (ALTER | CHANGE) COLUMN? qualifiedName
         (TYPE dataType)? (COMMENT comment=STRING)? colPosition?    #alterTableColumn
-    | ALTER TABLE tableIdentifier partitionSpec?
+    | ALTER TABLE multipartIdentifier partitionSpec?
Hmm, doesn't the parser rule above already cover this one?
The rule above is for the v2 command. The two rules differ slightly: the v1 rule can specify a new column name together with a data type (i.e., colType), but the v2 rule can only specify a data type.
I did not combine the two rules and the two statements. A combined version would need several Option fields plus logic to interpret them as the v1 or v2 case, which sounds a bit messy.
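To make the trade-off concrete, here is a minimal sketch of what a single combined statement might look like. All names here (`CombinedStatementSketch`, `ChangeColumn`, `V1Form`, `V2Form`, `interpret`) are hypothetical and are not taken from Spark; the sketch assumes that only the v1 form can carry a new column name, as described in this thread.

```scala
object CombinedStatementSketch {
  // Hypothetical merged statement: every field that only one form uses becomes an Option.
  final case class ChangeColumn(
      table: Seq[String],      // multi-part table name
      column: Seq[String],     // multi-part column name
      newName: Option[String], // assumption: only the v1 colType form supplies this
      newType: Option[String], // either form may supply a data type
      comment: Option[String])

  sealed trait Form
  case object V1Form extends Form
  case object V2Form extends Form

  // The interpretation step the comment worries about: guess the form from which Options are set.
  def interpret(stmt: ChangeColumn): Form =
    if (stmt.newName.isDefined) V1Form else V2Form

  def main(args: Array[String]): Unit = {
    val v1 = ChangeColumn(Seq("t"), Seq("a"), Some("a"), Some("INT"), Some("new comment"))
    val v2 = ChangeColumn(Seq("cat", "db", "t"), Seq("a"), None, Some("BIGINT"), None)
    println(interpret(v1))
    println(interpret(v2))
  }
}
```

Even in this small form, the meaning of each field depends on which other fields are set, which is the messiness mentioned above.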
The v2 rule:
(ALTER | CHANGE) COLUMN? qualifiedName (TYPE dataType)? (COMMENT comment=STRING)?
The colType rule used by v1:
colName=errorCapturingIdentifier dataType (COMMENT STRING)?
It seems the v1 rule is more powerful: it can rename a column and change its data type in one statement. If other databases support that as well, maybe Spark should support it too. Otherwise, 3.0 may be a good time to drop this syntax.
It seems we do not actually support it, even though the syntax is allowed:
spark/sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala
Lines 343 to 351 in 7aca0dd
    // Find the origin column from dataSchema by column name.
    val originColumn = findColumnByName(table.dataSchema, columnName, resolver)
    // Throw an AnalysisException if the column name/dataType is changed.
    if (!columnEqual(originColumn, newColumn, resolver)) {
      throw new AnalysisException(
        "ALTER TABLE CHANGE COLUMN is not supported for changing column " +
          s"'${originColumn.name}' with type '${originColumn.dataType}' to " +
          s"'${newColumn.name}' with type '${newColumn.dataType}'")
    }
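The effect of the check quoted above can be illustrated with a self-contained sketch. It is not Spark's code: `Column` is a simplified stand-in for a schema field, `validate` returns an `Either` instead of throwing `AnalysisException`, and the sketch assumes the comparison covers only the name (through the resolver) and the data type, as the code comment in the snippet suggests.

```scala
object ChangeColumnCheckSketch {
  // Simplified stand-in for a schema field; the data type is a plain string here.
  final case class Column(name: String, dataType: String, comment: Option[String] = None)

  // A resolver decides whether two names refer to the same column.
  type Resolver = (String, String) => Boolean
  val caseInsensitive: Resolver = (a, b) => a.equalsIgnoreCase(b)

  // Assumption: columns count as equal when name and data type match; the comment is ignored.
  def columnEqual(a: Column, b: Column, resolver: Resolver): Boolean =
    resolver(a.name, b.name) && a.dataType == b.dataType

  // Returns Left with the error message rather than throwing.
  def validate(origin: Column, updated: Column, resolver: Resolver): Either[String, Column] =
    if (columnEqual(origin, updated, resolver)) Right(updated)
    else Left(
      "ALTER TABLE CHANGE COLUMN is not supported for changing column " +
        s"'${origin.name}' with type '${origin.dataType}' to " +
        s"'${updated.name}' with type '${updated.dataType}'")

  def main(args: Array[String]): Unit = {
    val a = Column("a", "int")
    println(validate(a, Column("a", "int", Some("note")), caseInsensitive)) // comment-only change
    println(validate(a, Column("a2", "int"), caseInsensitive))              // rename attempt
    println(validate(a, Column("a", "string"), caseInsensitive))            // type change attempt
  }
}
```

Under these assumptions only the comment-only change passes, which matches the observation that the v1 syntax accepts a new name and type but the command rejects both.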
Not sure whether the original plan was to support that in the future.
Let's simply remove this syntax. We have ALTER TABLE CHANGE COLUMN and ALTER TABLE RENAME COLUMN, which are good enough for me.
OK, let's remove it.
Test build #113006 has finished for PR 26338 at commit
Test build #113122 has finished for PR 26338 at commit
Test build #113123 has finished for PR 26338 at commit
@@ -2,54 +2,38 @@
CREATE TABLE test_change(a INT, b STRING, c INT) using parquet;
DESC test_change;

-- Change column name (not supported yet)
Do we have a test file for RENAME COLUMN? If not, can we add some tests here to keep the test coverage?
ok.
Test build #113192 has finished for PR 26338 at commit
Test build #113238 has finished for PR 26338 at commit
Test build #113237 has finished for PR 26338 at commit
Test build #113240 has finished for PR 26338 at commit
Test build #113245 has finished for PR 26338 at commit
retest this please.
Test build #113249 has finished for PR 26338 at commit
@@ -149,11 +149,8 @@ statement
     | ALTER (TABLE | VIEW) multipartIdentifier
         UNSET TBLPROPERTIES (IF EXISTS)? tablePropertyList             #unsetTableProperties
     | ALTER TABLE multipartIdentifier
-        (ALTER | CHANGE) COLUMN? qualifiedName
+        (ALTER | CHANGE) COLUMN? multipartIdentifier
Previously we used qualifiedName: identifier ('.' identifier)* to capture the column name.
This conflicts with a test in ErrorParserSuite, which expects that test-col is not allowed in ALTER TABLE t CHANGE COLUMN test-col TYPE BIGINT.
The column name should consist of multiple errorCapturingIdentifier parts, so I changed it to multipartIdentifier:
multipartIdentifier
    : parts+=errorCapturingIdentifier ('.' parts+=errorCapturingIdentifier)*
    ;
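As a rough illustration of the behavior the test expects, the following toy sketch splits a dotted name into parts and rejects an unquoted part such as test-col. It is not Spark's ANTLR parser: the object name, the regular expressions, the error text, and the naive split on '.' (which ignores dots inside back-quotes) are all assumptions made for this example.

```scala
object MultipartIdentifierSketch {
  // Toy rules: a part is either back-quoted or made of letters, digits and underscores.
  private val Unquoted = "[A-Za-z_][A-Za-z0-9_]*".r
  private val Quoted = "`([^`]+)`".r

  // Validates one part of the name, stripping back-quotes when present.
  def parsePart(part: String): Either[String, String] = part match {
    case Quoted(inner) => Right(inner)
    case Unquoted()    => Right(part)
    case _             => Left(s"unquoted identifier contains unsupported characters: $part")
  }

  // Splits on '.' and validates every part; the first failing part decides the error.
  def parse(name: String): Either[String, Seq[String]] = {
    val parts = name.split('.').toSeq.map(parsePart)
    parts.collectFirst { case Left(err) => err } match {
      case Some(err) => Left(err)
      case None      => Right(parts.collect { case Right(p) => p })
    }
  }

  def main(args: Array[String]): Unit = {
    println(parse("point.x"))    // two valid parts
    println(parse("test-col"))   // rejected: hyphen in an unquoted part
    println(parse("`test-col`")) // accepted once back-quoted
  }
}
```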
It looks like we should replace qualifiedName with multipartIdentifier in all other places. We can do it in a follow-up.
OK, I will make a follow-up later.
Test build #113275 has finished for PR 26338 at commit
@@ -149,11 +149,8 @@ statement
     | ALTER (TABLE | VIEW) multipartIdentifier
         UNSET TBLPROPERTIES (IF EXISTS)? tablePropertyList             #unsetTableProperties
     | ALTER TABLE multipartIdentifier
We can distinguish the two multipartIdentifier occurrences by labeling them:
ALTER TABLE table=multipartIdentifier ... COLUMN? column=multipartIdentifier
You can fix it in your follow-up.
ok.
thanks, merging to master!
### What changes were proposed in this pull request?

Revert #26338, as the syntax is actually the [hive style ALTER COLUMN](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-ChangeColumnName/Type/Position/Comment). This PR brings it back, and make it support multi-catalog:
1. renaming is not allowed as `AlterTableAlterColumnStatement` can't do renaming.
2. column name should be multi-part

### Why are the changes needed?

to not break hive compatibility.

### Does this PR introduce any user-facing change?

no, as the removal was merged in 3.0.

### How was this patch tested?

new parser tests

Closes #27076 from cloud-fan/alter.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
This patch removes the v1 ALTER TABLE CHANGE COLUMN syntax.
Why are the changes needed?
Since v2 already has ALTER TABLE CHANGE COLUMN and ALTER TABLE RENAME COLUMN, the old syntax is no longer necessary and can be confusing.
The v2 ALTER TABLE CHANGE COLUMN should fall back to the v1 AlterTableChangeColumnCommand (#26354).
Does this PR introduce any user-facing change?
Yes, the old v1 ALTER TABLE CHANGE COLUMN syntax is removed.
How was this patch tested?
Unit tests.