[SPARK-23418][SQL]: Fail DataSourceV2 reads when user schema is passed, but not supported. #20603

Closed
wants to merge 1 commit into from

rdblue
Contributor

@rdblue rdblue commented Feb 13, 2018

What changes were proposed in this pull request?

DataSourceV2 initially allowed a user-supplied schema even when the source did not implement `ReadSupportWithSchema`, as long as that schema was identical to the source's own schema. This behavior is confusing: a change to the underlying table can make a previously working job fail with an exception saying that user-supplied schemas are not allowed. With this change, a read that passes a user schema to a source without `ReadSupportWithSchema` always fails.
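
The difference between the old and new behavior can be sketched in plain Java. This is a simplified model, not Spark's code: the `ReadSupport` and `ReadSupportWithSchema` interfaces below are hypothetical one-method stand-ins for the DataSourceV2 mix-ins, schemas are modeled as lists of column names, and `resolveLenient`, `resolveStrict`, and the error message are names invented for illustration.

```java
import java.util.List;
import java.util.Optional;

final class UserSchemaCheck {
    /** Hypothetical stand-in: a source that only reports its own schema. */
    interface ReadSupport {
        List<String> schema();
    }

    /** Hypothetical stand-in: a source that can read with a caller-provided schema. */
    interface ReadSupportWithSchema {
        List<String> schemaFor(List<String> userSchema);
    }

    /** Model of the old behavior: an identical user schema is tolerated. */
    static List<String> resolveLenient(Object source, Optional<List<String>> userSchema) {
        if (userSchema.isPresent() && source instanceof ReadSupportWithSchema) {
            return ((ReadSupportWithSchema) source).schemaFor(userSchema.get());
        }
        if (!(source instanceof ReadSupport)) {
            throw new IllegalArgumentException("source requires a user-supplied schema");
        }
        List<String> actual = ((ReadSupport) source).schema();
        // The check depends on the table's current schema, so it can start failing later.
        if (userSchema.isPresent() && !userSchema.get().equals(actual)) {
            throw new IllegalArgumentException("user-supplied schemas are not allowed");
        }
        return actual;
    }

    /** Model of the new behavior: a user schema is rejected unless the source supports it. */
    static List<String> resolveStrict(Object source, Optional<List<String>> userSchema) {
        if (userSchema.isPresent()) {
            if (source instanceof ReadSupportWithSchema) {
                return ((ReadSupportWithSchema) source).schemaFor(userSchema.get());
            }
            // Fails regardless of what the table currently looks like.
            throw new IllegalArgumentException("user-supplied schemas are not allowed");
        }
        if (!(source instanceof ReadSupport)) {
            throw new IllegalArgumentException("source requires a user-supplied schema");
        }
        return ((ReadSupport) source).schema();
    }

    public static void main(String[] args) {
        List<String> jobSchema = List.of("id", "name");
        ReadSupport tableBefore = () -> List.of("id", "name");
        ReadSupport tableAfter = () -> List.of("id", "name", "email");

        // Old behavior: works at first, then breaks once the table gains a column.
        System.out.println(resolveLenient(tableBefore, Optional.of(jobSchema)));
        try {
            resolveLenient(tableAfter, Optional.of(jobSchema));
        } catch (IllegalArgumentException e) {
            System.out.println("lenient, after table change: " + e.getMessage());
        }

        // New behavior: the same job is rejected from the start.
        try {
            resolveStrict(tableBefore, Optional.of(jobSchema));
        } catch (IllegalArgumentException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```

In this model the lenient check succeeds or fails depending on the table's current state, while the strict check gives the same answer on every run, which is the consistency this PR is after.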

This reverts commit adcb25a0624, which was added to #20387 so that it could be removed in a separate JIRA issue and PR.

How was this patch tested?

Existing tests.

@rdblue
Contributor Author

rdblue commented Feb 13, 2018

@cloud-fan, here's the PR to fix user-supplied schema behavior. Once #20387 is committed, I'll rebase and remove its commits from this PR.

@SparkQA

SparkQA commented Feb 13, 2018

Test build #87428 has finished for PR 20603 at commit f623080.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Feb 14, 2018

Test build #87431 has finished for PR 20603 at commit bd06193.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

This reverts commit adcb25a06240dc413f58b2d1240405b0a5485578.
@rdblue rdblue force-pushed the SPARK-23418-revert-adcb25a0624 branch from bd06193 to a963a5d February 20, 2018 17:09
@rdblue
Contributor Author

rdblue commented Feb 20, 2018

@cloud-fan, this implements SPARK-23418, which rejects user-supplied schemas when ReadSupportWithSchema is not available.

@SparkQA

SparkQA commented Feb 20, 2018

Test build #87559 has finished for PR 20603 at commit a963a5d.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master!

@asfgit asfgit closed this in c8c4441 Feb 21, 2018
@rdblue
Contributor Author

rdblue commented Feb 21, 2018

Thanks for reviewing, @cloud-fan!

otterc pushed a commit to linkedin/spark that referenced this pull request Mar 22, 2023
…, but not supported.

Author: Ryan Blue <blue@apache.org>

Closes apache#20603 from rdblue/SPARK-23418-revert-adcb25a0624.

Ref: LIHADOOP-48531
3 participants