Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-30226][SQL] Remove withXXX functions in WriteBuilder #26678

Closed
wants to merge 15 commits into from

Conversation

edrevo
Copy link
Contributor

@edrevo edrevo commented Nov 26, 2019

What changes were proposed in this pull request?

Adding a LogicalWriteInfo interface as suggested by @cloud-fan in #25990 (comment)

Why are the changes needed?

It provides compile-time guarantees where we previously had none, which will make it harder to introduce bugs in the future.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Compiles and passes tests

@edrevo
Copy link
Contributor Author

edrevo commented Nov 26, 2019

Do I need to open a new JIRA for this @cloud-fan ?

@cloud-fan
Copy link
Contributor

ok to test

@cloud-fan
Copy link
Contributor

@edrevo yes, please create a new JIRA

@SparkQA
Copy link

SparkQA commented Nov 26, 2019

Test build #114460 has finished for PR 26678 at commit 0a83e8f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

@edrevo any updates?

@edrevo
Copy link
Contributor Author

edrevo commented Dec 10, 2019

Sorry for the delay @cloud-fan , I'll pick this up again this week

@edrevo edrevo changed the title Add logical write info [SPARK-30226][SQL] Remove withXXX functions in WriteBuilder Dec 11, 2019
@SparkQA
Copy link

SparkQA commented Dec 11, 2019

Test build #115199 has finished for PR 26678 at commit 2d1f853.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@edrevo
Copy link
Contributor Author

edrevo commented Dec 12, 2019

@cloud-fan , I have addressed your PR feedback.

Copy link
Contributor

@cloud-fan cloud-fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except a few minor comments

@SparkQA
Copy link

SparkQA commented Dec 12, 2019

Test build #115218 has finished for PR 26678 at commit 75b7ec9.

  • This patch fails to generate documentation.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

ping @edrevo

@SparkQA
Copy link

SparkQA commented Dec 18, 2019

Test build #115519 has finished for PR 26678 at commit b9f3e4f.

  • This patch fails Java style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Dec 18, 2019

Test build #115520 has finished for PR 26678 at commit d811d71.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@edrevo
Copy link
Contributor Author

edrevo commented Dec 19, 2019

@cloud-fan, latest PR feedback addressed

@cloud-fan
Copy link
Contributor

LGTM, also cc @rdblue @brkyvz

@rdblue
Copy link
Contributor

rdblue commented Dec 19, 2019

I don't see a compelling reason to change the API again this close to a release. Why does this need to happen? What is the benefit? The only one I see is minor and in my opinion not worth the extra churn.

@cloud-fan
Copy link
Contributor

cloud-fan commented Dec 20, 2019

I've posted the reason a long time ago: #26591 (comment)

I think it's not minor and is an important change. It can avoid us making mistakes like adding a withXXX method but forget to call it in streaming execution. We already made this mistake before.

We haven't cut branch yet so it's not too late.

@rdblue
Copy link
Contributor

rdblue commented Dec 20, 2019

I think there are other ways to ensure you're using a builder correctly. For one thing, adding validations to the internal implementations makes sense.

In general, we need to stop making breaking changes to the API simply because it is marginally better.

@cloud-fan
Copy link
Contributor

I think an argument here is if it's marginally better or much better. I think it's much better.

Another question is: when should we stop making breaking changes to V2? We have made so many API changes for DS v2 in Spark 3.0 and I'm confused why we can't make API changes now? If there is a point that we should stop making breaking changes, it should be the branch cut. Before that, I think it's worth to make API changes even if it's slightly better. We don't have chances to fix these small API problems anymore after branch cut.

cc @srowen @dongjoon-hyun @dbtsai

@cloud-fan
Copy link
Contributor

There is a proposal to do branch cut on Jan 31, so we still have more than a month to add new features. Since we all agree that this is an improvement(minor or significant), shall we get this in?

@cloud-fan
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Jan 6, 2020

Test build #116135 has finished for PR 26678 at commit d811d71.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Jan 6, 2020

Test build #116143 has finished for PR 26678 at commit d811d71.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

It has been over 2 weeks and no more objections so far. Since this is an improvement (type-safe is better than manual check) and DS v2 API change is not a concern before branch cut, I'm merging it to master. Please keep leaving comments if you have different opinions. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants