Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "[SPARK-30642][SPARK-30659][SPARK-30660][SPARK-30662]" #27487

Closed
wants to merge 4 commits into from

Conversation

zhengruifeng
Copy link
Contributor

@zhengruifeng zhengruifeng commented Feb 7, 2020

What changes were proposed in this pull request?

Revert
#27360
#27396
#27374
#27389

Why are the changes needed?

BLAS need more performace tests, specially on sparse datasets.
Perfermance test of LogisticRegression (#27374) on sparse dataset shows that blockify vectors to matrices and use BLAS will cause performance regression.
LinearSVC and LinearRegression were also updated in the same way as LogisticRegression, so we need to revert them to make sure no regression.

Does this PR introduce any user-facing change?

remove newly added param blockSize

How was this patch tested?

reverted testsuites

@dongjoon-hyun
Copy link
Member

Wow. It's a massive revert. If you have a discussion on the other PRs, could you add some pointers into the description? What is the performance regression?

BLAS need more performace tests, specially on sparse datasets

@zhengruifeng
Copy link
Contributor Author

@dongjoon-hyun There is some test and discussion in #27374, I also update the description.

@dongjoon-hyun
Copy link
Member

Thank you, @zhengruifeng .

@dongjoon-hyun
Copy link
Member

Also, cc @huaxingao since she is also the author of the last PR.

@SparkQA
Copy link

SparkQA commented Feb 7, 2020

Test build #118014 has finished for PR 27487 at commit 3407dc1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 7, 2020

Test build #118016 has finished for PR 27487 at commit 255ca22.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class _ALSModelParams(HasPredictionCol):

@SparkQA
Copy link

SparkQA commented Feb 7, 2020

Test build #118020 has finished for PR 27487 at commit 0dafab5.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Feb 7, 2020

Test build #118022 has finished for PR 27487 at commit 0dafab5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 7, 2020

Test build #118029 has finished for PR 27487 at commit 7b09306.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng
Copy link
Contributor Author

cc @mengxr @WeichenXu123 @srowen

Copy link
Contributor

@WeichenXu123 WeichenXu123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@zhengruifeng
Copy link
Contributor Author

Merged to master

@zhengruifeng zhengruifeng deleted the revert_blockify_ii branch February 8, 2020 00:47
@srowen
Copy link
Member

srowen commented Feb 8, 2020

Thank you. I should have realized that sparse vectors could be an issue in the first place. That's my bad.

@zhengruifeng
Copy link
Contributor Author

@srowen Never mind, I should have maken more perfermance tests. Maybe we should do such refactors only when we have enough time for testing. When working on these PRs, it maybe too close to the code freeze.

@viirya
Copy link
Member

viirya commented Feb 25, 2020

Is this to revert the PRs on branch-3.0 too? I cannot find it in branch-3.0 commits.

zhengruifeng added a commit that referenced this pull request Feb 25, 2020
### What changes were proposed in this pull request?
Revert
#27360
#27396
#27374
#27389

### Why are the changes needed?
BLAS need more performace tests, specially on sparse datasets.
Perfermance test of LogisticRegression (#27374) on sparse dataset shows that blockify vectors to matrices and use BLAS will cause performance regression.
LinearSVC and LinearRegression were also updated in the same way as LogisticRegression, so we need to revert them to make sure no regression.

### Does this PR introduce any user-facing change?
remove newly added param blockSize

### How was this patch tested?
reverted testsuites

Closes #27487 from zhengruifeng/revert_blockify_ii.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
@zhengruifeng
Copy link
Contributor Author

@viirya Yes, this is to revert on branch-3.0 too. I am go to revert it on 3.0. Thanks for pointing it out!

sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
### What changes were proposed in this pull request?
Revert
apache#27360
apache#27396
apache#27374
apache#27389

### Why are the changes needed?
BLAS need more performace tests, specially on sparse datasets.
Perfermance test of LogisticRegression (apache#27374) on sparse dataset shows that blockify vectors to matrices and use BLAS will cause performance regression.
LinearSVC and LinearRegression were also updated in the same way as LogisticRegression, so we need to revert them to make sure no regression.

### Does this PR introduce any user-facing change?
remove newly added param blockSize

### How was this patch tested?
reverted testsuites

Closes apache#27487 from zhengruifeng/revert_blockify_ii.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants