[SPARK-33518][ML][FOLLOWUP] MatrixFactorizationModel use GEMV #31279

Closed
wants to merge 1 commit into from

zhengruifeng
Contributor

What changes were proposed in this pull request?

1. Update the related docs.
2. Make MatrixFactorizationModel use GEMV.
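
To illustrate the second item, here is a minimal sketch of the idea, not the code from this PR. Instead of computing one dot product per (source, destination) pair, the destination factors of a block are packed into one flat array and scored against a source factor in a single matrix-vector product (GEMV). The names `GemvSketch` and `gemvT` and the column-major packing are assumptions made for illustration; a real implementation would presumably call an optimized BLAS `dgemv` routine rather than this hand-written loop.

```scala
object GemvSketch {
  /**
   * Naive y = A^T * x, where A is a rank x n matrix stored column-major in a
   * flat array, so each destination factor occupies `rank` contiguous entries.
   * Hypothetical helper for illustration only.
   */
  def gemvT(rank: Int, n: Int, a: Array[Double], x: Array[Double], y: Array[Double]): Unit = {
    require(a.length >= rank * n && x.length >= rank && y.length >= n)
    var j = 0
    while (j < n) {
      val offset = j * rank
      var sum = 0.0
      var k = 0
      while (k < rank) {
        sum += a(offset + k) * x(k)
        k += 1
      }
      y(j) = sum // score of the source factor against destination factor j
      j += 1
    }
  }

  def main(args: Array[String]): Unit = {
    val rank = 2
    // Three destination factors of rank 2, packed back to back.
    val dstFactors = Array(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
    val srcFactor = Array(0.5, 2.0)
    val scores = Array.ofDim[Double](3)
    gemvT(rank, 3, dstFactors, srcFactor, scores)
    println(scores.mkString(", ")) // prints 4.5, 9.5, 14.5
  }
}
```

The output array can be allocated once and reused across blocks, which is the buffer-reuse pattern discussed in the review thread further down.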

Why are the changes needed?

See the performance gain in #30468.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing test suites.

@SparkQA

SparkQA commented Jan 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38911/

@SparkQA

SparkQA commented Jan 21, 2021

Test build #134324 has finished for PR 31279 at commit d94894f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 21, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/38911/

@AmplabJenkins

Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/134324/

@AmplabJenkins

Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/38911/

@zhengruifeng
Contributor Author

ping @srowen

val n = dstIds.length
if (scores == null || scores.length < n) {
  scores = Array.ofDim[Double](n)
  idxOrd = new GuavaOrdering[Int] {
Member

I suppose this can be defined outside the flatMap, but it won't matter much.

Contributor Author

This looks like the one in ml.ALS; however, in ml.ALS vectors are represented as Array[Float], while here they are Array[Double].

Member

OK, I guess the same goes for the other. idxOrd doesn't change, right? No big deal.
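
For readers following the thread, the point about defining the ordering once can be sketched as below. This is an illustration under stated assumptions, not the PR's code: `TopKSketch` and `topKIndices` are hypothetical names, a plain Scala `Ordering[Int]` with a full sort stands in for the Guava ordering in the diff, and each input block is simply copied into the buffer where the real code would write scores. The ordering closes over the `scores` variable, so it keeps working after the buffer is reallocated.

```scala
object TopKSketch {
  /** For each block of scores, return the indices of the k largest entries. */
  def topKIndices(blocks: Seq[Array[Double]], k: Int): Seq[Seq[Int]] = {
    // Reusable buffer; grown only when a block is larger than any seen so far.
    var scores: Array[Double] = null
    // Defined once, outside the per-block loop. It compares two indices by
    // the values in whatever array `scores` currently refers to.
    val idxOrd: Ordering[Int] = new Ordering[Int] {
      override def compare(i: Int, j: Int): Int =
        java.lang.Double.compare(scores(i), scores(j))
    }
    blocks.map { block =>
      val n = block.length
      if (scores == null || scores.length < n) {
        scores = Array.ofDim[Double](n)
      }
      System.arraycopy(block, 0, scores, 0, n)
      // Full descending sort for simplicity; a partial top-k selection would
      // avoid sorting all n indices.
      (0 until n).sorted(idxOrd.reverse).take(k)
    }
  }
}
```

For example, `TopKSketch.topKIndices(Seq(Array(0.1, 0.9, 0.5), Array(3.0, 1.0, 2.0, 4.0)), 2)` yields `Seq(1, 2)` for the first block and `Seq(3, 0)` for the second, even though the buffer is reallocated between them.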

Member

@srowen srowen left a comment

Looks OK to merge if tests pass.

@zhengruifeng
Contributor Author

Merged to master, thanks @srowen.

@zhengruifeng zhengruifeng deleted the als_follow_up branch January 27, 2021 02:07
skestle pushed a commit to skestle/spark that referenced this pull request Feb 3, 2021
### What changes were proposed in this pull request?
1, update related doc;
2, MatrixFactorizationModel use GEMV;

### Why are the changes needed?
see performance gain in apache#30468

### Does this PR introduce _any_ user-facing change?
NO

### How was this patch tested?
existing testsuites

Closes apache#31279 from zhengruifeng/als_follow_up.

Authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Signed-off-by: Ruifeng Zheng <ruifengz@foxmail.com>
4 participants