[SPARK-20587][ML] Improve performance of ML ALS recommendForAll #17845

MLnick · 2017-05-03T19:18:56Z

This PR is a DataFrame version of #17742 for SPARK-11968, for improving the performance of recommendAll methods.

How was this patch tested?

Existing unit tests.

MLnick · 2017-05-03T19:19:16Z

cc @mpjlu

Also @srowen @sethah @jkbradley

MLnick · 2017-05-03T20:02:29Z

Some quick perf numbers:

Using ml-latest dataset (~24 million ratings, ~260k users, ~39k movies); 4x workers 48 cores 100GB RAM each.

rank = 10, k = 10

| | master | this PR |
|---|---|---|
| `recommendForAllUsers` | 369s | 16s |
| `recommendForAllItems` | 547s | 15s |

So 23-37x improvement.

SparkQA · 2017-05-03T20:18:00Z

Test build #76424 has finished for PR 17845 at commit baeadd0.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

sethah

first pass on style.

sethah · 2017-05-03T21:28:58Z

mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala

+              score += srcFactor(k) * dstFactor(k)
+              k += 1
+            }
+            pq += { (dstId, score) }


pq += dstId -> score?
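
For readers unfamiliar with the pattern under review, here is a minimal sketch of a bounded top-k queue fed with the `dstId -> score` tuple syntax suggested above. `BoundedTopK` is a hypothetical stand-in built on `scala.collection.mutable.PriorityQueue`; it is not Spark's `BoundedPriorityQueue`, whose internals are not shown in this thread.

```scala
import scala.collection.mutable

object TopKSketch {
  // Hypothetical stand-in (an assumption for illustration, not Spark's class):
  // keeps only the k highest-scoring (id, score) pairs seen so far.
  final class BoundedTopK(k: Int) {
    // Reversed ordering puts the smallest retained score at the head,
    // so it is the element that gets evicted first.
    private val ord: Ordering[(Int, Float)] =
      Ordering.by[(Int, Float), Float](_._2).reverse
    private val heap = new mutable.PriorityQueue[(Int, Float)]()(ord)

    def +=(elem: (Int, Float)): this.type = {
      if (heap.size < k) {
        heap.enqueue(elem)
      } else if (k > 0 && elem._2 > heap.head._2) {
        heap.dequeue()
        heap.enqueue(elem)
      }
      this
    }

    // Retained elements, highest score first.
    def sortedDescending: Seq[(Int, Float)] = heap.toSeq.sortBy(-_._2)
  }

  def example(): Seq[(Int, Float)] = {
    val pq = new BoundedTopK(2)
    val scores = Seq(1 -> 0.5f, 2 -> 0.9f, 3 -> 0.1f, 4 -> 0.7f)
    scores.foreach { case (dstId, score) =>
      pq += dstId -> score // equivalent to pq += ((dstId, score))
    }
    pq.sortedDescending
  }
}
```

Because `+=` has the lowest operator precedence, `pq += dstId -> score` parses as `pq += (dstId -> score)`, so no extra braces or parentheses are needed.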

sethah · 2017-05-03T21:39:02Z

mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala

@@ -389,6 +436,17 @@ class ALSModel private[ml] (
    )
    recs.select($"id" as srcOutputColumn, $"recommendations" cast arrayType)


This is discouraged within Spark: https://github.com/databricks/scala-style-guide#infix-methods

Fair point - may as well fix it while here
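
To make the style point concrete, the sketch below contrasts infix and method-call notation. `Col` is a hypothetical stand-in class defined only for this example; it is not Spark's `Column`, and the alias and type strings are placeholders.

```scala
object InfixStyleSketch {
  // Hypothetical stand-in for a column expression (assumption, not Spark's Column).
  final case class Col(expr: String) {
    def as(alias: String): Col = Col(s"$expr AS $alias")
    def cast(dataType: String): Col = Col(s"CAST($expr AS $dataType)")
  }

  val id: Col = Col("id")
  val recs: Col = Col("recommendations")

  // Infix notation, in the style of the quoted diff line.
  val infix: Seq[Col] = Seq(id as "user", recs cast "array")

  // Method-call notation, which the linked style guide prefers
  // for methods that are not symbolic operators.
  val dotted: Seq[Col] = Seq(id.as("user"), recs.cast("array"))
}
```

Both forms build the same values; the difference is purely one of readability.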

sethah · 2017-05-03T21:39:43Z

mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala

+   */
+  private def blockify(
+    factors: Dataset[(Int, Array[Float])],
+    /* TODO make blockSize a param? */blockSize: Int = 4096): Dataset[Seq[(Int, Array[Float])]] = {


just put the comment in the doc and reference a JIRA.
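
As a rough illustration of what a `blockify` step does, here is a plain-collections analogue that groups `(id, factor)` pairs into blocks of at most `blockSize`. It works on an `Iterator` rather than a `Dataset`, which is an assumption made to keep the sketch free of Spark dependencies; the method body in the PR is not shown in this thread.

```scala
object BlockifySketch {
  type Factor = (Int, Array[Float])

  // Group a stream of (id, factor) pairs into blocks of at most blockSize.
  // The default of 4096 mirrors the signature quoted in the diff above.
  def blockify(
      factors: Iterator[Factor],
      blockSize: Int = 4096): Iterator[Seq[Factor]] = {
    require(blockSize > 0, "blockSize must be positive")
    factors.grouped(blockSize).map(_.toSeq)
  }
}
```

The last block may be smaller than `blockSize` when the number of factors is not an exact multiple of it.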

sethah · 2017-05-03T21:40:26Z

mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala

+   * relatively efficient, the approach implemented here is significantly more efficient.
+   *
+   * This approach groups factors into blocks and computes the top-k elements per block,
+   * using Level 1 BLAS (dot) and an efficient [[BoundedPriorityQueue]]. It then computes the


below we say that blas is not used.

How about "... using dot product instead of gemm and an efficient ..."
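
The blocked approach described in the doc comment can be sketched with plain collections as follows. The quoted comment breaks off before describing the final step, so the global merge of per-block results shown here is an assumption. For brevity the sketch sorts instead of using a bounded priority queue, and all names are hypothetical.

```scala
object BlockTopKSketch {
  type Factor = (Int, Array[Float])

  // Top-num destinations for every source within one (srcBlock, dstBlock) pair.
  def blockTopK(
      src: Seq[Factor],
      dst: Seq[Factor],
      num: Int): Seq[(Int, Int, Float)] =
    src.flatMap { case (srcId, srcFactor) =>
      dst
        .map { case (dstId, dstFactor) =>
          // Dot product of the two factor vectors.
          val score = srcFactor.zip(dstFactor).map { case (x, y) => x * y }.sum
          (srcId, dstId, score)
        }
        .sortBy(-_._3)
        .take(num)
    }

  // Cross every source block with every destination block, then merge the
  // per-block winners into a global top-num per source id (assumed step).
  def recommendForAll(
      srcBlocks: Seq[Seq[Factor]],
      dstBlocks: Seq[Seq[Factor]],
      num: Int): Map[Int, Seq[(Int, Float)]] = {
    val perBlock = for {
      s <- srcBlocks
      d <- dstBlocks
      rec <- blockTopK(s, d, num)
    } yield rec
    perBlock.groupBy(_._1).map { case (srcId, recs) =>
      srcId -> recs.map(r => (r._2, r._3)).sortBy(-_._2).take(num)
    }
  }
}
```

Keeping only `num` candidates per block pair bounds the intermediate data at `num` entries per source per destination block, rather than one entry per source-destination pair.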

sethah · 2017-05-03T21:44:52Z

mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala

+        val pq = new BoundedPriorityQueue[(Int, Float)](num)(Ordering.by(_._2))
+        srcIter.foreach { case (srcId, srcFactor) =>
+          dstIter.foreach { case (dstId, dstFactor) =>
+            /**


don't use doc notation. Maybe we can reduce it to:

/*
 * The below code is equivalent to
 *    `val score = blas.sdot(rank, srcFactor, 1, dstFactor, 1)`
 * The handwritten version is as efficient as, or more efficient than, BLAS calls in this case.
 */

Sounds good
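
For reference, a self-contained version of the handwritten loop discussed here might look like the sketch below. The object and method names are hypothetical; the thread only states that the loop is equivalent to `blas.sdot(rank, srcFactor, 1, dstFactor, 1)`.

```scala
object DotSketch {
  // Handwritten single-precision dot product over the first `rank` entries,
  // in the style of the loop in the diff.
  def dot(rank: Int, x: Array[Float], y: Array[Float]): Float = {
    require(x.length >= rank && y.length >= rank, "factors shorter than rank")
    var score = 0.0f
    var k = 0
    while (k < rank) {
      score += x(k) * y(k)
      k += 1
    }
    score
  }
}
```

A `while` loop over primitive arrays avoids boxing and iterator allocation, which is the usual reason for writing it this way in a hot inner loop.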

MLnick · 2017-05-04T06:36:02Z

Thanks @sethah will update shortly

SparkQA · 2017-05-04T08:20:39Z

Test build #76447 has finished for PR 17845 at commit cf35eea.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

MLnick · 2017-05-08T10:49:39Z

jenkins retest this please

SparkQA · 2017-05-08T14:02:45Z

Test build #76571 has finished for PR 17845 at commit cf35eea.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

jkbradley

LGTM; thanks for doing this! Feel free to merge or address my 1 comment

jkbradley · 2017-05-08T20:48:24Z

mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala

+        val m = srcIter.size
+        val n = math.min(dstIter.size, num)
+        val output = new Array[(Int, Int, Float)](m * n)
+        var j = 0


Nit: You could combine j and i; you really just need 1 counter.

j iterates through src ids while i iterates through dst ids in the queue for each src id. So I don't think they can be combined.

Anyway the iter.next() code is a bit ugly and since it's at most k elements it's not really performance critical, so could just use foreach I think
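
The roles of the two counters can be seen in a small sketch: `j` indexes the source id and `i` indexes the position within that source's top-k results, so the output slot is `j * n + i`. The names and the pre-computed `topK` input are assumptions for illustration; the inner loop uses `foreach` as suggested in the reply.

```scala
object FlattenSketch {
  // topK(j) holds the already-selected (dstId, score) pairs for srcIds(j);
  // each entry is assumed to contain exactly n pairs.
  def flatten(
      srcIds: Array[Int],
      topK: Array[Seq[(Int, Float)]],
      n: Int): Array[(Int, Int, Float)] = {
    require(topK.length == srcIds.length && topK.forall(_.size == n))
    val m = srcIds.length
    val output = new Array[(Int, Int, Float)](m * n)
    var j = 0 // walks over source ids
    while (j < m) {
      var i = 0 // walks over the top-k entries of source j
      topK(j).foreach { case (dstId, score) =>
        output(j * n + i) = (srcIds(j), dstId, score)
        i += 1
      }
      j += 1
    }
    output
  }
}
```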

jkbradley · 2017-05-08T21:05:46Z

One more comment I'll copy from the other PR: I'm not a fan of custom BLAS implementations scattered throughout MLlib. Could you please follow up by putting the dot as a private API in BLAS.scala and adding unit tests?

MLnick · 2017-05-09T08:12:35Z

Merged to master/branch-2.2

Thanks @mpjlu for the original work on the approach!

This PR is a `DataFrame` version of #17742 for [SPARK-11968](https://issues.apache.org/jira/browse/SPARK-11968), for improving the performance of `recommendAll` methods.

## How was this patch tested?

Existing unit tests.

Author: Nick Pentreath <nickp@za.ibm.com>

Closes #17845 from MLnick/ml-als-perf.

(cherry picked from commit 10b00ab)
Signed-off-by: Nick Pentreath <nickp@za.ibm.com>

Small clean ups from #17742 and #17845.

## How was this patch tested?

Existing unit tests.

Author: Nick Pentreath <nickp@za.ibm.com>

Closes #17919 from MLnick/SPARK-20677-als-perf-followup.

(cherry picked from commit 25b4f41)
Signed-off-by: Nick Pentreath <nickp@za.ibm.com>

Nick Pentreath added 3 commits May 3, 2017 11:36

First cut

4fd11b9

Fix increment error

29d6777

Move PQ outside of foreach and update comments

baeadd0

MLnick mentioned this pull request May 3, 2017

[Spark-11968][ML][MLLIB]Optimize MLLIB ALS recommendForAll #17742

Closed

sethah reviewed May 3, 2017

Address review style comments

cf35eea

jkbradley reviewed May 8, 2017

asfgit closed this in 10b00ab May 9, 2017

MLnick mentioned this pull request May 9, 2017

[SPARK-20677][MLLIB][ML] Follow-up to ALS recommend-all performance PRs #17919

Closed

zhengruifeng mentioned this pull request Nov 23, 2020

[SPARK-33518][ML] Improve performance of ML ALS recommendForAll by GEMV #30468

Closed
