use specialized axpy in RowMatrix for SVD #1378

vrilleup · 2014-07-11T23:19:33Z

After running more tests on large matrices, I found that the BV axpy (breeze/linalg/Vector.scala, axpy) is slower than the BSV axpy (breeze/linalg/operators/SparseVectorOps.scala, sv_dv_axpy): 8 s vs. 2 s per multiplication. The BV axpy operates on an iterator, while the BSV axpy operates directly on the underlying array. I think the overhead comes from creating the iterator (with a zip) and advancing the pointers.
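To make the iterator-versus-array distinction concrete, here is a minimal sketch in plain Scala. It does not use Breeze and is not the code from either Breeze file; the `AxpySketch` object, the `Sparse` case class, and both method names are hypothetical stand-ins chosen for illustration. Both methods compute y += a * x for a sparse x and a dense y; they differ only in how the active entries are traversed.

```scala
// Sketch only: plain-Scala stand-ins, not Breeze's actual classes or code.
object AxpySketch {
  /** Hypothetical minimal sparse vector: active indices plus their values. */
  final case class Sparse(indices: Array[Int], values: Array[Double], size: Int)

  /** Generic style: y += a * x by walking a zipped iterator of (index, value) pairs. */
  def axpyIterator(a: Double, x: Sparse, y: Array[Double]): Unit = {
    require(x.size == y.length, "dimension mismatch")
    // Every element goes through Iterator.next() and a freshly allocated tuple.
    x.indices.iterator.zip(x.values.iterator).foreach { case (i, v) =>
      y(i) += a * v
    }
  }

  /** Specialized style: y += a * x with a while loop over the backing arrays. */
  def axpyArrays(a: Double, x: Sparse, y: Array[Double]): Unit = {
    require(x.size == y.length, "dimension mismatch")
    val idx = x.indices
    val vals = x.values
    var k = 0
    while (k < idx.length) {
      // Direct primitive array reads and writes, no intermediate objects.
      y(idx(k)) += a * vals(k)
      k += 1
    }
  }
}
```

Both methods produce the same result; any timing difference between them depends on the JVM, the matrix size, and the sparsity pattern, so the 8 s vs. 2 s figure above should not be expected to reproduce with this toy version.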

copy ARPACK dsaupd/dseupd code from latest breeze; change RowMatrix to use sparse SVD; change tests for sparse SVD

change the computation mode to local-svd, local-eigs, and dist-eigs; update tests and docs

Some updates to SVD impl

AmplabJenkins · 2014-07-11T23:21:19Z

Can one of the admins verify this patch?

vrilleup · 2014-07-11T23:25:29Z

Hmm, I did sync with the upstream branch before committing the last change, but it seems the whole commit history is still there...

mengxr · 2014-07-12T00:12:31Z

Jenkins, add to whitelist.

mengxr · 2014-07-12T00:12:41Z

Jenkins, test this please.

SparkQA · 2014-07-12T00:16:32Z

QA tests have started for PR 1378. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16580/consoleFull

SparkQA · 2014-07-12T01:53:30Z

QA results for PR 1378:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16580/consoleFull

mengxr · 2014-07-12T06:26:16Z

@vrilleup We squash commits before merging a PR. The commit history shows up because you used your master branch for this PR, but apache/master doesn't have those commits.

The change looks good to me. Thanks for testing the performance!

mengxr · 2014-07-12T06:27:09Z

Merged.

vrilleup · 2014-07-12T08:05:51Z

@mengxr thank you for merging the change!

After running more tests on large matrices, I found that the BV axpy (breeze/linalg/Vector.scala, axpy) is slower than the BSV axpy (breeze/linalg/operators/SparseVectorOps.scala, sv_dv_axpy): 8 s vs. 2 s per multiplication. The BV axpy operates on an iterator, while the BSV axpy operates directly on the underlying array. I think the overhead comes from creating the iterator (with a zip) and advancing the pointers.

Author: Li Pu <lpu@twitter.com>
Author: Xiangrui Meng <meng@databricks.com>
Author: Li Pu <li.pu@outlook.com>

Closes apache#1378 from vrilleup/master and squashes the following commits:

6fb01a3 [Li Pu] use specialized axpy in RowMatrix
5255f2a [Li Pu] Merge remote-tracking branch 'upstream/master'
7312ec1 [Li Pu] very minor comment fix
4c618e9 [Li Pu] Merge pull request apache#1 from mengxr/vrilleup-master
a461082 [Xiangrui Meng] make superscript show up correctly in doc
861ec48 [Xiangrui Meng] simplify axpy
62969fa [Xiangrui Meng] use BDV directly in symmetricEigs; change the computation mode to local-svd, local-eigs, and dist-eigs; update tests and docs
c273771 [Li Pu] automatically determine SVD compute mode and parameters
7148426 [Li Pu] improve RowMatrix multiply
5543cce [Li Pu] improve svd api
819824b [Li Pu] add flag for dense svd or sparse svd
eb15100 [Li Pu] fix binary compatibility
4c7aec3 [Li Pu] improve comments
e7850ed [Li Pu] use aggregate and axpy
827411b [Li Pu] fix EOF new line
9c80515 [Li Pu] use non-sparse implementation when k = n
fe983b0 [Li Pu] improve scala style
96d2ecb [Li Pu] improve eigenvalue sorting
e1db950 [Li Pu] SPARK-1782: svd for sparse matrix using ARPACK

Li Pu and others added 19 commits June 3, 2014 18:05

SPARK-1782: svd for sparse matrix using ARPACK

e1db950

copy ARPACK dsaupd/dseupd code from latest breeze; change RowMatrix to use sparse SVD; change tests for sparse SVD

improve eigenvalue sorting

96d2ecb

improve scala style

fe983b0

use non-sparse implementation when k = n

9c80515

fix EOF new line

827411b

use aggregate and axpy

e7850ed

improve comments

4c7aec3

fix binary compatibility

eb15100

add flag for dense svd or sparse svd

819824b

improve svd api

5543cce

improve RowMatrix multiply

7148426

automatically determine SVD compute mode and parameters

c273771

use BDV directly in symmetricEigs

62969fa

change the computation mode to local-svd, local-eigs, and dist-eigs; update tests and docs

simplify axpy

861ec48

make superscript show up correctly in doc

a461082

Merge pull request #1 from mengxr/vrilleup-master

4c618e9

Some updates to SVD impl

very minor comment fix

7312ec1

Merge remote-tracking branch 'upstream/master'

5255f2a

use specialized axpy in RowMatrix

6fb01a3

asfgit closed this in d38887b Jul 12, 2014
