[SPARK-23085][ML] API parity for mllib.linalg.Vectors.sparse #20275

Closed
wants to merge 4 commits into from

zhengruifeng
Contributor

What changes were proposed in this pull request?

Make ML.Vectors#sparse(size: Int, elements: Seq[(Int, Double)]) support zero-length vectors.
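
To make the intended behavior concrete, here is a minimal sketch of a size check that accepts zero-length vectors. It is a simplified stand-in, not Spark's implementation: `SparseSketch`, `SparseVec`, and the error messages are hypothetical names chosen for illustration, and only the `sparse(size, elements)` signature mirrors the method named above.

```scala
// Simplified stand-in for illustration only; this is NOT Spark's code.
object SparseSketch {
  // Hypothetical minimal sparse vector: `size` is the logical length,
  // `indices` and `values` hold the explicitly stored entries.
  final case class SparseVec(size: Int, indices: Array[Int], values: Array[Double]) {
    // Expand to a dense array of length `size`.
    def toArray: Array[Double] = {
      val out = new Array[Double](size)
      indices.zip(values).foreach { case (i, v) => out(i) = v }
      out
    }
  }

  // Build a sparse vector from (index, value) pairs.
  // The size check uses `>= 0`, so a zero-length vector is accepted.
  def sparse(size: Int, elements: Seq[(Int, Double)]): SparseVec = {
    require(size >= 0, s"The size of a sparse vector must be no less than 0, got $size.")
    val (indices, values) = elements.sortBy(_._1).unzip
    var prev = -1
    indices.foreach { i =>
      require(i >= 0, s"Found negative index: $i.")
      require(prev < i, s"Found duplicate index: $i.")
      prev = i
    }
    // With no elements, prev stays -1, which is below any non-negative size.
    require(prev < size, s"Index $prev is out of range for a vector of size $size.")
    SparseVec(size, indices.toArray, values.toArray)
  }
}
```

Under these assumptions, `SparseSketch.sparse(0, Seq.empty)` returns an empty vector, while a negative size or an index at or beyond `size` fails with an `IllegalArgumentException` raised by `require`.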

How was this patch tested?

existing tests

@SparkQA

SparkQA commented Jan 16, 2018

Test build #86158 has finished for PR 20275 at commit 1a3cd3a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mgaido91
Contributor

As far as I know, mllib is no longer maintained, so I am not sure this fix is useful.

@srowen
Member

srowen commented Jan 16, 2018

That's true, although there's no conceptual reason not to support a 0-size vector, and one of the existing checks already tested for >= 0. This is arguably just fixing internal consistency. How about a unit test?

@SparkQA

SparkQA commented Jan 17, 2018

Test build #86225 has finished for PR 20275 at commit ac067e7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jan 17, 2018

Unless someone like @jkbradley or @MLnick objects, I think this doesn't hurt and actually makes things a little more consistent.

Contributor

@MLnick MLnick left a comment

Seems fine to add for consistency, even though mllib is no longer actively developed.

But we should round out the tests - I left a couple comments on that.

@@ -113,6 +113,13 @@ class VectorsSuite extends SparkFunSuite with Logging {
    assert(vec.toArray === arr)
  }

  test("zero-length sparse vector") {
Contributor

We may as well add the same test to ml.linalg

@@ -113,6 +113,13 @@ class VectorsSuite extends SparkFunSuite with Logging {
    assert(vec.toArray === arr)
  }

  test("zero-length sparse vector") {
Contributor

@MLnick MLnick Jan 18, 2018

While we're doing this, we may as well also add a test that intercepts the exception for a negative size, for both ml and mllib. The other sparse vector construction tests check the other constraints, but a size test is missing.
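
A rough sketch of what such a negative-size check could look like, written without ScalaTest so it runs on its own. Everything here is assumed for illustration: `NegativeSizeSketch`, `checkedSize`, and `throwsIllegalArgument` are hypothetical names, and `throwsIllegalArgument` only approximates what an `intercept[IllegalArgumentException]` block would do in the real suites.

```scala
// Illustration only; the names below are hypothetical and not part of Spark.
object NegativeSizeSketch {
  // Stand-in for the size validation of a sparse-vector constructor:
  // zero is allowed, negative sizes are rejected by `require`.
  def checkedSize(size: Int): Int = {
    require(size >= 0, s"The size of a sparse vector must be no less than 0, got $size.")
    size
  }

  // Minimal substitute for an intercept-style assertion: returns true only
  // if evaluating `body` throws an IllegalArgumentException.
  def throwsIllegalArgument(body: => Any): Boolean =
    try {
      body
      false
    } catch {
      case _: IllegalArgumentException => true
    }
}
```

With this stand-in, `NegativeSizeSketch.throwsIllegalArgument(NegativeSizeSketch.checkedSize(-1))` is `true`, and the same call with `0` is `false`, which is the pair of cases the two suites would need to cover.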

@SparkQA

SparkQA commented Jan 19, 2018

Test build #86373 has finished for PR 20275 at commit f3a4329.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

@MLnick MLnick left a comment

LGTM

@srowen
Member

srowen commented Jan 19, 2018

Merged to master

@asfgit asfgit closed this in 606a748 Jan 19, 2018
@zhengruifeng zhengruifeng deleted the SparseVector_size branch January 22, 2018 01:46