expose improvementTol inside FirstOrderMinimizer #373

coderxiang · 2015-03-02T22:00:15Z

When I run the LBFGS implementation of Spark on a moderate size proprietary data set (~ 5M samples), I see the resultant vector diverges from some standard solver (e.g., liblinear, sci-learn) on around half of the coefficients. The computation usually stops around 20 iterations although I set the iteration number to be 100 with a very strict tolerance variable (1e-15). The formulation is strongly convex so the solution should be unique. Configing the improvementTol variable inside FirstOrderMinimizer.scala can solve this issue.

Two ways of doing this:

Exposing the improvementTol variable to LBFGS, as is done in the PR
Change the default value of improvementTol in FirstOrderMinimizer.scala

I chose the first one here.

dlwh · 2015-03-02T22:03:18Z

thanks. Yeah, I need to figure out the convergence checks.

expose improvementTol inside FirstOrderMinimizer

dbtsai · 2015-03-03T02:17:19Z

@dlwh, are you going to publish this so I can upgrade Spark 1.3 dependency before it's released? thanks.

dlwh · 2015-03-03T02:32:08Z

sure. what's the timeline?

On Mon, Mar 2, 2015 at 6:17 PM, DB Tsai notifications@github.com wrote:

@dlwh https://github.com/dlwh, are you going to publish this so I can
upgrade Spark 1.3 dependency before it's released? thanks.

—
Reply to this email directly or view it on GitHub
#373 (comment).

dbtsai · 2015-03-03T02:49:06Z

Is OWLQN affected by this issue? Spark 1.3 is RC2, so if we want to change, we need to be fast. Thanks.

dlwh · 2015-03-03T03:17:09Z

ok, Debasish has been working on his new optimization algorithm. I imagine
that that's post Spark 1.3 in any case?

On Mon, Mar 2, 2015 at 6:49 PM, DB Tsai notifications@github.com wrote:

Is OWLQN affected by this issue? Spark 1.3 is RC2, so if we want to
change, we need to be fast. Thanks.

—
Reply to this email directly or view it on GitHub
#373 (comment).

dlwh · 2015-03-03T03:18:04Z

owlqn's convergence conditions are also fixed by this.

On Mon, Mar 2, 2015 at 7:17 PM, David Hall david.lw.hall@gmail.com wrote:

ok, Debasish has been working on his new optimization algorithm. I imagine
that that's post Spark 1.3 in any case?

On Mon, Mar 2, 2015 at 6:49 PM, DB Tsai notifications@github.com wrote:

Is OWLQN affected by this issue? Spark 1.3 is RC2, so if we want to
change, we need to be fast. Thanks.

—
Reply to this email directly or view it on GitHub
#373 (comment).

dbtsai · 2015-03-03T03:28:28Z

Sounds great. Then let's publish the breeze with this fix, and I'll open another PR to Spark.

dlwh · 2015-03-03T03:55:23Z

0.11 is released. Will take a bit to propagate to maven central.

…ce bug LBFGS and OWLQN in Breeze 0.10 has convergence check bug. This is fixed in 0.11, see the description in Breeze project for detail: scalanlp/breeze#373 (comment) Author: Xiangrui Meng <meng@databricks.com> Author: DB Tsai <dbtsai@alpinenow.com> Author: DB Tsai <dbtsai@dbtsai.com> Closes #4879 from dbtsai/breeze and squashes the following commits: d848f65 [DB Tsai] Merge pull request #1 from mengxr/AlpineNow-breeze c2ca6ac [Xiangrui Meng] upgrade to breeze-0.11.1 35c2f26 [Xiangrui Meng] fix LRSuite 397a208 [DB Tsai] upgrade breeze (cherry picked from commit 76e20a0) Signed-off-by: Xiangrui Meng <meng@databricks.com>

…ce bug LBFGS and OWLQN in Breeze 0.10 has convergence check bug. This is fixed in 0.11, see the description in Breeze project for detail: scalanlp/breeze#373 (comment) Author: Xiangrui Meng <meng@databricks.com> Author: DB Tsai <dbtsai@alpinenow.com> Author: DB Tsai <dbtsai@dbtsai.com> Closes #4879 from dbtsai/breeze and squashes the following commits: d848f65 [DB Tsai] Merge pull request #1 from mengxr/AlpineNow-breeze c2ca6ac [Xiangrui Meng] upgrade to breeze-0.11.1 35c2f26 [Xiangrui Meng] fix LRSuite 397a208 [DB Tsai] upgrade breeze

expose improvementTol inside FirstOrderMinimizer

6956399

dlwh added a commit that referenced this pull request Mar 2, 2015

Merge pull request #373 from coderxiang/lbfgs-tol

7d42e37

expose improvementTol inside FirstOrderMinimizer

dlwh merged commit 7d42e37 into scalanlp:master Mar 2, 2015

coderxiang deleted the lbfgs-tol branch March 2, 2015 22:11

dbtsai mentioned this pull request Mar 3, 2015

[SPARK-6141][MLlib] Upgrade Breeze from 0.10 to 0.11 to fix convergence bug apache/spark#4879

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

expose improvementTol inside FirstOrderMinimizer #373

expose improvementTol inside FirstOrderMinimizer #373

coderxiang commented Mar 2, 2015

dlwh commented Mar 2, 2015

dbtsai commented Mar 3, 2015

dlwh commented Mar 3, 2015

dbtsai commented Mar 3, 2015

dlwh commented Mar 3, 2015

dlwh commented Mar 3, 2015

dbtsai commented Mar 3, 2015

dlwh commented Mar 3, 2015

expose improvementTol inside FirstOrderMinimizer #373

expose improvementTol inside FirstOrderMinimizer #373

Conversation

coderxiang commented Mar 2, 2015

dlwh commented Mar 2, 2015

dbtsai commented Mar 3, 2015

dlwh commented Mar 3, 2015

dbtsai commented Mar 3, 2015

dlwh commented Mar 3, 2015

dlwh commented Mar 3, 2015

dbtsai commented Mar 3, 2015

dlwh commented Mar 3, 2015