Error with cvAcc for ENLR #67

Closed
sylvchev opened this issue Mar 13, 2020 · 1 comment

When using cvAcc with ENLR, lambda_min_ratio can be wrongly estimated, which results in an error from GLMNet (TaskFailedException: cannot specify both lambda and lambda_min_ratio). This can be avoided by passing the correct value of lambda_min_ratio explicitly.

The error is not raised when fitting ENLR directly; it is raised only with cvAcc.

Minimal example:

using PosDefManifoldML

PTr, PTe, yTr, yTe=gen2ClassData(20, 190, 38, 200, 40, 0.1)

cvAcc(ENLR(Fisher), PTr, yTr) # raises TaskFailedException
cvAcc(ENLR(Fisher), PTr, yTr; lambda_min_ratio=1e-4) # works
fit(ENLR(Fisher), PTr, yTr) # also works

My guess is that _getDim (in tools.jl) does not return the same dimension as that of the ℍVector once it is projected onto the tangent space (done in _getFeat_fit), which triggers this error message in GLMNet (https://github.com/JuliaStats/GLMNet.jl/blob/master/src/GLMNet.jl#L225)

Marco-Congedo added a commit that referenced this issue Mar 13, 2020
fixed issue "Error with cvAcc for ENLR" #67
@JuliaRegistrator register

Marco-Congedo commented Mar 14, 2020

Hi,

thanks for spotting it. I fixed the bug. Your example exposed it because the number of observations is close to the number of variables in the tangent space.
What was going on is that GLMNet, for the binomial distribution (but not for the others), actually uses two numbers to code each label, hence length() returns twice the actual number of labels. In my fit function, instead, the labels are coded with a single number. Therefore, in the declaration of my fit function, the argument

lambda_min_ratio :: Real = (length(yTr) < _getDim(𝐏Tr, vecRange) ? 1e-2 : 1e-4),

actually counted half of the labels counted by the GLMNet function I call to fit the model. This caused the GLMNet error you reported.
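
The mismatch described above can be sketched in isolation. This is only an illustration under stated assumptions: the helper names are hypothetical and exist in neither PosDefManifoldML nor GLMNet, the 210 features assume the full tangent-space vectorization of 20×20 matrices, and the fold size of 205 is an assumed value for one cross-validation training fold, not a figure taken from cvAcc.

```julia
# Hypothetical stand-ins for illustration only; none of these names
# exist in PosDefManifoldML or GLMNet.

# The rule quoted above, with `nlabels` playing the role of length(y).
ratio_rule(nlabels::Integer, nvars::Integer) = nlabels < nvars ? 1e-2 : 1e-4

# Default as computed by the wrapper: one number per label.
wrapper_default(nobs, nvars) = ratio_rule(nobs, nvars)

# Default as computed on the GLMNet side for the binomial case:
# two numbers per label, so the length is doubled.
glmnet_default(nobs, nvars) = ratio_rule(2 * nobs, nvars)

# The two defaults disagree exactly when nobs < nvars <= 2 * nobs.
mismatch(nobs, nvars) = wrapper_default(nobs, nvars) != glmnet_default(nobs, nvars)

# n(n+1)/2 tangent-space features for n = 20 (assumes the full vecRange).
nvars = 20 * (20 + 1) ÷ 2

println(mismatch(228, nvars))  # full training set (190 + 38 observations)
println(mismatch(205, nvars))  # assumed size of one CV training fold
```

With the full training set both rules give 1e-4, which would explain why fit works; once a fold drops the number of observations below the number of features, the wrapper switches to 1e-2 while the doubled count still gives 1e-4.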

I fixed it by simply using the argument

lambda_min_ratio :: Real = (length(yTr)*2 < _getDim(𝐏Tr, vecRange) ? 1e-2 : 1e-4),

which makes the declaration compatible with the GLMNet code. I wonder, however, whether its authors actually intended the lambda_min_ratio default to count the labels only once, which seems more reasonable. If you think so, I can modify the code to follow that rule.

I have created v0.3.7 with the fix.
