add extra exceptional shift to solve rare convergence issues #477

thijssteel · 2021-01-26T12:53:03Z

closes #475

This PR adds an extra exceptional shift choice to the complex implementations of the QZ algorithm.
As is tradition, it has been chosen for no particularly good reason.
Also adds some overflow protection to the original exceptional shift for good measure.

pablosanjose · 2021-01-26T13:55:47Z

Thanks! Note that in #475 the problem never arose when using single-precision complex numbers, only in the double precision case. Can you comment on how robust you expect the fix to be? Does precision suffer? Should we run more extensive tests to see if the extra shift strategy is generally sound?

codecov · 2021-01-26T14:00:36Z

Codecov Report

Merging #477 (6bb0869) into master (2b9f0f5) will decrease coverage by 0.00%.
The diff coverage is 94.73%.

@@            Coverage Diff             @@
##           master     #477      +/-   ##
==========================================
- Coverage   83.23%   83.23%   -0.01%     
==========================================
  Files        1820     1820              
  Lines      170847   170871      +24     
==========================================
+ Hits       142199   142216      +17     
- Misses      28648    28655       +7

Impacted Files	Coverage Δ
SRC/chgeqz.f	`92.80% <94.73%> (-0.05%)`	⬇️
SRC/zhgeqz.f	`90.75% <94.73%> (-0.68%)`	⬇️
SRC/zggev3.f	`92.50% <0.00%> (-1.88%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2b9f0f5...6bb0869. Read the comment docs.

thijssteel · 2021-01-26T14:03:53Z

Precision should not suffer at all. This is only the shift strategy which does not introduce any roundoff.
If anything, faster convergence will increase the precision because less iterations introduce less roundoff.

Even if the problem never arose in single precision, i still think we should keep the implementations as close as possible.

I expect this solution to be fairly robust, but more extensive tests never hurt. You mentioned that you run into #475 a lot, so it would be great to test this PR against some more pencils to be sure.

thijssteel · 2021-01-26T14:13:30Z

Ok, so now the coverage is complaining because the convergence never fails and the exception handling is never run ...

I vote we just ignore it. I can't find any failing pencils.

pablosanjose · 2021-01-26T18:32:03Z

I'd like to find a way to try this PR within my Julia code, to try finding more failures, but I'm not sure it is possible until this trickles down to OpenBLAS and then to Julia... Thing is, the package that produces these matrices is Julia-only.

(I guess one thing I could do is try to collect a bunch of failing examples with Julia and use your script in #475 to see if they all converge using this PR.)

thijssteel · 2021-01-26T21:25:51Z

It's possible to compile julia yourself and link with a preinstalled BLAS/LAPACK library instead of OpenBLAS. See this variable

but this is not an officially supported feature => YMMV.

langou · 2021-01-26T22:55:33Z

Hi all,

Thjis: terrific work, oh my. This is cool.

Here are two webpages with some history about this:
http://www.netlib.org/lapack/Errata/old-vrac/lapack_known_issues.html
http://www.netlib.org/lapack/bug_list.html#_strong_span_class_green_bug0102_span_strong_convergence_problem_with_some_particular_pencils_in_xggev
The webpages are quite outdated.

Julien.

langou · 2021-01-26T23:05:33Z

Ok, so now the coverage is complaining because the convergence never fails and the exception handling is never run ... I vote we just ignore it. I can't find any failing pencils. => I find this hilarious. I agree with ignoring the fact that the coverage is complaining because the exception handling is never run. Indeed. Hilarious though.

pablosanjose · 2021-01-27T07:46:32Z

The USE_SYSTEM_LAPACK didn't work for me after a few tries. I'm now wondering whether the best way for me to test this would be to compile LAPACK into a dynamic library, which I can then call from Julia using ccall (perhaps?). I'm a total noob with this kind of thing, could somebody orient me on how to build lapack into a .dylib?
EDIT: I'm on macos, by the way

thijssteel · 2021-01-27T08:38:12Z

@langou thanks for the interesting history. It makes me doubt my exceptional shift strategy though. I'll have some more fun trying to think up better alternatives.

@pablosanjose i'm afraid that i am but a lowly mathematician. The complexities of dynamic linking are beyond me and it seems LAPACK doesn't support it out of the box. I think your best bet is to clone OpenBLAS and apply my patch in the lapack-netlib folder. Then you can compile to a .dylib. You might have to disable multithreading though or you'll also need pthread.

pablosanjose · 2021-01-27T08:41:07Z

Thanks @thijssteel, that's just what I'm trying to do now!

pablosanjose · 2021-01-30T12:48:03Z

@MigMuc noted in OpenMathLib/OpenBLAS#3091 that in zhgeqz.f, there may be a typo: LSAME in line 329 should read LLSAME. Please check!

martin-frbg · 2021-01-30T15:23:07Z

btw. gfortran also warns about unused variabes "RTDISC" and "T1" and thinks ILZ,ILQ and ILSCHR may be uninitialised in chgeqz.f, if (and only if) you want to use the opportunity for a small cleanup.

weslleyspereira · 2021-02-03T14:08:36Z

@thijssteel, thanks for this bug fix! :)

Although the exceptional shift is chosen for no particularly good reason, could it be better to use

               ESHIFT = ESHIFT + ABI12

instead of

               ESHIFT = ESHIFT + ( ASCALE*H( ILAST,
     $            ILAST ) )/( BSCALE*T( ILAST, ILAST ) )

in SRC/zhgeqz.f ?

thijssteel · 2021-02-03T15:57:47Z

Not a bad idea, do note however that while this PR does add exceptional shifts, the main problem was the calculation of the shifts. For example, none of the tests ever even seem to use the second exceptional shift if you look at the test coverage. I think it's fine as is.

langou · 2021-02-09T17:32:31Z

Let us merge this in. Thanks @thijssteel! You're the best!

…value-convergence-failure add extra exceptional shift to solve rare convergence issues

thijssteel added 2 commits January 26, 2021 13:44

add extra exceptional shift to solve rare convergence issues

d789f2e

some extra overflow protection (better safe than sorry)

b8ab53f

remove unnecessary overflow check

441b21e

pablosanjose mentioned this pull request Jan 28, 2021

Convergence errors in ggev and gges with complex double element types #475

Closed

thijssteel added 2 commits January 29, 2021 09:17

fix error in calculation of shifts in complex single shift qz

4e89f98

fix type error in chgeqz

c0964cf

pablosanjose mentioned this pull request Jan 29, 2021

Make direct = false the default for SingleShot1D solver pablosanjose/Quantica.jl#153

Merged

fix typo in chgeqz

5a893d0

fix compiler warnings in qz code

6bb0869

pablosanjose mentioned this pull request Feb 1, 2021

Upgrade to OpenBLAS 0.3.13 JuliaLang/julia#39216

Merged

weslleyspereira mentioned this pull request Feb 1, 2021

recipe for target 'znep.out' failed error #474

Closed

langou merged commit e56b31e into Reference-LAPACK:master Feb 9, 2021

julielangou added this to the LAPACK 3.9.1 milestone Mar 25, 2021

martin-frbg mentioned this pull request Mar 30, 2021

OpenBLAS 0.3.13 ex-shift patch not working #525

Closed

christoph-conrads pushed a commit to christoph-conrads/lapack that referenced this pull request May 23, 2021

Merge pull request Reference-LAPACK#477 from thijssteel/fix-475-eigen…

52518d7

…value-convergence-failure add extra exceptional shift to solve rare convergence issues

andreasvarga mentioned this pull request Jan 12, 2022

eigvals fails on a pair of complex matrices of order 40 JuliaLang/LinearAlgebra.jl#838

Closed

martin-frbg mentioned this pull request May 7, 2023

LAPACK test fails with Error code from DDRGES3 = 9 on AMD Genoa OpenMathLib/OpenBLAS#4032

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add extra exceptional shift to solve rare convergence issues #477

add extra exceptional shift to solve rare convergence issues #477

thijssteel commented Jan 26, 2021

pablosanjose commented Jan 26, 2021

codecov bot commented Jan 26, 2021 •

edited

Loading

thijssteel commented Jan 26, 2021

thijssteel commented Jan 26, 2021

pablosanjose commented Jan 26, 2021

thijssteel commented Jan 26, 2021

langou commented Jan 26, 2021

langou commented Jan 26, 2021

pablosanjose commented Jan 27, 2021 •

edited

Loading

thijssteel commented Jan 27, 2021

pablosanjose commented Jan 27, 2021

pablosanjose commented Jan 30, 2021 •

edited

Loading

martin-frbg commented Jan 30, 2021

weslleyspereira commented Feb 3, 2021 •

edited

Loading

thijssteel commented Feb 3, 2021

langou commented Feb 9, 2021

add extra exceptional shift to solve rare convergence issues #477

add extra exceptional shift to solve rare convergence issues #477

Conversation

thijssteel commented Jan 26, 2021

pablosanjose commented Jan 26, 2021

codecov bot commented Jan 26, 2021 • edited Loading

Codecov Report

thijssteel commented Jan 26, 2021

thijssteel commented Jan 26, 2021

pablosanjose commented Jan 26, 2021

thijssteel commented Jan 26, 2021

langou commented Jan 26, 2021

langou commented Jan 26, 2021

pablosanjose commented Jan 27, 2021 • edited Loading

thijssteel commented Jan 27, 2021

pablosanjose commented Jan 27, 2021

pablosanjose commented Jan 30, 2021 • edited Loading

martin-frbg commented Jan 30, 2021

weslleyspereira commented Feb 3, 2021 • edited Loading

thijssteel commented Feb 3, 2021

langou commented Feb 9, 2021

codecov bot commented Jan 26, 2021 •

edited

Loading

pablosanjose commented Jan 27, 2021 •

edited

Loading

pablosanjose commented Jan 30, 2021 •

edited

Loading

weslleyspereira commented Feb 3, 2021 •

edited

Loading