Eigensolver support in Ginkgo #135
Here is my rough idea for now, given the information I have. Since an eigensolver produces a (partial) eigendecomposition, we could treat it the same way as we plan to treat other decompositions. Here is a copy of the discussion from #133, where I explain it in a bit more detail and raise some questions that should be answered:
Thanks for taking the time to write all of this; it is a useful resource, and I will need more time to process it properly. I wanted to mention that this book may be of interest: https://www-users.cs.umn.edu/~saad/eig_book_2ndEd.pdf — some of the methods you described can be found there (notably in chapter 4).
Btw, everyone should feel free to edit the description of the issue however they see fit (think of it as a wiki page). The idea is to provide a general introduction to the topic and point out the design issues, so people can follow the discussion in the comments. The comments can be used to propose and discuss possible designs.
For the different algorithms, I think ARPACK is a pretty good source. The original library is in FORTRAN, but they have a beta version, ARPACK++, in templated C++.
Do they have support for different architectures, e.g. GPUs? EDIT: Did they forget to update the update date? 😕
Hm, they don't seem to have a GPU version. I think that is the actual last update, but I am not sure; the user guide says 2001. :|
For FEAST and contour-integral-based eigensolvers, this paper shows the relationship among several contour-integral-based eigensolvers. When using a contour-integral-based eigensolver, we need to estimate the number of eigenpairs.
For the application part, sparse eigensolvers can be used in organic material simulations: Numerical aspect of large-scale electronic state calculation for flexible device material
I added the needed subroutines from FEAST or block SS-RR here.
A sparse matrix axpy is maybe not needed, because the operation is only required when solving the linear system. Defining a LinOp which represents zB - A is also a workable route.
I would treat the eigensolver like the linear system solvers.
Hmm... it's not yet clear to me. If we want to express an eigensolver as a linear operator x = A(b), would we need the eigenvalues (b) before computing the eigenvectors?
Oh, I forgot that the linear solver takes b as the input.
How about creating an Eigensolver class with a stopping criterion?
I did not quite follow the whole Issue (and probably have the least knowledge about Eigensolvers of all of us), but from what I understand, this Issue was created to discuss the interface.
This is at least how I understand his comment, and to me, it also looks like a nice and intuitive interface.
Thanks @thoasm.
Do you only want to generate the Eigenvectors, or also the Eigenvalues? If you are fine with returning the result, I think that would work. Just so we are on the same page:
That actually is a valid point. Someone may use an Eigensolver as a preconditioner, and the code will compile without problems. We may think of having a different class, maybe... @pratikvn @tcojean @thoasm @gflegar, can you all give your opinions?
I definitely agree that it should not be a LinOp. For that, we probably need a Diagonal matrix format to store the Eigenvalues efficiently, but I think that is fairly straightforward (it might need some time to implement, however, since there are a lot of conversions to do). However, I am against forcing the Eigensolver to be a LinOp while it cannot act as one; that can lead to confusion for the user and result in very annoying errors, for example if somebody wants to use it as a preconditioner.
I agree with @thoasm's last comment. Making an eigensolver a LinOp does not seem right. As I suggested previously, it may be a good idea to represent an eigensolver as a decomposition.
In conclusion, if there are no real use cases for having eigendecompositions as a LinOp, I would not force them to be one.
@nghiakvnvsd We are still discussing the eigensolver structure/interface.
Usage:
We are still working on it, so this interface might not be the final version.
Introduction
This is an issue to discuss how eigensolvers could be added to Ginkgo.
In general, eigensolvers compute an eigendecomposition AQ = QΛ (or equivalently A = QΛQ⁻¹) of a matrix A, where Q is a square matrix whose columns are the eigenvectors, and Λ is a diagonal matrix containing the eigenvalues of the matrix.
The basics
Here is a reminder of some basic linear algebra concerning the eigendecomposition, which might have to be considered in the discussion (please add anything that is missing and should be considered):
Numerical methods
The first step for deciding on the interface is to see what the methods actually do; what inputs they need, what outputs they provide, and how they compute it.
If anyone knows a good online reference of various methods, please link it here, I don't know of one in English. I'll try to quickly go over the basic methods (using my very limited understanding of eigenvalue algorithms) to give a starting point.
While writing this section I found that there is Templates for the Solution of Algebraic Eigenvalue Problems. I'm not sure how good it is; probably the same style as the templates for linear systems. It's in HTML format, so not very pleasant to read, and I did not find a (non-shady) PDF available.
Rayleigh quotient
The first observation that can be made is that it's all about the eigenvectors.
If we know that q is an eigenvector of A, even if the corresponding eigenvalue λ is not known, it can easily be computed from the equation Aq = λq. Multiplying both sides by q* and dividing the result by q*q gives us the formula λ = (q*Aq) / (q*q) (which is just an SpMV + 2 dot products). Note that the quantity R(A, q) = (q*Aq) / (q*q) is meaningful even if q is not an eigenvector, and is called the Rayleigh quotient.
In the context of eigendecomposition algorithms, it is used to approximate the eigenvalue from an approximation of the eigenvector.
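As a quick illustration (a NumPy sketch for exposition only, not Ginkgo code — the function name `rayleigh_quotient` is made up here), the formula really is just one matrix-vector product plus two dot products:

```python
# Sketch: computing the Rayleigh quotient R(A, q) = (q* A q) / (q* q).
import numpy as np

def rayleigh_quotient(A, q):
    """Approximate the eigenvalue associated with an (approximate) eigenvector q."""
    Aq = A @ q                              # one matrix-vector product (SpMV in the sparse case)
    return np.vdot(q, Aq) / np.vdot(q, q)   # two dot products

A = np.array([[2.0, 0.0], [0.0, 5.0]])
q = np.array([0.0, 1.0])                    # exact eigenvector for eigenvalue 5
print(rayleigh_quotient(A, q))              # → 5.0
```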
Power iteration
The power iteration is probably the simplest of all, and can only compute the eigenvector associated with the eigenvalue of largest magnitude.
The method is basically a fixed-point iteration with f(q) = Aq / || A q ||:
Very intuitive proof of convergence: the linear operator A = QΛQ⁻¹ acts on any vector as a "scaling" along the directions q1, ..., qn by λ1, ..., λn, respectively. Assuming λ1 is the unique(!) largest eigenvalue by magnitude, the scaling of the vector in that direction will be larger than in any other direction, so the contribution of the component in the direction q1 will increase relative to the other directions. After enough iterations, all other components will die off, and we are left with q1. Notice how this does not work if there are multiple eigenvalues of the largest magnitude, or if the initial guess has no component in the direction q1. The convergence rate is determined by the ratio between the largest and the second-largest eigenvalue.
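The iteration can be sketched in a few lines (NumPy for illustration, not Ginkgo API; the residual-based stopping test used here is one possible choice, not a fixed design decision):

```python
# Sketch: power iteration with a Rayleigh-quotient-based stopping criterion
# || Aq - R(A,q) q || < tol * |R(A,q)|  (an assumed criterion, see the design questions).
import numpy as np

def power_iteration(A, q0, tol=1e-10, max_iter=1000):
    q = q0 / np.linalg.norm(q0)
    lam = 0.0
    for _ in range(max_iter):
        Aq = A @ q
        lam = np.vdot(q, Aq)                    # Rayleigh quotient (q has unit norm)
        if np.linalg.norm(Aq - lam * q) < tol * abs(lam):
            break                               # q is (numerically) an eigenvector
        q = Aq / np.linalg.norm(Aq)             # fixed-point step f(q) = Aq / ||Aq||
    return lam, q

A = np.array([[4.0, 1.0], [1.0, 3.0]])
lam, q = power_iteration(A, np.array([1.0, 0.0]))
# lam ≈ (7 + sqrt(5)) / 2 ≈ 4.618, the largest eigenvalue of A
```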
Design questions
Q1: What can has_converged be, and how do we pass it? One possibility is || Aq - R(A,q) q || < tol * |R(A, q)| for some tolerance level tol. Is this very distinct from the stopping criteria we have for solvers? Are there cases where the eigensolver should use the stopping criterion of a solver (and is there a meaningful definition of the rhs b), or, vice versa, a linear system solver which should stop when a good enough approximation of an eigenvector (of something) is found?
Q2: This method only computes one eigenvector, not the full decomposition. How do we represent this?
(Shifted) inverse iteration
If we apply the power method to A⁻¹, we get the eigenvector (of A⁻¹) associated with the largest eigenvalue 1/λn of A⁻¹. Directly from the definition it follows that this same eigenvector is the eigenvector of A associated with its smallest eigenvalue λn. Of course, A⁻¹ is never formed explicitly; its application is realized as the solution of a linear system.
That covers the largest and smallest eigenvalues. The next step is to look at the operator A - zI (where z is a scalar). Its smallest eigenvalue in magnitude is the smallest among the numbers λi - z (i.e. the λi closest to z), and the associated eigenvector is once again also the eigenvector of A associated with λi. Thus, applying the power method to (A - zI)⁻¹ (solving a shifted system at each step) results in an eigenvector associated with the unique(!) eigenvalue closest to z. This method is known as inverse iteration. There is also a related method called Rayleigh quotient iteration, which replaces z with the Rayleigh quotient of the current approximation at each step.
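A minimal sketch of the idea (NumPy for illustration, not Ginkgo API; the dense solve stands in for whatever sparse solver would apply (A - zI)⁻¹, and the stopping test in terms of the original A is an assumption):

```python
# Sketch: shifted inverse iteration. Each step applies (A - zI)^{-1} by solving
# a linear system; the iterate converges to the eigenvector whose eigenvalue is
# closest to the shift z.
import numpy as np

def inverse_iteration(A, z, q0, tol=1e-10, max_iter=200):
    n = A.shape[0]
    q = q0 / np.linalg.norm(q0)
    shifted = A - z * np.eye(n)              # in practice: a sparse solver for A - zI
    lam = 0.0
    for _ in range(max_iter):
        w = np.linalg.solve(shifted, q)      # apply (A - zI)^{-1}
        q = w / np.linalg.norm(w)
        lam = np.vdot(q, A @ q)              # Rayleigh quotient w.r.t. the original A
        if np.linalg.norm(A @ q - lam * q) < tol * abs(lam):
            break
    return lam, q

A = np.diag([1.0, 4.0, 10.0])
lam, q = inverse_iteration(A, z=3.5, q0=np.ones(3))
# lam ≈ 4.0, the eigenvalue of A closest to the shift 3.5
```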
There is also an example implementation of this method in Ginkgo, under examples/inverse_iteration.
Design questions
Q3: The stopping criterion for inverse iteration can be expressed in terms of the original problem, as in "how far is my approximation from an eigenvector of A", or in terms of the problem we are actually solving, as in "how far is my approximation from an eigenvector of (A - zI)⁻¹". Do both of these criteria make sense? Is there even a difference between the two, or do they always give the same answer?
Q4: Would we even need a separate implementation of the inverse iteration method, or could we just express it as the power method applied to the operator (A - zI)⁻¹? We already have the possibility to form this operator in Ginkgo: the shift can be expressed using gko::Combined, and the inverse with any solver.
Q5: Depending on the other answers, do we need the possibility to have a method that operates on one operator, but uses a stopping criterion associated with another operator? If we implement inverse iteration in terms of the power method, the operator used for computations is (A - zI)⁻¹, and the operator used for the stopping criterion might be A.
Subspace (or orthogonal) iteration
This is another improvement of the power method, which can compute the eigenvectors associated with the p largest eigenvalues of A. The idea is to iterate a p-dimensional subspace instead of just a single vector, which will then converge towards an invariant subspace of A. The basis has to be re-orthogonalized at each step, as otherwise all the vectors would converge towards the dominant eigenvector.
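The block version of the iteration can be sketched as follows (NumPy for illustration, not Ginkgo API; using a QR factorization for the re-orthogonalization and a Rayleigh-Ritz step to extract the eigenvalues, both assumed choices):

```python
# Sketch: subspace (orthogonal) iteration for the p largest-magnitude eigenpairs.
# The block of vectors is re-orthogonalized with a QR factorization each step,
# otherwise every column would drift toward the dominant eigenvector.
import numpy as np

def subspace_iteration(A, p, tol=1e-10, max_iter=500, seed=0):
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((A.shape[0], p)))
    H = Q.conj().T @ A @ Q
    for _ in range(max_iter):
        Q, _ = np.linalg.qr(A @ Q)          # iterate the subspace, re-orthogonalize
        H = Q.conj().T @ A @ Q              # Rayleigh-Ritz: small p x p problem
        if np.linalg.norm(A @ Q - Q @ H) < tol:
            break                           # Q spans an (approximate) invariant subspace
    return np.sort(np.linalg.eigvals(H).real)[::-1], Q

A = np.diag([10.0, 6.0, 1.0, 0.5])
vals, Q = subspace_iteration(A, p=2)
# vals ≈ [10.0, 6.0], the two largest eigenvalues
```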
Further refinement of the method in the dense case for p = n results in the QR algorithm and the implicit QR algorithm. I don't know if something similar can be done in the sparse case.
What do more advanced methods (especially for sparse matrices) look like? It would be great if someone could give general description of them, so we can discuss possible designs.
FEAST
This is a method I'm looking into right now, but not yet in a lot of detail. What I can contribute here is that it needs to solve shifted systems in each iteration, for different shifts. It also needs numerical integration and a dense eigensolver, as it reduces a large sparse problem to a small dense problem.
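To make the three ingredients (shifted solves, numerical integration, a small dense eigenproblem) concrete, here is a very rough toy sketch of the idea behind FEAST (NumPy only — this is not the FEAST library's API nor Ginkgo code; the circular contour, trapezoidal quadrature, and Rayleigh-Ritz reduction are all assumed choices, and the dense solves stand in for a real sparse solver):

```python
# Sketch: approximate the spectral projector onto the eigenvalues inside a
# circular contour by quadrature of (zI - A)^{-1}, apply it to a random block,
# then solve a small dense (Rayleigh-Ritz) eigenproblem.
import numpy as np

def feast_sketch(A, center, radius, m0, n_quad=16, seed=0):
    n = A.shape[0]
    Y = np.random.default_rng(seed).standard_normal((n, m0))
    Q = np.zeros((n, m0))
    # Trapezoidal rule on the circle z(t) = center + radius * e^{it}:
    # (1/2*pi*i) \oint (zI - A)^{-1} dz  ≈  (radius/N) * sum_j e^{i t_j} (z_j I - A)^{-1}
    for t in 2 * np.pi * (np.arange(n_quad) + 0.5) / n_quad:
        z = center + radius * np.exp(1j * t)
        W = np.linalg.solve(z * np.eye(n) - A, Y)   # shifted solve per quadrature node
        Q += (radius / n_quad) * (np.exp(1j * t) * W).real
    # Rayleigh-Ritz: reduce the filtered subspace to a small dense eigenproblem
    Q, _ = np.linalg.qr(Q)
    theta = np.linalg.eigvalsh(Q.T @ A @ Q)
    return theta[abs(theta - center) < radius]      # keep Ritz values inside the contour

A = np.diag([1.0, 2.0, 8.0, 9.0])
vals = feast_sketch(A, center=1.5, radius=1.0, m0=2)
# vals ≈ [1.0, 2.0], the eigenvalues inside the contour
```

Note how the estimated number of eigenpairs inside the contour (mentioned in the comments above) enters as the block size m0, which must be at least as large as the true count.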
Design questions
Q6: What do we do with all the extra functionalities that are needed for some methods? Do we implement everything ourselves? This would basically mean that at some point we would need our own implementation of the whole of BLAS and LAPACK, with optimized kernels for everything. We would also need to create a sub-package for numerical integration, etc. If we use third-party software, what do we use? What do we use for different executors? Note that this is not only tied to eigensolvers, but also to linear solvers. For example, GMRES already needs some orthogonal transformations, which could be provided by another library to remove the burden from us. I assume we'll have more advanced methods where this will become a necessity.
Q7: If we decide we do need other libraries, what do we do with support for user-defined value types? Are there libraries that have this possibility we can use? If not, what should we do?
Applications
I would really like to have something in this section, but my knowledge here is even more limited than in the previous section. How are eigensolvers used in practice? This should affect the design, as we want to provide an interface that delivers the results in a format that feels natural for applications.
If someone could contribute this section, or knows someone who can contribute to this section, that would be great...