
Add LU solver documentation #890

Draft · wants to merge 11 commits into base: main
Conversation

@mgovers (Member) commented Feb 7, 2025

  • Add LU solver documentation
    • Rationale
    • Implementation

@mgovers mgovers added the documentation Improvements or additions to documentation label Feb 7, 2025
@mgovers mgovers changed the base branch from main to feature/grid-graphs-with-tikz February 7, 2025 10:45
Signed-off-by: Martijn Govers <Martijn.Govers@Alliander.com>
@mgovers mgovers force-pushed the feature/lu-solver-documentation branch from 78c6fe8 to 794e15e Compare February 7, 2025 10:48
@mgovers mgovers changed the base branch from feature/grid-graphs-with-tikz to main February 7, 2025 10:48
@mgovers mgovers self-assigned this Feb 10, 2025
Signed-off-by: Martijn Govers <Martijn.Govers@Alliander.com>
@mgovers (Member Author) commented Feb 13, 2025

@TonyXiang8787 this is ready for a first review round. I will spend some time tomorrow going through all the documentation from start to finish myself to check whether the structure still makes sense, but some feedback at this stage would be nice as well.

docs/advanced_documentation/algorithms/lu-solver.md (outdated)
Comment on lines 488 to 489
Because the backward error is calculated on the $x$ and $r$ from the previous iteration, the
iterative refinement loop will always be executed twice.
@mgovers (Member Author) commented Feb 14, 2025:

this is a potential follow-up improvement to the LU solver implementation

Member commented:

I do not understand the logic here. What do you mean by "twice"? The first time is just the solution of the original equation $Ax=b$, right? Because you initialize $r=b$, you do only one actual iterative refinement.

@mgovers (Member Author) commented Feb 14, 2025:

We always do one actual iterative refinement, but in the limit where $\epsilon_{\mathrm{backward\,error}} \to \infty$, iterative refinement is never needed. If the threshold were set to infinity, the loop should break after the first pass, but in the current implementation it always does another solve-and-refine step.

Member commented:

I still do not understand your point here. We currently do the following initialization:

  1. berr <- inf
  2. r = b

The loop is (ignoring the max iteration limit):

while (berr > epsilon_converge)

  1. solve A*dx = r
  2. update berr, and update x
  3. update r

So:

  1. In the first iteration, we are just solving Ax=b, which is the original equation. The berr will be very big. This should not be counted as iterative refinement.
  2. In the second iteration, we do one iterative refinement and get the new berr.
  3. If the new berr is now within the threshold, the loop terminates.

We always do ONE iterative refinement. But the doc says "the iterative refinement loop will always be executed twice".
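To make the pass counting concrete, here is a minimal scalar sketch of the loop structure under discussion. All names and the scalar setting are illustrative only, not the actual power-grid-model implementation:

```python
import math

def solve_with_refinement(a, b, eps, max_iter=10):
    """Solve the scalar equation a*x = b with iterative refinement.

    Returns (x, passes): the first pass is the plain solve of a*x = b
    (since r is initialized to b); every later pass is one refinement.
    """
    x = 0.0
    r = b            # initial residual is b itself
    berr = math.inf  # backward error starts at infinity
    passes = 0
    while berr > eps and passes < max_iter:
        dx = r / a   # solve a*dx = r; on the first pass this solves a*x = b
        x += dx
        # componentwise backward error estimate
        berr = abs(b - a * x) / (abs(a) * abs(x) + abs(b))
        r = b - a * x
        passes += 1
    return x, passes

x, passes = solve_with_refinement(2.0, 4.0, eps=1e-8)
# in exact arithmetic the plain solve already converges: x == 2.0, passes == 1
```

Because `berr` is only updated inside the loop body, the convergence check always lags one solve behind, which is the off-by-one being debated above.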

Comment on lines 257 to 259
Instead of calculating $\underline{m}_p^{-1}$ to obtain $\underline{m}_p^{-1} x_p$, the power grid
model first solves the matrix equation $l_p y_p = p_p a_p$, followed by solving $u_p z_p = y_p$. The
end-result is then $\underline{m}_p^{-1} x_p = q_p z_p$.
@mgovers (Member Author) commented Feb 14, 2025:

This is another potential follow-up improvement to the LU solver implementation:

Because matrix inversion is equivalent to solving this set of equations for a full matrix once, it may actually be faster to pre-calculate $\underline{m}^{-1}$ and then left-multiply each block in the above by it iff the row contains more than one element (which is pretty much always the case); the performance difference grows as the number of off-diagonal elements increases.

In addition, it may make the code simpler. (TODO: check whether that solution would be numerically stable.)

Member commented:

This was the very first prototype of the LU solver. It quickly proved to be extremely numerically unstable. By calculating the inverse explicitly, you already lose half of the precision. Applying that inverse to other off-diagonal blocks makes the result much less accurate and leads to divergence of the outer loop (NRPF).

Have a look at the original PR, especially this commit

08ee7fb

Member commented:

You can now explicitly state in the docs that applying the inverse of the diagonal blocks results in numerical instability.

Member commented:

Another thing I have tried is to use Eigen's built-in triangular solver instead of writing it manually. However, the benchmark was not convincing: for some reason the Eigen triangular solver is much slower than the manually written code. Understanding why could be a future investigation.

2cfe09b

@mgovers (Member Author) commented:

> This was the very first prototype of the LU solver. It quickly proved to be extremely numerically unstable. By calculating the inverse explicitly, you already lose half of the precision. Applying that inverse to other off-diagonal blocks makes the result much less accurate and leads to divergence of the outer loop (NRPF).
>
> Have a look at the original PR, especially this commit: 08ee7fb
>
> You can now explicitly state in the docs that applying the inverse of the diagonal blocks results in numerical instability.

Right. Thanks, that's useful input! Done.

> Another thing I have tried is to use Eigen's built-in triangular solver instead of writing it manually. However, the benchmark was not convincing: for some reason the Eigen triangular solver is much slower than the manually written code. Understanding why could be a future investigation.
>
> 2cfe09b

Do you want me to document that we did a quick investigation and ran into a dead end? I don't think it's worth investigating at this time.

Member commented:

We don't need to mention anything in the docs; this is a minor implementation choice.

We can benchmark later why Eigen is slower.

mgovers and others added 3 commits February 14, 2025 08:32
Signed-off-by: Martijn Govers <Martijn.Govers@Alliander.com>
#### Dense LU-factorization process

The Gaussian elimination process itself is as usual. Let
$M_p\equiv\begin{bmatrix}m_p & \vec{r}_p^T \\ \vec{q}_p & \hat{M}_p\end{bmatrix}$, where $p$ is
Member commented:

This is, for me, an unusual way of describing the problem. Usually we talk about 9 blocks in the process:

[
[a11, a12, a13],
[a21, a22, a23],
[a31, a32, a33],
]

where

a11: block containing the finished L/U combination for the pivots that have already been handled
a12: already handled U part directly above the current pivot
a13: already handled U part above the remaining pivots still to be handled
a21: already handled L part directly to the left of the current pivot
a22: current pivot
a23: current U part
a31: already handled L part to the left of the remaining pivots still to be handled
a32: current L part
a33: rest of the matrix, which will be adjusted by the current elimination step

@mgovers (Member Author) commented:

What do you mean, I mentioned 4 blocks? I described the process for an arbitrary block size. Sym/asym is one thing, but the specific solvers can also add more dimensions to the block size. E.g., for NRPF, we treat the P and Q components separately, so it's not 3x3 but 9x9, right?

Since the process is iterative, you can step in at any pivot element $p$ and treat the rest as described until the next iteration $p$ needs to start.

Member commented:

I mean, we should not omit the part which has already been factorized. We have to show the entire matrix in place, including the upper and left parts of the matrix that already hold the factorized L/U values.

@TonyXiang8787 (Member) commented Feb 14, 2025:

When you are at pivot $k$, we should also show the rows and columns $0 \to k-1$.


The Gaussian elimination process itself is as usual. Completely analogously to and following the
same conventions as [before](#dense-lu-factorization-process), let
$\underline{M}_p\equiv\begin{bmatrix}\underline{m}_p & \vec{\underline{r}}_p^T \\ \vec{\underline{q}}_p & \hat{\underline{M}}_p\end{bmatrix}$
Member commented:

  1. Why do you use the underscore? In EE, the underscore specifically denotes complex numbers.
  2. Same comment as above: using 4 blocks instead of 9 blocks is confusing for readers. A reader might think this is only applicable to the first pivot.

@mgovers (Member Author) commented:

I think I now understand what you mean. I'm describing the process at some point in the iteration, during the treatment of pivot element $p$.

I can indeed extend it, but the $p-1$ rows and columns do not contribute to this part of the process. What do you recommend?

@TonyXiang8787 (Member) commented Feb 14, 2025:

The rows and columns $0 \to p-1$ do contribute to the process: if you permute rows/columns to get the maximum pivot for element $p$, the previous rows and columns have to be permuted as well.

Also, for the reader, including the previous rows/columns shows that the factorization is in the middle of the process instead of just at the beginning.
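A minimal sketch (illustrative only, not the project's code) of why the previous rows are involved: with partial pivoting, swapping rows to bring the largest candidate into the pivot position also swaps the already-computed L entries in columns 0..k-1 of those rows:

```python
# In-place dense LU with partial (row) pivoting.
def lu_partial_pivot(a):
    n = len(a)
    perm = list(range(n))
    for k in range(n):
        # choose the row with the largest pivot candidate in column k
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if p != k:
            # swapping full rows also moves the already factorized
            # L entries in columns 0..k-1 along with them
            a[k], a[p] = a[p], a[k]
            perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= a[i][k] * a[k][j]
    return perm

m = [[0.0, 2.0], [4.0, 1.0]]
perm = lu_partial_pivot(m)
# the zero pivot forces a row swap: perm == [1, 0]
```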

Comment on lines 236 to 241
Here, $\overrightarrow{\underline{m}_p^{-1}\underline{q}_p}$ is symbolic notation for the
block-vector of solutions of the equation $\underline{m}_p x_{k;p} = \underline{q}_{k;p}$, where
$k = 0..(p-1)$. Similarly, $\widehat{\underline{m}_p^{-1}\underline{q}_p \underline{r}_p^T}$ is
symbolic notation for the block-matrix of solutions of the equation
$\underline{m}_p^{-1} x_{k,l;p} = \underline{q}_{k,p} \underline{r}_{p,l}^T$, where
$k,l = 0..(p-1)$. That is:
Member commented:

This is not correct, and it was actually one of the attempts during prototyping. It did not work and produced the same numerical instability as calculating the explicit inverse.

In the real implementation, we do not fully solve for $x_{k;p}$; we only apply the two separate triangular solves to the current L and U parts.

We apply a lower-triangular solve to the U part to the right of the current pivot:

```cpp
// for block matrix
// calculate U blocks to the right of the pivot, in-place
// L_pivot * U_pivot,k = P_pivot * A_pivot,k    (k > pivot)
if constexpr (is_block) {
    for (Idx u_idx = pivot_idx + 1; u_idx < row_indptr[pivot_row_col + 1]; ++u_idx) {
        Tensor& u = lu_matrix[u_idx];
        // permutation
        u = (block_perm.p * u.matrix()).array();
        // forward substitution, per row in u
        for (Idx block_row = 0; block_row < block_size; ++block_row) {
            for (Idx block_col = 0; block_col < block_row; ++block_col) {
                // forward subtract
                u.row(block_row) -= pivot(block_row, block_col) * u.row(block_col);
            }
        }
    }
}
```

We then apply an upper-triangular solve to the L part below the current pivot:

```cpp
if constexpr (is_block) {
    // for block matrix
    // calculate L blocks below the pivot, in-place
    // L_k,pivot * U_pivot = A_k,pivot * Q_pivot    (k > pivot)
    Tensor& l = lu_matrix[l_idx];
    // permutation
    l = (l.matrix() * block_perm.q).array();
    // forward substitution, per column in l
    //     l0 = [l00, l10]^T
    //     l1 = [l01, l11]^T
    //     l  = [l0, l1]
    //     a  = [a0, a1]
    //     u  = [[u00, u01],
    //           [0  , u11]]
    //     l * u = a
    //     l0 * u00 = a0
    //     l0 * u01 + l1 * u11 = a1
    for (Idx block_col = 0; block_col < block_size; ++block_col) {
        for (Idx block_row = 0; block_row < block_col; ++block_row) {
            l.col(block_col) -= pivot(block_row, block_col) * l.col(block_row);
        }
        // divide by the diagonal
        l.col(block_col) = l.col(block_col) / pivot(block_col, block_col);
    }
}
// (scalar case omitted)
```
This happens throughout the LU factorization part.

In the solve part, we do the remaining lower/upper triangular solves on the actual $b$ and $y$.
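As a sketch of those remaining triangular solves (illustrative Python, not the actual C++ code), forward substitution handles the L part and backward substitution the U part:

```python
def forward_substitute(l, b):
    """Solve L*y = b for unit lower-triangular L."""
    n = len(b)
    y = list(b)
    for i in range(n):
        for j in range(i):
            y[i] -= l[i][j] * y[j]
    return y

def backward_substitute(u, y):
    """Solve U*x = y for upper-triangular U."""
    n = len(y)
    x = list(y)
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= u[i][j] * x[j]
        x[i] /= u[i][i]
    return x

# L*U corresponds to A = [[4, 3], [6, 3]]; b = A @ [1, 1] = [7, 9]
L = [[1.0, 0.0], [1.5, 1.0]]
U = [[4.0, 3.0], [0.0, -1.5]]
y = forward_substitute(L, [7.0, 9.0])  # y == [7.0, -1.5]
x = backward_substitute(U, y)          # x == [1.0, 1.0]
```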

Signed-off-by: Martijn Govers <Martijn.Govers@Alliander.com>
@mgovers mgovers force-pushed the feature/lu-solver-documentation branch from 53051da to f2551a8 Compare February 14, 2025 10:03
Labels: documentation (Improvements or additions to documentation)
2 participants