feat: Add Bunch–Kaufman LDL Decomposition #1591
awinick wants to merge 12 commits into dimforge:main
Conversation
This work was inspired in part by the implementation efforts in #1515 by @ajefweiss. For the problem class I was working on, I needed clear and reliable handling of indefinite Hermitian matrices, which motivated implementing the classical Bunch–Kaufman LDL factorization. While there is overlap in goals, this PR focuses specifically on the symmetric-indefinite Hermitian case with explicit pivot handling and a block-diagonal D factor.
I'll try to take a look soon-ish. I've kicked off the checks and they show a few problems in the derive macros. Could you fix them if you have time?
geo-ant
left a comment
Hey, I'm sorry that I didn't have time to take a super in-depth look yet. I need to brush up on the Bunch-Kaufman factorization.
Did you use LAPACK's zhetrf function as your reference? If not, what other references should I use to review the implementation? I'm aware of:
- Bunch & Kaufman 77, which present a couple of algorithms. Which one should I be looking at?
- Jones & Patrick 89, which present an algorithm in section 4 of their paper.
Let me know what would be a good reference for me to check your implementation. I see you've added tests which is already fantastic, I'd just like to be able to review your implementation as well.
I've made some comments in the code for some API related things. If you want to take care of them, could you also run cargo fmt on your code so it'll pass the lint?
The implementation is based on the diagonal pivoting method with partial pivoting from Bunch & Kaufman (1977), specifically the algorithm described in Section 2 (referred to as "Algorithm A"). This is also the basis for LAPACK's ?sytrf/?hetrf routines, so the overall structure and pivoting logic are very similar to those. Happy to clarify any specific part of the algorithm if helpful.
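For reviewers less familiar with Algorithm A, here is a sketch of the pivot decision made at each elimination step. This is illustrative only (not the PR's code): it assumes a real symmetric matrix stored as a full `Vec<Vec<f64>>`, and the function names are made up for the example.

```rust
// Sketch of the Bunch-Kaufman "Algorithm A" pivot choice at step k.
// alpha = (1 + sqrt(17)) / 8 minimizes element growth over both pivot types.
const ALPHA: f64 = 0.6403882032022076;

/// Decide whether step k uses a 1x1 or 2x2 pivot (partial pivoting).
/// Returns (block_size, pivot_row). Assumes full symmetric storage.
fn choose_pivot(a: &[Vec<f64>], k: usize) -> (usize, usize) {
    let n = a.len();
    // lambda = largest off-diagonal magnitude in column k, attained at row r.
    let (mut lambda, mut r) = (0.0_f64, k);
    for i in (k + 1)..n {
        if a[i][k].abs() > lambda {
            lambda = a[i][k].abs();
            r = i;
        }
    }
    if lambda == 0.0 || a[k][k].abs() >= ALPHA * lambda {
        return (1, k); // 1x1 pivot on a[k][k]
    }
    // sigma = largest off-diagonal magnitude in column r of the active block.
    let mut sigma = 0.0_f64;
    for i in k..n {
        if i != r {
            sigma = sigma.max(a[i][r].abs());
        }
    }
    if a[k][k].abs() * sigma >= ALPHA * lambda * lambda {
        (1, k) // 1x1 pivot on a[k][k] after all
    } else if a[r][r].abs() >= ALPHA * sigma {
        (1, r) // 1x1 pivot after interchanging rows/columns k and r
    } else {
        (2, r) // 2x2 pivot on rows/columns k and r
    }
}

fn main() {
    // Diagonally dominant leading entry -> 1x1 pivot at k.
    let a = vec![vec![4.0, 1.0], vec![1.0, 3.0]];
    assert_eq!(choose_pivot(&a, 0), (1, 0));
    // Zero diagonal with off-diagonal coupling -> 2x2 pivot.
    let b = vec![vec![0.0, 1.0], vec![1.0, 0.0]];
    assert_eq!(choose_pivot(&b, 0), (2, 1));
}
```

The second case in `main` is the classic example of why 2×2 pivots are needed: `[[0, 1], [1, 0]]` admits no stable 1×1 pivot at all.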
geo-ant
left a comment
Hey, I haven't yet reviewed the implementation itself. But that's the last thing I need to do. I hope to do that soon (for some definition of soon). In the meantime I've taken a look at the other changes. This is shaping up really nicely, but I did spot some things in the tests that I believe should be addressed. Feel free to address the changes now (or whenever you have time) or to wait for the review of the implementation itself. And thanks for your continued work on this!!
```rust
for _ in 0..10 {
    let matrix: DMatrix<Complex<f64>> = random_hermitian_from_diag(&diag, &mut rng);
    let ldl = matrix.clone().ldl();
    assert!(relative_norm(&matrix, &reconstruct(&ldl)) < 1e-12);
}
```
Why can't you use assert_relative_eq! here? It's not that this looks super problematic, but I feel element-wise differences are probably safer, even if the epsilon needs to be a little more generous. Otherwise a noticeable difference in one element could be offset by a large enough matrix norm. Please correct me, I might absolutely be missing something here.
We could certainly use assert_relative_eq! for the alternating-spectrum case. I’d lean toward leaving the reconstruction check normwise in both tests to keep the interpretation uniform, but I could change that one if you prefer.
For the geometric-spectrum case, though, I think a normwise check is the more appropriate criterion. This test is deliberately badly scaled, so a fixed absolute entrywise comparison is not very informative. In practice, making assert_relative_eq reliable there would require loosening it to roughly epsilon = 1e-6, which feels too weak to say much about the quality of the factorization.
More importantly, for Bunch–Kaufman the natural lens is normwise backward stability rather than uniformly small entrywise errors. What I want to check in this test is essentially that the computed factors reconstruct a nearby matrix with small relative error, i.e. that ‖A − LDLᴴ‖ / ‖A‖ is small. That is the stability notion usually associated with this factorization, whereas a fixed absolute elementwise tolerance is much more sensitive to the scaling of individual entries.
```rust
let b = random_isometry(Dyn(n), Dyn(n), &mut rng);
assert!((&matrix * &ldl.solve(&b).unwrap() - &b).norm() / matrix.norm() < 1e-12);
```
You want to check (in spirit) if AX ≈ B, i.e. that the computed solution has a small residual.
If A is singular or extremely ill-conditioned, it might be better to check the normal-equations condition Aᴴ(AX − B) ≈ 0 instead, since that is what a least-squares solution satisfies.
I think this is a good point in the abstract, but for these tests I don’t think the normal-equations check is the right replacement.
In the alternating-spectrum case, the matrix is nonsingular by construction, so checking AX ≈ B directly is the more natural condition, and it is strictly stronger than checking Aᴴ(AX - B) ≈ 0.
For the geometric-spectrum case, I agree the matrix can be extremely ill-conditioned in floating point, but this implementation also does not intentionally compute a least-squares solution for singular systems. It only reports failure when a pivot is exactly zero. So I don’t think the right interpretation of a successful solve here is “the least-squares normal equations are satisfied”; rather, the intent is still to check that the returned solution gives a small residual.
So my preference would be to keep the solve check as a direct residual / AX ≈ B check, and reserve the normal-equations condition for a solver that is explicitly intended to return least-squares solutions in the singular case.
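To make the distinction concrete, here is a tiny hypothetical 2×2 illustration (not from the test suite): for an ill-conditioned A, the normal-equations residual Aᴴ(Ax − b) can be tiny even when the direct residual Ax − b is large, which is why the direct check is the stronger criterion.

```rust
/// Returns (direct residual norm, normal-equations residual norm) for a
/// deliberately wrong candidate solution of an ill-conditioned 2x2 system.
fn residual_norms() -> (f64, f64) {
    let a = [[1.0_f64, 0.0], [0.0, 1e-8]]; // nonsingular but ill-conditioned
    let b = [1.0_f64, 1e-3];
    let x = [1.0_f64, 0.0]; // candidate solution, wrong in the 2nd component

    // Direct residual r = A x - b.
    let r = [
        a[0][0] * x[0] + a[0][1] * x[1] - b[0],
        a[1][0] * x[0] + a[1][1] * x[1] - b[1],
    ];
    // Normal-equations residual A^T r (A is real symmetric here).
    let atr = [
        a[0][0] * r[0] + a[1][0] * r[1],
        a[0][1] * r[0] + a[1][1] * r[1],
    ];
    let norm = |v: &[f64; 2]| (v[0] * v[0] + v[1] * v[1]).sqrt();
    (norm(&r), norm(&atr))
}

fn main() {
    let (direct, normal_eqs) = residual_norms();
    // The direct residual is ~1e-3 (x clearly fails AX ≈ B), while the
    // normal-equations residual is ~1e-11 and would pass a loose check.
    println!("direct = {direct:e}, normal-equations = {normal_eqs:e}");
}
```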
Hey, I'm really sorry I'm being so annoying here and I'm also reasonably confident that everything you're saying is correct, but I want to make sure we're not making a mistake here. You're right that the paper introduces the stability analysis using normwise bounds.
In Table 2.4 they introduce the decomposition error and the solution error as ‖A − LDLᴴ‖ / ‖A‖ and ‖x − x̂‖ / ‖x̂‖ respectively, with the solution error bounded in terms of the condition number of A.
With the Matrix::norm function you are calculating the Frobenius norm of the respective matrix, which isn't the L2-norm (the spectral norm of a matrix), despite the nalgebra documentation claiming that it is.
I think your hard-coded bounds are fine, rather than trying to reproduce the exact bounds from the paper.
Also, your test for the accuracy of the solution doesn't quite reproduce the criterion in the paper. You're testing with Vector::norm, which is unambiguous since the F-norm and the 2-norm are the same for vectors. It'll still be nice to have a test in there for multiple right-hand sides, but that might be a simple manually verified unit test, just to make sure the functionality works.
Please let me know what you think.
Thanks again for the detailed feedback.
I’ve updated the tests to use the induced matrix 1-norm for both the reconstruction and solve checks, which aligns better with the analysis in the paper. While all matrix norms in finite dimensions are equivalent up to constant factors, for fixed tolerances that equivalence isn’t numerically neutral, so it makes sense to match the norm used in the bounds.
This did require adding a OneNorm implementation. We already had a local version of this internally for the matrix exponential code, so I moved it into the core norms API since it’s generally useful. I’m also planning to add exp_multiply and matrix logarithm functionality, which both rely on the 1-norm, so this felt like a natural addition rather than a one-off for these tests.
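For context, the induced 1-norm is the maximum absolute column sum. A minimal sketch of the computation (this is not the PR's OneNorm code, just the formula it implements):

```rust
/// Induced (operator) 1-norm: ||A||_1 = max_j sum_i |a_ij|,
/// i.e. the maximum absolute column sum.
fn one_norm(a: &[Vec<f64>]) -> f64 {
    let cols = a[0].len();
    (0..cols)
        .map(|j| a.iter().map(|row| row[j].abs()).sum::<f64>())
        .fold(0.0_f64, f64::max)
}

fn main() {
    let a = vec![vec![1.0, -2.0], vec![3.0, 4.0]];
    // Column abs sums are 4 and 6, so the induced 1-norm is 6.
    assert_eq!(one_norm(&a), 6.0);
    // Contrast: the elementwise L1 norm of the same matrix is
    // 1 + 2 + 3 + 4 = 10, and the Frobenius norm is sqrt(30) ≈ 5.48.
}
```

This is cheap to compute (one pass over the entries), which is why it is the norm of choice for scaling decisions in exp/log algorithms.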
For the solve accuracy, I’ve kept the residual-based check (‖AX − B‖ small). In these tests we don’t have access to the true solution x*, so measuring ‖x − x*‖ directly isn’t possible in general. The residual is the standard practical proxy here, and using multiple right-hand sides (sampling B as an n×n isometry) exercises the solve across a well-conditioned basis rather than a single vector.
More generally, there isn’t a single canonical way to assess solver accuracy in this setting. There are several equivalent formulations depending on what is being emphasized. In this case, I don’t see a strong reason to prefer one beyond aligning with the analysis and using a practical, well-established proxy for the solve.
I didn’t add a dedicated unit test for OneNorm. The implementation is quite direct, and it’s already exercised indirectly by the updated tests as well as existing matrix exponential tests that rely on the same computation.
That said, I’m happy to add a small explicit test if you think it would be useful.
Moved pivot assertions into unit test
geo-ant
left a comment
okay. I think we're in a good place. If you could respond to the other questions, this should be the last round.
```rust
        .fold(T::SimdRealField::zero(), T::SimdRealField::simd_max)
    }
}
```
I'm in favor of having this in here, but I'll page the owner of the library privately, since I don't feel comfortable deciding this alone. The problem is that the current LpNorm implementation looks deceptively similar and the struct is poorly documented. What the current LpNorm calculates is the elementwise Lp norm, and there should be a comment explaining this and giving a formula. Your documentation is much better, but still relies on everyone understanding what induced norm means. Would you be so kind as to add some docs to both?
I've updated the docs for both as requested, actually updated docs for all the norms with warnings when needed.
There's a meaningful inconsistency in how they all behave on matrices vs. vectors. In addition to what you said about LpNorm, UniformNorm has the same issue. EuclideanNorm on a matrix computes the Frobenius norm, not the spectral norm.
Making them align with user expectation would be a breaking change but would be cool to see at some point maybe haha.
geo-ant
left a comment
Thank you for your work. @awinick if you would update the comments on the norms I'd be very grateful. I'll also ping the owner of the library privately, because I'd like them to weigh in on the norms. Everything else looks great to me, thanks for your patience!
I just saw that some of your changes are incompatible with the MSRV; could you quickly address those?
This PR adds an LDL decomposition for Hermitian matrices using the Bunch–Kaufman symmetric pivoting algorithm.
The factorization computed is P A Pᴴ = L D Lᴴ,
where P is a permutation matrix, L is unit lower triangular, and D is Hermitian block diagonal with 1×1 and 2×2 blocks.
This corresponds to the class of algorithms implemented in LAPACK’s *HETRF family.
Implementation notes
The factorization is stored in-place in a single matrix following the standard LAPACK layout: the columns of L are stored below the diagonal, the blocks of D occupy the diagonal (and the first subdiagonal for 2×2 blocks), and the pivot sequence is recorded in a separate index vector.
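As a sketch of how that pivot vector is consumed, here is the LAPACK-style convention for the lower-triangular case (this is illustrative, not the PR's API): a positive entry marks a 1×1 pivot block, and a negative entry, repeated at positions k and k+1, marks a 2×2 block.

```rust
/// Walk a LAPACK-style ipiv array and return the sizes of the diagonal
/// blocks of D, in order. ipiv[k] > 0 marks a 1x1 block; ipiv[k] < 0
/// (stored at both k and k+1) marks a 2x2 block.
fn block_sizes(ipiv: &[i32]) -> Vec<usize> {
    let mut blocks = Vec::new();
    let mut k = 0;
    while k < ipiv.len() {
        if ipiv[k] > 0 {
            blocks.push(1);
            k += 1;
        } else {
            blocks.push(2);
            k += 2;
        }
    }
    blocks
}

fn main() {
    // A 1x1 block, then a 2x2 block, then another 1x1 block.
    assert_eq!(block_sizes(&[1, -3, -3, 4]), vec![1, 2, 1]);
}
```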
Tests
The tests are intentionally compact but exercise a wide variety of pivot behaviors.
During development I inspected pivot structures and found that matrices constructed as A = U D Uᴴ, where U is Haar-random unitary and D has alternating signs, consistently generate rich mixtures of 1×1 and 2×2 pivot blocks.
Two spectra are used.
Alternating ±1 spectrum
This produces a diverse set of pivot patterns across matrix sizes and verifies both the reconstruction and the solve accuracy across those patterns.
Alternating geometric spectrum
This stresses the algorithm under extreme scaling, which is particularly important for exercising the pivot selection.
Empirically this spectrum produces a wide variety of pivot configurations while remaining numerically well-behaved.
Determinant accuracy
During testing it was observed that computing determinants through this LDL factorization produces significantly more accurate real-valued results for Hermitian matrices than the current generic approach.
Because the decomposition explicitly respects Hermitian structure, the determinant is obtained as the product of real block determinants, which reduces complex roundoff artifacts.
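A sketch of that determinant computation (hypothetical names, not the PR's code): since det L = 1 and det P = ±1 cancels with its transpose, the determinant reduces to the product of block determinants of D. A 1×1 block contributes its real diagonal entry; a 2×2 block [[dₖ, ēₖ], [eₖ, dₖ₊₁]] contributes dₖ·dₖ₊₁ − |eₖ|², which is real for a Hermitian matrix.

```rust
/// Determinant from the block diagonal D of a Hermitian LDL^H
/// factorization. `diag` holds the real diagonal of D, `sub` the complex
/// subdiagonal as (re, im) pairs, and `blocks` the block sizes in order.
fn det_from_blocks(diag: &[f64], sub: &[(f64, f64)], blocks: &[usize]) -> f64 {
    let mut det = 1.0;
    let mut k = 0;
    for &b in blocks {
        if b == 1 {
            det *= diag[k]; // 1x1 block: real diagonal entry
            k += 1;
        } else {
            // 2x2 block: d_k * d_{k+1} - |e_k|^2, real by Hermitian symmetry.
            let (re, im) = sub[k];
            det *= diag[k] * diag[k + 1] - (re * re + im * im);
            k += 2;
        }
    }
    det
}

fn main() {
    // D = [2] followed by the 2x2 block [[1, 1-i], [1+i, 3]]:
    // det = 2 * (1*3 - |1+i|^2) = 2 * (3 - 2) = 2.
    let diag = [2.0, 1.0, 3.0];
    let sub = [(0.0, 0.0), (1.0, 1.0), (0.0, 0.0)];
    assert_eq!(det_from_blocks(&diag, &sub, &[1, 2]), 2.0);
}
```

Every partial product is real, so no complex roundoff leaks into the imaginary part, unlike a generic LU-based determinant on the complex matrix.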
Benchmarking
Benchmarks were performed comparing this implementation against LAPACK (zhetrf) and the existing QR decomposition in nalgebra. All results below correspond to 1000 factorizations.
The pure Rust implementation is typically within roughly a factor of two of hardware-accelerated LAPACK. Given that LAPACK implementations rely on highly tuned BLAS kernels and architecture-specific optimizations, this level of performance is reasonable for a portable pure-Rust implementation.
For comparison, QR factorization, currently the best method available in nalgebra for this problem, is substantially slower. QR does not exploit Hermitian structure and therefore performs significantly more work than an LDL factorization.