Add level-3 BLAS triangular Sylvester equation solver #651

angsch · 2022-02-20T13:29:42Z

Description

Currently the only way to solve the triangular Sylvester equation AX + XB = C is via dtrsyl, which solely relies in level-2 BLAS. This PR intends to contribute a level-3 BLAS version.

The triangular Sylvester equation has been recognized to be prone to overflow. For that purpose, dtrsyl utilizes a scaling factor to represent the solution as (scale^{-1} * X) and solve the scaled equation AX + XB = scale * C. Due to the scaling factor, there is some flexibility in the representation of the solution. The proposed level-3 BLAS version computes the scaling factors based on upper bounds of blocks to enable level-3 BLAS. The scaling is typically slightly more aggressive so that an alternatively scaled final solution is computed. This is no problem as long as the scaling factor does not get flushed to zero. Indeed, the assumption in Anderson's original paper on solving scaled systems is that scale is in (0,1] and not [0,1].

In experiments the flushing occasionally happens for both dtrsyl and the proposed solver dtrsyl3, but dtrsyl3 reaches the stage earlier. In view of the intended changes of the handling of NaN, inf etc. I would be grateful to get some feedback on how to address the problem.

Our paper proposes to use integer scaling factors of the form 2^{(intscale} and solve AX + XB = 2^{intscale} * C. The vastly increased representational range rules out any flushing during the computation. The problem with the scaling factors affects dlatrs and dtrevc, too. There may be more routines. Is there an interest in using integer scaling factors and changing the function signature/documentation?

Alternatively, I would revise the PR and add a workaround for the flushing problem present in the attached code.

Once the decision on the scalings is taken, I will fix the other data types.

Checklist

The documentation has been updated.
If the PR solves a specific issue, it is set to be closed on merge.

codecov · 2022-08-02T17:56:05Z

Codecov Report

Base: 0.00% // Head: 0.00% // No change to project coverage 👍

Coverage data is based on head (d83c3fd) compared to base (801ac2f).
Patch coverage: 0.00% of modified lines in pull request are covered.

❗ Current head d83c3fd differs from pull request most recent head 970a772. Consider uploading reports for the commit 970a772 to get more accurate results

Additional details and impacted files

@@            Coverage Diff            @@
##           master     #651     +/-   ##
=========================================
  Coverage    0.00%    0.00%             
=========================================
  Files        1894     1904     +10     
  Lines      184078   186552   +2474     
=========================================
- Misses     184078   186552   +2474

Impacted Files	Coverage Δ
SRC/clatrs3.f	`0.00% <0.00%> (ø)`
SRC/ctrsyl3.f	`0.00% <0.00%> (ø)`
SRC/dlarmm.f	`0.00% <0.00%> (ø)`
SRC/dlatrs3.f	`0.00% <0.00%> (ø)`
SRC/dtrsyl3.f	`0.00% <0.00%> (ø)`
SRC/ilaenv.f	`0.00% <0.00%> (ø)`
SRC/slarmm.f	`0.00% <0.00%> (ø)`
SRC/slatrs3.f	`0.00% <0.00%> (ø)`
SRC/strsyl3.f	`0.00% <0.00%> (ø)`
SRC/zlatrs3.f	`0.00% <0.00%> (ø)`
... and 1 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Force compatibility with [ds]trsyl. Use two floating-point scaling factors (rather than integer scaling factors). This does not eliminate the problem that scalings can be flushed, making any result useless. That problem could be eliminated by replacing the floating-point scale factor with an integer scale factor.

The tests of ztrsyl via zget35 are on tiny matrices that fall into the unblocked section of ztrsyl3. Add a new test file that checks ztrsyl(3) for larger matrices and their compatibility: Every problem that is solvable by ztrsyl must be solvable by ztrsyl3.

An extensions of LATRS solving T*x = s*x to many right-hand sides, allowing the usage of BLAS-3. The new algorithm tends to use less aggressive scaling, in particular for larger matrices.

angsch · 2022-09-25T15:26:33Z

DLATRS3
Performance results originally presented at the BLIS retreat, and augmented with results for reference BLAS

DTRSYL3

The two solvers are a drop-in replacement of the existing BLAS-2 routines and dedicate quite a bit of code to computing a similar result as the BLAS-2 counterpart.The question on whether or not the scale factor should be of the same data type as the other data structures remains. The advantage of moving to the integer scale factor alpha = 2**j is that flushing of the scale factor to zero can no longer happen as a result of limitations of the floating-point range.

I would keep the representation by the BLAS-2 and the BLAS-3 version compatible. Then the disadvantage is a change of the function signature of LATRS and TRSYL. One option is to revisit the code later when an upgrade to LAPACK 4.0 is due and then switch the representation of the solution from (1/scale) * x to (1/(2**scale)) * x,where scale is int, later.

angsch force-pushed the master branch from 0fb2eec to cd6742a Compare February 20, 2022 15:49

thijssteel mentioned this pull request Jun 22, 2022

Add numerical tests for trevc3 #682

Merged

angsch force-pushed the master branch 3 times, most recently from 2e1e51f to f08506b Compare August 2, 2022 17:48

angsch force-pushed the master branch from f08506b to f933da7 Compare August 21, 2022 14:22

angsch added 7 commits September 14, 2022 20:12

Test [sd]trsyl3

833cd58

LAPACKE interface of [sd]trsyl3

46275f0

Add ctrsyl3 and corresponding tests

05f9e54

LAPACKE interface of [cz]trsyl3

8f44109

Use unblocked code if insufficient workspace is provided

1c1263e

angsch force-pushed the master branch from d8a56ca to bbc1f0e Compare September 14, 2022 18:44

Add BLAS-3 robust triangular solver T*X = B*diag(s1, ..., sN)

970a772

An extensions of LATRS solving T*x = s*x to many right-hand sides, allowing the usage of BLAS-3. The new algorithm tends to use less aggressive scaling, in particular for larger matrices.

angsch force-pushed the master branch from ce6a127 to 970a772 Compare September 18, 2022 13:26

angsch marked this pull request as ready for review September 25, 2022 16:22

langou approved these changes Oct 6, 2022

View reviewed changes

weslleyspereira approved these changes Oct 6, 2022

View reviewed changes

langou merged commit 7012e36 into Reference-LAPACK:master Oct 6, 2022

julielangou added this to the LAPACK 3.11.0 milestone Nov 12, 2022

bartoldeman mentioned this pull request Nov 22, 2022

Fix SLATRS3 and CLATRS3 tests #764

Merged

dklyuchinskiy mentioned this pull request Nov 23, 2022

Fixed bug in array bounds in complex syl01 test #758

Merged

ilayn mentioned this pull request May 26, 2023

ENH: Add wrapper for LAPACK functions stgsyl and dtgsyl scipy/scipy#18404

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add level-3 BLAS triangular Sylvester equation solver #651

Add level-3 BLAS triangular Sylvester equation solver #651

angsch commented Feb 20, 2022 •

edited

Loading

codecov bot commented Aug 2, 2022 •

edited

Loading

angsch commented Sep 25, 2022

Add level-3 BLAS triangular Sylvester equation solver #651

Add level-3 BLAS triangular Sylvester equation solver #651

Conversation

angsch commented Feb 20, 2022 • edited Loading

codecov bot commented Aug 2, 2022 • edited Loading

Codecov Report

angsch commented Sep 25, 2022

angsch commented Feb 20, 2022 •

edited

Loading

codecov bot commented Aug 2, 2022 •

edited

Loading