Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

more methods for Hessenberg factorizations #31738

Closed
4 of 6 tasks
stevengj opened this issue Apr 16, 2019 · 11 comments
Closed
4 of 6 tasks

more methods for Hessenberg factorizations #31738

stevengj opened this issue Apr 16, 2019 · 11 comments
Labels
linear algebra Linear algebra

Comments

@stevengj
Copy link
Member

stevengj commented Apr 16, 2019

I just noticed that the LinearAlgebra package only supports a very small number of specialized functions for Hessenberg factorization objects (#7965), which limits their practical utility. Some things that would be nice to have:

This came up on discourse, where someone wanted to solve (A - λ*I) \ b problems repeatedly for many different λ — a nice way to do this is via a Hessenberg factorization of A = QHQ' since A - λ*I = Q(H-λ*I)Q' and subsequent solves are O(n²). You could also use Schur for this sort of thing, but computing the Schur factorization is a lot more expensive (≈7×).

@stevengj stevengj added the linear algebra Linear algebra label Apr 16, 2019
@mfalt
Copy link
Contributor

mfalt commented Apr 23, 2019

I could try to implement these features. AFAIK H\b is usually solved as in
https://github.com/JuliaLinearAlgebra/GenericLinearAlgebra.jl/blob/master/src/eigenGeneral.jl#L28
using some simple iterations like

"""Assuming Hessenberg H"""
function ldiv!(H, rhs)
    n = size(H,1)
    for i = 1:n-1
        Gi, r = givens(H, i, i+1, i)
        lmul!(Gi, H)  # Note: slow indexing
        lmul!(Gi, rhs)
    end
    # H is now upper triangular
    ldiv!(UpperTriangular(H), rhs)
    return rhs
end

However, the operation lmul!(Gi, H) modifies two rows of H, which can be quite slow compared to working on the columns, there are ways to get around this but they are quite ugly, for example:

"""Assuming Hessenberg H"""
function ldiv!(H::AbstractMatrix{T}, rhs) where {T}
    n = size(H,1)
    Ht = H'
    rotations = LinearAlgebra.Givens{T}[] # Can be replaced with LinearAlgebra.Rotation
    for i = n:-1:2
        Gi, r = givens(Ht, i, i-1, i)
        push!(rotations, Gi')
        rmul!(H, Gi')
    end
    # H is now upper triangular
    ldiv!(UpperTriangular(H), rhs)
    # Should be possible to do lmul!(rotation::Rotation, rhs) here
    for i in n-1:-1:1
        lmul!(rotations[i], rhs)
    end
    return rhs
end

where it can be cleaned up a bit if
https://github.com/JuliaLang/julia/blob/master/stdlib/LinearAlgebra/src/givens.jl#L368
is relaxed to accept vectors (Rotation*Vector currently throws error).
This second version is a bit faster, for the cost of readability, it uses the fact that

Hx=b <==>
H(G1*G2*...Gn)*(Gn'*Gn-1'*...*G1')x=b <==>
(Gn'*Gn-1'*...*G1')x = (H*(G1*G2*...Gn))\b
x = (Gn'*Gn-1'*...*G1')*(H*(G1*G2*...Gn))\b

Where the G1,G2,...Gn that are needed to make H*(G1*G2*...Gn) upper triangular are calculated as the transpose of givens on the transpose of H, starting with the last column, instead of first row.
Is it worthwhile using this approach or is there a good standard way of making the original version efficient from a memory-layout perspective?

Edit: the original discussion is found here https://discourse.julialang.org/t/how-to-optimize-simple-matrix-equation-solving/23035/16

@stevengj
Copy link
Member Author

stevengj commented Apr 23, 2019

I'm not sure that Givens QR is state-of-the art for H \ b here. In particular, this paper says it is possible to combine the triangularization with the backsolve into a single pass over the array (without modifying H or allocating any temporary matrix).

Also, since it is extremely common to use Hessenberg factorizations with a shift H - λ*I, it might be nice to build this into the Hessenberg type so that it can be done without allocating a new matrix:

  • add a λ field into the Hessenberg type which defaults to zero
  • define ±(H::Hessenberg, D::UniformScaling) = Hessenberg(H, H.λ ± D.λ)
  • incorporate the shift λ into the algorithm for ldiv! etcetera.

@kshyatt kshyatt added the good first issue Indicates a good issue for first-time contributors to Julia label Apr 25, 2019
@stevengj stevengj removed the good first issue Indicates a good issue for first-time contributors to Julia label Apr 25, 2019
@stevengj
Copy link
Member Author

@kshyatt, I removed the "good first issue" tag because implementing these algorithms well is somewhat nontrivial…

@stevengj
Copy link
Member Author

stevengj commented Apr 26, 2019

For reference, here is a first implementation of the Henry algorithm; it seems to be about 10x faster than the routine by @mfalt above (and allocates much less memory) for a 1000x1000 matrix:

# solve (H - µI)x =b, with output written in-place in b
function ldiv_H!(F::Hessenberg, b::AbstractVector, μ=0)
    n = size(F.factors, 1)
    n != length(b) && throw(DimensionMismatch("wrong right-hand-side length != $n"))
    H = F.factors
    u = H[:,n] # temporary vector
    u[n] -= μ
    x = b # not a copy, just rename to match paper
    cs = Vector{Tuple{eltype(u),eltype(u)}}(undef, length(u)) # store Givens rotations
    @inbounds for k = n:-1:2
        Φ, ρ = givens(u[k], H[k,k-1], 1,2)
        cs[k] = c, s = Φ.c, Φ.s
        x[k] /= ρ
        t₁ = s * x[k]; t₂ = c * x[k]
        @simd for j = 1:k-2
            x[j] -= u[j]*t₂ + H[j,k-1]*t₁
            u[j] = H[j,k-1]*c - u[j]*s'
        end
        x[k-1] -= u[k-1]*t₂ + (H[k-1,k-1] - μ) * t₁
        u[k-1] = (H[k-1,k-1] - μ) * c - u[k-1]*s'
    end
    τ₁ = x[1] / u[1]
    @inbounds for k = 2:n
        τ₂ = x[k]
        c, s = cs[k]
        x[k-1] = c*τ₁ + s*τ₂
        τ₁ = c*τ₂ - s'τ₁
    end
    x[n] = τ₁
    return x
end

@olof3
Copy link
Contributor

olof3 commented Dec 11, 2019

Thanks, nice PR! I've been looking at how to use the functionality for computing frequency responses of linear systems. It seems to work nicely, so these are just a few minor things.

Q \ B is defined for HessenbergQ but it does not exploit the structure of Q and seems to be surprisingly slow, the equivalent Q' * B is fast as expected. Being able to use \ would perhaps make the code more natural and avoid the need to remember that Q is unitary.

\ is currently not defined for UpperHessenberg

Calling rdiv!(B, H; shift=s) with s=Inf does not return. Perhaps it would make sense to return a zero vector if A and B are finite? Or just throw an error?

The hessenbergQ function does not seem to be type stable?

using Test
F = hessenberg(randn(4,4))
@inferred getproperty(F, :Q)

gives
return type LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false} does not match inferred return type Any error(::String) at error.jl:33 top-level scope at experiment_freqresp.jl:104

@andreasnoack
Copy link
Member

It looks like our reflection functions don't agree on this

julia> @inferred LinearAlgebra.HessenbergQ(F)
ERROR: return type LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false} does not match inferred return type LinearAlgebra.HessenbergQ{_A,Array{Float64,2},Array{Float64,1},false} where _A
Stacktrace:
 [1] error(::String) at ./error.jl:33
 [2] top-level scope at REPL[21]:1

julia> @code_warntype LinearAlgebra.HessenbergQ(F)
Variables
  #self#::Type{LinearAlgebra.HessenbergQ}
  F::Hessenberg{Float64,UpperHessenberg{Float64,Array{Float64,2}},Array{Float64,2},Array{Float64,1},Bool}

Body::LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false}
1%1 = Base.getproperty(F, :factors)::Array{Float64,2}%2 = LinearAlgebra.eltype(%1)::Core.Compiler.Const(Float64, false)
│   %3 = $(Expr(:static_parameter, 1))::Core.Compiler.Const(Array{Float64,2}, false)
│   %4 = $(Expr(:static_parameter, 2))::Core.Compiler.Const(Array{Float64,1}, false)
│   %5 = Core.apply_type(LinearAlgebra.HessenbergQ, %2, %3, %4, false)::Core.Compiler.Const(LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false}, false)
│   %6 = Base.getproperty(F, :uplo)::Char%7 = Base.getproperty(F, :factors)::Array{Float64,2}%8 = Base.getproperty(F, )::Array{Float64,1}%9 = (%5)(%6, %7, %8)::LinearAlgebra.HessenbergQ{Float64,Array{Float64,2},Array{Float64,1},false}
└──      return %9

@olof3
Copy link
Contributor

olof3 commented Dec 11, 2019

Ah, my bad, it is this thing.

@andreasnoack
Copy link
Member

No. It's not #3021. I think it might just be an issue with @inferred.

@stevengj
Copy link
Member Author

@olof3, PRs to fix any of those issues would be welcome. In any case, feel free to file a new issue.

@olof3
Copy link
Contributor

olof3 commented Dec 11, 2019

@andreasnoack I think it is better if you submit an issue if you think that it is warranted, I know too little about how it is supposed to work.

@stevengj I'll see what I can do.

@stevengj
Copy link
Member Author

Closing this issue since at this point we should file more specific issues for remaining functionality.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

No branches or pull requests

5 participants