
Support extracting factors from cholfact(::SparseMatrixCSC) #10402

Merged
2 commits merged into master from teh/cholmod
Jun 6, 2015

Conversation

@timholy commented Mar 5, 2015

The behavior this is intended to support is:

# Suppose A is sparse
F = cholfact(A)
L = F[:L]
x = L\b

While I was at it, I noticed a couple of other issues that I thought should be fixed. The most important is that with this pull request, sparse(cholfact(A)) now returns something equivalent to the original matrix, just like full(cholfact(A)) does for dense matrices. This is a breaking change, but I think it is warranted for consistency: my impression is that the factorization object is supposed to be a representation of the original matrix, not of one of its factors.
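For illustration, the changed round-trip behavior amounts to this (a usage sketch of the behavior described above, not new API):

```julia
# Sketch of the intended round-trip after this change:
F = cholfact(A)   # A sparse, symmetric positive definite
B = sparse(F)     # B is now equivalent to A (up to roundoff),
                  # mirroring full(cholfact(A)) in the dense case
```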

CC @andreasnoack

@ViralBShah ViralBShah added the sparse Sparse arrays label Mar 5, 2015
sparse(L::Factor) = sparse(Sparse(L))
function sparse(L::Factor)
A = Sparse(L)
A*A'
Member:

This would have to check if it is an LDLt or an LLt version of Factor. I'm sympathetic to changing the behavior of sparse(Factor), but then we'd have to consider the permutations. What is returned from this is still not the same as the input because of them.
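A hypothetical sketch of what a permutation-aware reconstruction would have to do (all helper names here are assumed for illustration, not the actual CHOLMOD wrapper API):

```julia
# Hypothetical sketch, not the PR's actual code: reconstructing the original
# matrix from a CHOLMOD factor must distinguish LLt from LDLt factorizations
# and must undo the fill-reducing permutation.
function sparse_from_factor(F)
    p = get_perm(F)                 # assumed accessor: permutation vector
    if is_ldlt(F)                   # assumed predicate: LDLt vs LLt factor
        L, D = get_L(F), get_D(F)   # assumed accessors for the factors
        A = L * D * L'
    else
        L = get_L(F)
        A = L * L'
    end
    invp = invperm(p)               # F factors A[p, p], so scatter back
    return A[invp, invp]
end
```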

Author:

Ah, thank you, I missed all of those subtleties. If I understand correctly, you're fine with changing it as long as I get the logic right?

Member:

Yes. That is correct.

@timholy commented Mar 7, 2015

OK, this is a much more comprehensive version. CHOLMOD supports the notion of "combined factors" like LD in an LDLT factorization. I took this another step further and supported pivoting in a similar manner, since pivoting is on by default for these factorizations. If you don't like that part and prefer to have users handle pivoting manually, I can yank it out---but see below for additional motivation for why this might be good to support.

This will need some documentation, but I'll write that once we've settled on the functionality.

In implementing this, I became a bit bothered by the asymmetries between dense and sparse (e.g., pivoting is on by default for sparse, but not for dense). From an algorithmic standpoint there are some excellent reasons for those choices, but it does mean that code receiving an AbstractMatrix might need to do this:

function foo(A::AbstractMatrix, v)
    F = cholfact(A)
    L = F[:L]
    p = issparse(A) ? F[:p] : (1:size(A,1))
    w = L\v[p]
    ...

or you'll get errors in your results. (If you ignore the fact that L is computed with pivoting in the sparse case, you get answers that are simply wrong.) This was less of a problem before pesky folks like me 😄 wanted to extract the :L factors, but it was already an issue with sparse(F) (whose behavior is changing in this PR).

I see a few approaches:

  • EDITED: Make pivoting off by default for sparse matrices. This will hurt performance, but if users have to enable it, they are also taking responsibility for making sure their code gives the right answer.
  • Encourage the use of L = F[:PtL] (and support this notation in both types of dense Cholesky factorization).
  • At least provide a uniform interface for the pivot=Val{true} case, so that users who worry about this can always turn pivoting on.

The first seems like the most robust solution, followed by the second.

@tkelman commented Mar 8, 2015

This will hurt performance

That's a serious understatement: disabling pivoting can be disastrous for performance in the sparse case. Turning it off by default appears to be what Matlab does (which is not necessarily a good thing), and it would be a bit more consistent, but I don't like it; it's not a very satisfactory way of dealing with this. I see your third item as mandatory, and the second as a good idea, regardless of whether or not we decide to do the first.

@timholy commented Mar 8, 2015

Yes, it would be a big performance hit. Moreover, in cases where the main use is

F = cholfact(A)
x = F\b

then permutation is not a source of error because it's all handled internally.

So, I reluctantly agree that we can't disable pivoting by default for the sparse case. Should we conceivably turn it on for the dense case, just so the two are consistent? The problem is that would surely break any existing code that uses F[:U] or F[:L].

Maybe the best bet is to live with the inconsistency, but do items 2 and 3 above, and be very clear about this in the documentation.

Curiously, this is an advantage of L = chol(A, :L): you know what use L will be put to next. That's not true for an abstract factorization object.

@timholy commented Mar 8, 2015

Out of curiosity, does anyone have a sense for the overhead on something like

PtL = Pt*L
PtL\b

where Pt is a permutation and L is lower triangular? If the linalg routines correctly detect that PtL has a simpler representation as Pt*L (automatically deducing the correct Pt), then they won't have to perform a "naive" LU factorization of PtL (which, if it happened, would be far too expensive to tolerate). But I worry we can't count on that kind of smarts, or those smarts being fast enough.
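For what it's worth, the product never needs to be formed: because Pt is a permutation, (Pt*L)\b reduces to a row permutation of b followed by one triangular solve. A sketch, assuming p is the permutation vector such that P*x == x[p] (so Pt == P'):

```julia
# Solving (P'*L)*x = b without materializing P'*L:
#   P'*L*x = b  implies  L*x = P*b = b[p]  implies  x = L \ b[p]
x = L \ b[p]
# By contrast, forming P'*L explicitly and calling \ on the result would, in
# general, dispatch to a full LU factorization: the "naive" and far more
# expensive path worried about above.
```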

@@ -576,7 +617,7 @@ for Ti in IndexTypes
d
end

function spsolve{Tv<:VTypes}(sys::Integer, F::Factor{Tv,$Ti}, B::Sparse{Tv,$Ti})
function solve{Tv<:VTypes}(sys::Integer, F::Factor{Tv,$Ti}, B::Sparse{Tv,$Ti})
Member:

The idea is to match the names in CHOLMOD for the wrappers; then CHOLMOD's documentation can serve as the documentation for the wrappers as well. If we begin to change that, we'd have to document all the wrapper functions ourselves, which I think would be a waste of time.

Author:

I agree there's merit in that, but it would have meant duplicating

julia/base/sparse/cholmod.jl

Lines 1192 to 1224 in 137b19e

function (\){T,Ti}(L::FactorComponent{T,Ti,:L}, B::Union(Dense,Sparse))
solve(CHOLMOD_L, Factor(L), B)
end
function (\){T,Ti}(L::FactorComponent{T,Ti,:U}, B::Union(Dense,Sparse))
solve(CHOLMOD_Lt, Factor(L), B)
end
# Solve PLx = b and L'P'x=b where A = P*L*L'*P'
function (\){T,Ti}(L::FactorComponent{T,Ti,:PtL}, B::Union(Dense,Sparse))
F = Factor(L)
solve(CHOLMOD_L, F, solve(CHOLMOD_P, F, B)) # Confusingly, CHOLMOD_P solves P'x = b
end
function (\){T,Ti}(L::FactorComponent{T,Ti,:UP}, B::Union(Dense,Sparse))
F = Factor(L)
solve(CHOLMOD_Pt, F, solve(CHOLMOD_Lt, F, B))
end
# Solve various equations for A = L*D*L' and A = P*L*D*L'*P'
function (\){T,Ti}(L::FactorComponent{T,Ti,:D}, B::Union(Dense,Sparse))
solve(CHOLMOD_D, Factor(L), B)
end
function (\){T,Ti}(L::FactorComponent{T,Ti,:LD}, B::Union(Dense,Sparse))
solve(CHOLMOD_LD, Factor(L), B)
end
function (\){T,Ti}(L::FactorComponent{T,Ti,:DU}, B::Union(Dense,Sparse))
solve(CHOLMOD_DLt, Factor(L), B)
end
function (\){T,Ti}(L::FactorComponent{T,Ti,:PtLD}, B::Union(Dense,Sparse))
F = Factor(L)
solve(CHOLMOD_LD, F, solve(CHOLMOD_P, F, B))
end
function (\){T,Ti}(L::FactorComponent{T,Ti,:DUP}, B::Union(Dense,Sparse))
F = Factor(L)
solve(CHOLMOD_Pt, F, solve(CHOLMOD_DLt, F, B))
end
However, I can do that if you prefer.

Member:

It would be great. If you loop through the types and names it is almost the same number of lines of code.
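The loop suggested here might look something like the following untested sketch, written in the 0.4-era syntax used in the code above (the table of component/system pairings is an assumption for illustration; the permuted components like :PtL need nested solves and would be generated separately):

```julia
# Untested sketch: generate the simple single-solve wrapper methods in a loop
# instead of writing each one out by hand. Each entry pairs a
# factor-component symbol with the CHOLMOD system code that solves with it.
for (comp, sys) in ((:L, :CHOLMOD_L), (:U, :CHOLMOD_Lt), (:D, :CHOLMOD_D),
                    (:LD, :CHOLMOD_LD), (:DU, :CHOLMOD_DLt))
    @eval function (\){T,Ti}(L::FactorComponent{T,Ti,$(QuoteNode(comp))},
                             B::Union(Dense,Sparse))
        solve($sys, Factor(L), B)
    end
end
```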

@andreasnoack:

I agree with @tkelman that the default has to be pivoting for sparse matrices. I'm pretty sure the main use of sparse Cholesky is to solve Ax=b, so the default should be to do that as well as possible. I also think that the default has to be without pivoting for dense matrices, and I think this difference between dense and sparse is okay. As I wrote above, I still think we can support generic programming by introducing a permutation type.

@timholy commented Mar 8, 2015

OK, I've restored spsolve.

If you're fine with this, then I can add some documentation. However, if you're still thinking about the "bundling" of factors in :PtL, we should come to some conclusion.

@andreasnoack:

It's a tradeoff between the convenience of a special type and the challenge of communicating how the type works and when it should be used. For cholfact (and the other factorizations) I think the story is relatively straightforward, i.e. a factorized representation of the original matrix. It is also convenient that several factorizations share the same kind of abstraction. I fear that the :PtL story is more complicated, and that it would be easier to reason about the pieces separately. But it is a judgement call; I don't know if my judgement is right here.

@timholy commented Mar 8, 2015

Might be best to let this simmer a bit then, and see how thoughts evolve over time---I agree that the correct path is not obvious.

Just to summarize for the benefit of anyone else who wants to chime in:

  • F = somefactorization(A) can often compute the factorization in one of two ways: with pivoting and without. This is relevant for chol, qr, and lu, and perhaps others.
  • Factorization objects are nice for bundling together all the information for any kind of factorization, regardless of whether pivoting is used. In particular, F\b is a convenient, accurate, and performant way to solve A\b for multiple right hand sides.
  • The challenges come in extracting individual factors: if pivoting is not used, then F[:L] "tells the whole story" for something like Cholesky factorization, and L, U = F[:L], F[:U] tells the whole story for LU factorization. However, if pivoting is used, these are incomplete: you also need to extract one or more permutation "matrices." For example, with sparse LU, you have A = P'*L*U*Q', and so inversion must be written (if you're doing it by hand) as Q'\(U\(L\(P'\b))).
  • Users who rely on just the main factors (e.g., L = cholfact(A)[:L]) will get wrong answers if pivoting was used in computing the factorization and they fail to realize that. This is a problem that Matlab's [L, rank, perm] = chol(A, 'lower') neatly avoids, because you can disable pivoting if only one output is used but enable it if three outputs are used. With the intermediary of a "factorization object," we can't tell whether the user is aware that pivoting might be important.
  • Pivoting by default is desirable for sparse matrices. It's less obvious this is true for dense factorizations. One risk is that users may get "trained" by their experience with dense matrices into thinking they don't need to worry about the permutation matrices, which may cause problems when they start tackling sparse problems.
  • One potential solution is to encourage the use of "bundled" factors. In the case of Cholesky factorization, PtL = cholfact(A)[:PtL] would represent P'*L; for LU factorization we could have F = lufact(A); PtL, UQt = F[:PtL], F[:UQt]. The idea is that we would represent the "main" factors in combination with their associated permutation matrices (if relevant). An advantage is that this represents a uniform interface that works whether or not pivoting is used.
  • There are possible downsides to "bundled" factors: potentially more types to support, and the approach is somewhat nonconventional by comparison to Matlab and other systems. Moreover, converting a "bundled" type to its raw dense representation is likely to lead to a performance gap relative to its representation as a PermutedFactor (or whatever it would be called).

Is that a fair summary? If not, feel free to edit this post directly.
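The relationships in the summary above can be written out concretely. A sketch for the pivoted sparse Cholesky case, using the F[:L]/F[:p] accessors discussed in this PR and assuming p is the permutation vector with (P*x)[i] == x[p[i]], so that A = P'*L*L'*P, i.e. A[p,p] = L*L':

```julia
# Solving A*x = b "by hand" from a pivoted factorization A[p,p] = L*L'.
# P'*L*L'*P*x = b  implies  L*L'*(P*x) = P*b = b[p], so:
F = cholfact(A)        # A sparse, symmetric positive definite
L = F[:L]
p = F[:p]
y = L  \ b[p]          # forward solve against the permuted right-hand side
z = L' \ y             # back solve
x = z[invperm(p)]      # undo the permutation: x ≈ A \ b
# F\b performs this whole sequence internally, which is why it stays correct
# regardless of pivoting, while a bare L = F[:L] used alone does not.
```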

@tkelman commented Mar 8, 2015

two ways

A few more than that. It's not a binary yes/no of whether you use pivoting; there's also the question of which kind of pivoting. Partial? Complete? Rook? Rank-revealing? Bunch-Parlett? Bunch-Kaufman (of which there are several variants: A, B, C, D)? Symmetric 1-by-1? Block-restricted versions of these? Etc. Edit: these would not necessarily need different APIs for the outputs, but a Boolean input isn't quite rich enough to ask for different algorithm variants in the function call.

Otherwise your summary is very fair. It's a messy, complicated problem space that is challenging to design generalizable APIs for. To slightly reduce the name salad we currently have with cholfact, lufact, ldltfact, qrfact, etc., I've thrown around the idea of maybe dispatching factorize on a Val{:Cholesky} input. I have not yet written any code to implement that idea, though.

@timholy commented Mar 8, 2015

CC @alanedelman to see if he has any ideas (see two posts up).

@andreasnoack:

I've thrown around the idea of maybe dispatching factorize on a Val{:Cholesky}

I still like the idea, but Val is maybe not the right solution, because having a Cholesky type and at the same time using the symbol :Cholesky might be confusing. It would probably be clearer to use the Cholesky type itself, but that reveals a problem. Right now the return type of cholfact can be Cholesky, CholeskyPivoted, or Factor. With a factorize(Matrix, Factorization) method, it would make sense to specify the exact type to return (except for the element type), e.g. factorize(A, Cholesky).

This brings us back to this issue, because I'm wondering if we should explore the possibility of wrapping Factor in a modified CholeskyPivoted or Cholesky. Right now the pivoting is hidden inside the sparse factorization, so that type information is lost.

@timholy commented Mar 9, 2015

I kind of like the idea of wrapping it in something a little more informative. An LDLt type might also be worth having.

However, the instantiation of the types in the dense and sparse cases is quite different, so I suspect these would have to become abstract/parametric types.

@ViralBShah ViralBShah added this to the 0.4.0 milestone Mar 25, 2015
@timholy commented Apr 27, 2015

Noted the reference on the mailing list. I'm not exactly sure where we left this. I guess I still favor the notion of "bundling" factors. @andreasnoack, what are your thoughts? I'm not in a massive rush; the project that needed this is currently on the back burner due to other priorities, but I hope to get back to it within a month or so, so by then it would be good to have a vision.

@andreasnoack:

I'm on board with the compound factors.

@timholy commented Apr 28, 2015

OK, I'll try to get back to this once I finish #10911.

@ViralBShah:

Bump. Would be nice to merge this.

Also supports extracting factors from ldltfact.
A wide variety of "unconventional factors," like :PtL, are supported
because of the importance of pivoting in sparse factorizations.

This also:
- changes the behavior of sparse(cholfact(A)) to return the original matrix,
  just like full(cholfact(A)) for dense matrices  *breaking*
- Doesn't export transpose(::Sparse, ::Integer) (that seems like an internal interface)
- Adds support for ctranspose
@timholy commented Jun 5, 2015

Rebased and documentation added.

timholy added a commit that referenced this pull request Jun 6, 2015
Support extracting factors from cholfact(::SparseMatrixCSC)
@timholy timholy merged commit dff768a into master Jun 6, 2015
@timholy timholy deleted the teh/cholmod branch June 6, 2015 12:56
@tkelman commented Jun 6, 2015

Good stuff.
