
[WIP] Rework on kernelmatrix to work with Vectors and more complex kernels #83

Merged (26 commits) on Apr 23, 2020

Conversation

@theogf (Member) commented Apr 16, 2020

I added the notion of SimpleKernel for when Distances.jl's pairwise can be used directly, while BaseKernel just relies on kappa(k, x, y).
@willtebbutt Now I can see the point of ColVecs. Is it okay if I steal your implementation from Stheno and rework it for our purposes?

[EDIT] This PR only aims at making the distinction between SimpleKernel, where kernelmatrix is based on pairwise from Distances.jl, BaseKernel, where kernelmatrix relies on kappa(k, x, y), and Kernel, which also contains composite kernels.
ColVecs and RowVecs are added separately in #84 and #89
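A rough sketch of the hierarchy described above (hypothetical: the kappa and metric helpers and exact signatures are assumptions; the merged code may differ):

```julia
# Sketch only: the abstract hierarchy this PR introduces.
abstract type Kernel end                      # includes composite kernels
abstract type BaseKernel <: Kernel end        # kernelmatrix built from kappa(k, x, y)
abstract type SimpleKernel <: BaseKernel end  # kernelmatrix via Distances.jl's pairwise

# Generic fallback for BaseKernel: evaluate kappa on every pair of inputs.
kernelmatrix(k::BaseKernel, X::AbstractVector) = [kappa(k, x, y) for x in X, y in X]

# Optimised path for SimpleKernel: compute all distances at once, then map the
# scalar kernel function over them (metric(k) is assumed to return the kernel's metric).
function kernelmatrix(k::SimpleKernel, X::AbstractVector)
    return map(d -> kappa(k, d), pairwise(metric(k), X))
end
```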

@devmotion (Member)

I feel like this might be a good case for using traits instead of a SimpleKernel type.

@theogf (Member, Author) commented Apr 16, 2020

I really don't think traits are needed here. It would force users who want to add "simple" kernels to always define an issimple(x) for their kernels, whereas the inheritance comes in very naturally. A SimpleKernel is just a BaseKernel where we have an optimized way of computing the kernel matrix.

@devmotion (Member)

I guess users might want to use their own type structure for various reasons (e.g., kernels on special spaces that you might want to group, but where not every kernel is a simple or base kernel), and it's just not possible for them to exploit the general implementation using pairwise in KernelFunctions if this requires SimpleKernel.

It would force users who want to add "simple" kernels to always define an issimple(x) for their kernels, whereas the inheritance comes in very naturally.

You could get both by just defining

issimple(::Kernel) = false
issimple(::SimpleKernel) = true

or something similar with specific traits. So everyone who subtypes SimpleKernel would get it for free, and custom types could still hook into the pairwise computations by defining issimple.
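A sketch of how such a trait could drive dispatch (hypothetical; the merged PR kept the SimpleKernel subtype rather than a trait, and kappa/metric/pairwise are assumed package helpers):

```julia
abstract type Kernel end
abstract type SimpleKernel <: Kernel end

# Default: a kernel is not "simple"; subtypes of SimpleKernel opt in for free.
issimple(::Kernel) = false
issimple(::SimpleKernel) = true

function kernelmatrix(k::Kernel, X::AbstractVector)
    if issimple(k)
        # optimised path through Distances.jl's pairwise
        return map(d -> kappa(k, d), pairwise(metric(k), X))
    else
        # generic fallback: evaluate kappa on each pair
        return [kappa(k, x, y) for x in X, y in X]
    end
end
```

Since issimple depends only on the type, the branch is resolved at compile time, so the trait costs nothing at runtime.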

@willtebbutt (Member)

@willtebbutt Now I can see the point of ColVecs. Is it okay if I steal your implementation from Stheno and rework it for our purposes?

Please do :)

If we go ahead with this change, I'm unclear what the differences between SimpleKernel, BaseKernel, and Kernel are going to be though.

At the minute, BaseKernel roughly means "anything that's not a composite kernel" and by default uses the map-kappa-over-metric implementation for ease of implementation. What will SimpleKernel add to this, and what will it mean for a Kernel to be a BaseKernel but not a SimpleKernel?

@theogf (Member, Author) commented Apr 16, 2020

SimpleKernel uses the pairwise strategy that we have had from the beginning. The latest commit should be more explicit.

@willtebbutt (Member)

Sorry, is it critical that we do the ColVecs implementation in this PR? Would be much easier to review as a separate PR.

@willtebbutt (Member)

SimpleKernel uses the pairwise strategy that we have had from the beginning. The latest commit should be more explicit.

Yup, but then why do we have BaseKernel?

@theogf (Member, Author) commented Apr 16, 2020

Because KernelProduct etc. are not BaseKernels.

@theogf (Member, Author) commented Apr 16, 2020

Sorry, is it critical that we do the ColVecs implementation in this PR? Would be much easier to review as a separate PR.

I can put an expensive collect(eachcol(X)) with a note for now

@willtebbutt (Member)

I can put an expensive collect(eachcol(X)) with a note for now

Okay, sorry, I think I've been getting myself confused by the purpose of this PR, but I still think it would be good to introduce the changes over the course of two PRs.

  1. Change from AbstractMatrix to AbstractVector, including adding the ColVecs type.
  2. Introduce the SimpleKernel abstraction.

AFAICT, the first is going to involve some fairly large changes to the codebase, so it would be good to do it as a single PR. Would you be okay with doing that @theogf ?

@theogf (Member, Author) commented Apr 16, 2020

The second one implies that we need to be able to treat the matrices as vectors of vectors, hence the need for ColVecs. But you are right, the description of the PR is now wrong.

@willtebbutt (Member)

Yup, so if we do 1 then 2 everything should be okay?

@theogf mentioned this pull request Apr 17, 2020
@theogf (Member, Author) commented Apr 18, 2020

So as you mentioned in #84 and #43, time to tackle obsdim!
This is what I aimed at with this PR by changing kernelmatrix (I will take the case of kernelmatrix(k, X), but of course this generalizes to kernelmatrix(k, X, Y)):

  • If X is a Vector of Real, we reshape X into a matrix and call kernelmatrix on it.
  • If X is a Matrix (of Real), we need the obsdim indicator to know whether the observations are row-wise or column-wise. (I know that, for me, I always have my observations row-wise, obsdim = 1.)
    • If k is a SimpleKernel, we can call pairwise from Distances.jl and pass the obsdim argument.
    • If k is a BaseKernel, we transform the matrix into a vector of vectors via ColVecs.
  • If X is a Vector of arbitrary objects, we iteratively call kappa(k, x, y) on each pair without going through pairwise. No obsdim needed.
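The three cases above could be sketched roughly as follows (hypothetical signatures, assuming the SimpleKernel/BaseKernel types and the kappa/metric helpers from this PR; ColVecs is only added in #84):

```julia
# 1. Vector of Real: reshape into a 1×N matrix and recurse (one feature per observation).
function kernelmatrix(k::BaseKernel, X::AbstractVector{<:Real})
    return kernelmatrix(k, reshape(X, 1, :); obsdim = 2)
end

# 2. Matrix of Real: obsdim says whether observations are rows (1) or columns (2).
function kernelmatrix(k::SimpleKernel, X::AbstractMatrix{<:Real}; obsdim::Int = 2)
    return map(d -> kappa(k, d), pairwise(metric(k), X; dims = obsdim))
end
function kernelmatrix(k::BaseKernel, X::AbstractMatrix{<:Real}; obsdim::Int = 2)
    X2 = obsdim == 2 ? X : permutedims(X)  # bring observations into columns
    return kernelmatrix(k, ColVecs(X2))
end

# 3. Vector of arbitrary objects: call kappa on each pair; no obsdim needed.
kernelmatrix(k::BaseKernel, X::AbstractVector) = [kappa(k, x, y) for x in X, y in X]
```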

@willtebbutt (Member) commented Apr 18, 2020

I'm a little confused. I was expecting the following to happen:

  • obsdim is removed (almost) everywhere.
  • All input types are changed from AbstractMatrix to AbstractVector.
  • Some additional low-level functions are added so that the right thing happens when you call Distances-related functions on ColVecs and RowVecs.
  • (Maybe) helper functionality is added at the top level to let people continue to use the obsdim functionality if that's really what they prefer, transforming an input into either a ColVecs or a RowVecs depending on the choice of obsdim, which can then be forwarded on to the rest of the package.

Why aren't we doing that?

edit: I know we don't actually have a RowVecs object at the minute, but let's pretend that we do for the sake of this discussion.
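For reference, a minimal sketch of the ColVecs idea borrowed from Stheno (the real type lands in #84): wrap a matrix and present its columns as a lazy AbstractVector of observation vectors, with a low-level internal pairwise overload so Distances.jl still sees the underlying matrix. Names and details here are assumptions, not the merged implementation.

```julia
import Distances

# Lazily view the columns of X as a vector of observation vectors.
struct ColVecs{T, TX<:AbstractMatrix{T}} <: AbstractVector{Vector{T}}
    X::TX
end
Base.size(D::ColVecs) = (size(D.X, 2),)
Base.getindex(D::ColVecs, i::Int) = D.X[:, i]

# Internal pairwise (not Distances.pairwise): hand Distances the raw matrix, column-wise.
function pairwise(d::Distances.PreMetric, D::ColVecs)
    return Distances.pairwise(d, D.X; dims = 2)
end
```

A RowVecs counterpart would wrap the same matrix but index rows and call Distances with dims = 1.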

@theogf (Member, Author) commented Apr 18, 2020

All input types are changed from AbstractMatrix to AbstractVector

I don't understand the point of using a ColVecs wrapper when it's not necessary (for AbstractMatrix with SimpleKernels).

obsdim is removed (almost) everywhere

If I understand you correctly, the problem with obsdim is that we have to pass it to all the Transform objects, right? Then we could just agree that the convention for Transform is that we always have obsdim = 2 (so we remove the need for it), and when kernelmatrix is called with obsdim = 1, we just send the transpose!

@willtebbutt (Member)

If I understand you correctly, the problem with obsdim is that we have to pass it to all the Transform objects, right?

No. The problem is that you ever have to pass an obsdim-like parameter to anything anywhere in the package (excluding Distances, but that's beyond our control). It makes everything harder to read and can be abstracted away completely! But we should take this discussion back to the relevant issue since we don't seem to have agreed on what the right solution is yet.

Since I've misunderstood, could you explain to me why you need ColVecs in this PR?

@willtebbutt mentioned this pull request Apr 21, 2020
@theogf (Member, Author) commented Apr 22, 2020

For BaseKernel the only generic way to compute the kernel matrix is to evaluate the kernel directly on the objects (vectors). Therefore I needed to transform the input matrix into a vector of vectors, hence the need for ColVecs.

But as you stated in #43 we can make a uniform interface. I will just finish this PR so we can move on and get rid of obsdim in the internals.

Comment on lines 12 to 15
κ::Kernel,
X::AbstractVector{<:Real}
)
kernelmatrix!(K, κ, reshape(X, 1, :), obsdim = 2)
Member:

I thought we agreed not to do this? More in line with the discussion we had would be to not import Distances.pairwise but to define an internal method pairwise along the lines of

function pairwise(metric::PreMetric, X::AbstractMatrix; obsdim = defaultobs)
    return Distances.pairwise(metric, X; dims = obsdim)
end
function pairwise(metric::PreMetric, X::AbstractVector; obsdim = defaultobs)
    return Distances.pairwise(metric, reshape(X, 1, :); dims = 2)
end

and similarly for pairwise(metric, X, Y, ...) (and pairwise! maybe, not sure if that's useful?).
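The two-input variant mentioned above would follow the same pattern (a sketch; defaultobs stands for the package-wide default observation dimension, assumed here to be 2):

```julia
import Distances
using Distances: PreMetric

const defaultobs = 2  # assumption: KernelFunctions defines its own default

# Internal pairwise for cross-distances between two inputs.
function pairwise(metric::PreMetric, X::AbstractMatrix, Y::AbstractMatrix; obsdim = defaultobs)
    return Distances.pairwise(metric, X, Y; dims = obsdim)
end
function pairwise(metric::PreMetric, X::AbstractVector, Y::AbstractVector; obsdim = defaultobs)
    # 1D inputs: lift each to a 1×N matrix before handing off to Distances.jl
    return Distances.pairwise(metric, reshape(X, 1, :), reshape(Y, 1, :); dims = 2)
end
```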

I guess then it would be easier to remove obsdim in one of the subsequent PRs and add support for ColVecs. It seems a bit weird to add this implementation with our plan in mind.

@theogf (Member, Author):

Well, right now it's not really an issue. If you want, I can replace it with:
kernelmatrix!(K, κ, ColVecs(reshape(X, 1, :)))
This will be done anyway in the next PR.

Member:

No, as far as I understand, the idea is to only use vectors so there is no reason to reshape and wrap it anymore?

@theogf (Member, Author), Apr 22, 2020:

Oh now I understand!
But it would raise the problem that for SimpleKernel we would not use pairwise anymore. I suppose that for 1D problems it will not make a big difference, though.

Member:

Yes, we would - but the reshaping for Distances would only happen in the implementation of pairwise sketched above (since only Distances cares about it being a matrix instead of a vector).

@theogf (Member, Author), Apr 22, 2020:

🐌 (there is no slowpoke emoji)

(Review thread on src/utils.jl resolved.)
@theogf mentioned this pull request Apr 22, 2020
@theogf (Member, Author) commented Apr 22, 2020

Oh yeah I could remove the @assert in kernelmatrix now!

@willtebbutt (Member) left a review:

A bunch of picky style things. I'm happy for this to be merged once they've been addressed and tests have passed though.

@@ -15,6 +15,8 @@ end

kappa(k::ScaledKernel, x) = first(k.σ²) * kappa(k.kernel, x)

kappa(k::ScaledKernel, x, y) = first(k.σ²) * kappa(k.kernel, x, y)
Member:

Why has this been added?

@theogf (Member, Author):

Because I realised that ScaledKernel could only be applied to a SimpleKernel.

Member:

Oh, okay. And this is kappa in the kappa == kernelmatrix sense, right?

@theogf (Member, Author):

Uh no, it's in the sense of k(x, y); it multiplies by sigma on every function call. But this can get replaced when implementing #87.

(Further review threads on src/matrix/kernelmatrix.jl and src/utils.jl were resolved.)
@theogf merged commit fdd317f into JuliaGaussianProcesses:master on Apr 23, 2020
@theogf deleted the general_kernelmatrix branch on August 4, 2020
3 participants