ARD Implementation and Automatic Differentation compatibility #82

theogf · 2019-01-11T17:39:59Z

Hello,

I made relatively heavy changes to your package with the objective to be able to have Automatic Relevance Determination and to be compatible with Automatic Differentation packages such as ForwardDiff.jl.

I changed the way distances were computed by directly applying the scale factor (appropriately) to the vectors. It might make the algorithm slower, I have not made benchmarks so far.
However it is still retrocompatible and using a scalar scaling work with everything.

I did not add tests yet for the ARD case, and I had to remove some cause of the needed transformation of the parameters (in the "Testing constructors" part) cause I did not know if the best option is to keep a copy of the original parameters or ignore this issue. Let me know if you would like other modifications.

Cheers,

theogf · 2019-01-11T17:48:02Z

Ah I just realised some features are not working... I will correct them and make a new PR
Tests for ARD are indeed important ^^

coveralls · 2019-01-11T17:52:54Z

Coverage decreased (-10.05%) to 84.106% when pulling 88c5a2a on theogf:master into 85cc602 on trthatcher:master.

coveralls · 2019-01-11T17:52:55Z

Coverage decreased (-10.05%) to 84.106% when pulling 88c5a2a on theogf:master into 85cc602 on trthatcher:master.

coveralls · 2019-01-11T17:52:56Z

Coverage decreased (-10.05%) to 84.106% when pulling 88c5a2a on theogf:master into 85cc602 on trthatcher:master.

coveralls · 2019-01-11T17:52:56Z

Coverage decreased (-10.05%) to 84.106% when pulling 88c5a2a on theogf:master into 85cc602 on trthatcher:master.

trthatcher · 2019-01-12T22:18:31Z

Hi @theogf!

I want to start by saying thanks for taking an interest in my project! I appreciate your enthusiasm to contribute; I see you’ve put a lot of effort into this pull request. I do want to support ARD & make AD possible.

I had hoped to release a version 1.0 to correspond to Julia 1.0 with the linear/periodic kernels fully deprecated and #65 closed before looking into adding ARD as a feature (I’m working on some documentation).

I’ve created issue #83 for ARD where I’d like to discuss the approach further. I’ve put a proposal in there which I think may end up a little cleaner.

trthatcher · 2019-01-12T22:24:48Z

src/kernelfunctions/mercer/exponential.jl

-        @check_args(ExponentialKernel, α, α > zero(T), "α > 0")
-        return new{T}(α)
+struct ExponentialKernel{T<:Real,A} <: AbstractExponentialKernel{T}
+    α::A


Is the A parameter really needed or could Union{AbstractVector{T},T} work instead? I found that having multiple parameters made things far more complex.

The A parameter has a nice property to become either a scalar or vector, and the second big advantage is that it will allow compatibility with CuArrays and other more complex arrays if other users want to use the package. But indeed one could restrict A to A<:Union{AbstractVector{T},T}

trthatcher · 2019-01-12T22:26:59Z

src/kernelfunctions/mercer/matern.jl

    ν::T
-    ρ::T
-    function MaternKernel{T}(ν::Real=T(1), ρ::Real=T(1)) where {T<:AbstractFloat}
+    α::A


I would prefer to keep the variable names consistent with the literature (but allow for vectors) - this was also one of the reasons for my proposal.

I also think keeping consistency is important! But adding a vector just for the scaling could be costly. The proposal of weights make sense and actually go in the direction of the way Distances.jl work. Is there actually a reason why you don't base your base_matrix functions on it?

trthatcher · 2019-01-12T22:27:32Z

src/kernelfunctions/mercer/exponential.jl

@@ -119,7 +122,7 @@ The ``\gamma``-exponential kernel is an isotropic Mercer kernel given by the for
    κ(x,y) = exp(α‖x-y‖²ᵞ)   α > 0, γ ∈ (0,1]
 ```
 where `α` is a scaling parameter and `γ` is a shape parameter of the Euclidean distance.
-When `γ = 1` use [`SquaredExponentialKernel`](@ref) and [`SquaredExponentialKernel`](@ref)
+When `γ = 1` use [`SquaredExponentialKernel`](@ref) and [`ExponentialKernel`](@ref)


Good catch!

trthatcher · 2019-01-12T22:40:04Z

Also added #84.

Part of the strength of this package is that it usually uses BLAS, so I have to see what has to change to support that.

theogf · 2019-05-24T14:38:26Z

Hello!

I reworked a lot of things to make things compatible and realized that I was getting pretty far away from the original package (notably by relying on Distances.jl a lot).
With @willtebbutt we decided to start a new Kernel package from scratch again, although we definitely draw a large inspiration from your package.
I wanted to make sure that this is ok with you or if you had any remarks.

trthatcher · 2019-05-24T15:48:52Z

@willtebbutt @theogf So funny story there - I was going to swap out the back-end for Distances.jl because it is strictly better that what I have here. I didn't use it originally because they only supported rows-as-observations (or was it columns?) at the time. Anyway, that's no longer the case.

Is this the new package? https://github.com/theogf/KernelFunctions.jl

I've been wanting to participate more in the community rather than just working solo on my own package(s). Any chance I can help with your package's development? I'd like to continue as a contributor.

I can also deprecate and link to yours.

theogf · 2019-05-24T16:07:27Z

That's amazing! Of course we would be more than happy to have you on the team! I will add you as a collaborator.
Are you on the Julia Slack ? We have our group discussion there.

theogf added 3 commits January 11, 2019 15:10

Moving to ARD, generalized all parameters to vector

9779490

Created constructors for each kernel, started adapting tests

065e123

Validation of all the tests

88c5a2a

Fixed constructor for ARD, but convert has still some issues

0ce0df5

trthatcher mentioned this pull request Jan 12, 2019

Implement Automatic Relevance Determination #83

Open

trthatcher reviewed Jan 12, 2019

View reviewed changes

Test for AD

ad07d19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ARD Implementation and Automatic Differentation compatibility #82

ARD Implementation and Automatic Differentation compatibility #82

theogf commented Jan 11, 2019

theogf commented Jan 11, 2019

coveralls commented Jan 11, 2019

coveralls commented Jan 11, 2019

coveralls commented Jan 11, 2019

coveralls commented Jan 11, 2019

trthatcher commented Jan 12, 2019

trthatcher Jan 12, 2019

theogf Jan 17, 2019

trthatcher Jan 12, 2019

theogf Jan 17, 2019

trthatcher Jan 12, 2019

trthatcher commented Jan 12, 2019

theogf commented May 24, 2019

trthatcher commented May 24, 2019

theogf commented May 24, 2019

ARD Implementation and Automatic Differentation compatibility #82

Are you sure you want to change the base?

ARD Implementation and Automatic Differentation compatibility #82

Conversation

theogf commented Jan 11, 2019

theogf commented Jan 11, 2019

coveralls commented Jan 11, 2019

coveralls commented Jan 11, 2019

coveralls commented Jan 11, 2019

coveralls commented Jan 11, 2019

trthatcher commented Jan 12, 2019

trthatcher Jan 12, 2019

Choose a reason for hiding this comment

theogf Jan 17, 2019

Choose a reason for hiding this comment

trthatcher Jan 12, 2019

Choose a reason for hiding this comment

theogf Jan 17, 2019

Choose a reason for hiding this comment

trthatcher Jan 12, 2019

Choose a reason for hiding this comment

trthatcher commented Jan 12, 2019

theogf commented May 24, 2019

trthatcher commented May 24, 2019

theogf commented May 24, 2019