Skip to content

Commit

Permalink
simulate! and thus parametricbootstrap for GLMM (#418)
Browse files Browse the repository at this point in the history
* simulate! for GLMM

* fix docstring, remove some debugging code

* news update and version bump

* issingular for GLMM and docstring

* docstring, dead code removeal

* re-organize tests

* small docs update

* unit scaling + comments in simulate!

* 5-arg mul! instead of BLAS (∴ more matrix el/types)

* force Binomial n to be Int

* coalesce, thanks @dmbates
  • Loading branch information
palday authored Oct 29, 2020
1 parent 52ca2ca commit 0bf61a2
Show file tree
Hide file tree
Showing 12 changed files with 335 additions and 73 deletions.
7 changes: 7 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,9 @@
MixedModels v3.1 Release Notes
========================

* `simulate!` and thus `parametricbootstrap` methods for `GeneralizedLinearMixedModel` [#418].
* Documented inconsistent behavior in `sdest` and `varest` `GeneralizedLinearMixedModel` [#418].

MixedModels v3.0.2 Release Notes
========================

Expand Down Expand Up @@ -91,3 +97,4 @@ Package dependencies
[#384]: https://github.com/JuliaStats/MixedModels.jl/issues/384
[#390]: https://github.com/JuliaStats/MixedModels.jl/issues/390
[#395]: https://github.com/JuliaStats/MixedModels.jl/issues/395
[#418]: https://github.com/JuliaStats/MixedModels.jl/issues/418
2 changes: 1 addition & 1 deletion Project.toml
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
name = "MixedModels"
uuid = "ff71e718-51f3-5ec2-a782-8ffcbfa3c316"
author = ["Phillip Alday <[email protected]>", "Douglas Bates <[email protected]>", "Jose Bayoan Santiago Calderon <[email protected]>"]
version = "3.0.2"
version = "3.1.0"

[deps]
Arrow = "69666777-d1a9-59fb-9406-91d4454c9d45"
Expand Down
10 changes: 5 additions & 5 deletions docs/src/bootstrap.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
# Parametric bootstrap for linear mixed-effects models
# Parametric bootstrap for mixed-effects models

Julia is well-suited to implementing bootstrapping and other simulation-based methods for statistical models.
The `parametricbootstrap` function in the [MixedModels package](https://github.com/JuliaStats/MixedModels.jl) provides an efficient parametric bootstrap for linear mixed-effects models.
The `parametricbootstrap` function in the [MixedModels package](https://github.com/JuliaStats/MixedModels.jl) provides an efficient parametric bootstrap for mixed-effects models.

```@docs
parametricbootstrap
Expand Down Expand Up @@ -98,7 +98,7 @@ For example, if we bootstrap a model fit to the `sleepstudy` data
sleepstudy = MixedModels.dataset(:sleepstudy)
m2 = fit(
MixedModel,
@formula(reaction ~ 1+days+(1+days|subj)),
@formula(reaction ~ 1+days+(1+days|subj)),
sleepstudy,
)
```
Expand Down Expand Up @@ -131,15 +131,15 @@ sum(ρs .≈ -1)

Furthermore there are even a few cases where the estimate of the standard deviation of the random effect for the intercept is zero.
```@example Main
σs = @where(df2, :type .== "σ", :group .== "subj", :names .== "(Intercept)").value
σs = @where(df2, :type .== "σ", :group .== "subj", :names .== "(Intercept)").value
sum(σs .≈ 0)
```

There is a general condition to check for singularity of an estimated covariance matrix or matrices in a bootstrap sample.
The parameter optimized in the estimation is `θ`, the relative covariance parameter.
Some of the elements of this parameter vector must be non-negative and, when one of these components is approximately zero, one of the covariance matrices will be singular.

The `issingular` method for a `LinearMixedModel` object that tests if a parameter vector `θ` corresponds to a boundary or singular fit.
The `issingular` method for a `MixedModel` object that tests if a parameter vector `θ` corresponds to a boundary or singular fit.

This operation is encapsulated in a method for the `issingular` function.
```@example Main
Expand Down
40 changes: 27 additions & 13 deletions src/bootstrap.jl
Original file line number Diff line number Diff line change
Expand Up @@ -32,9 +32,9 @@ struct MixedModelBootstrap{T<:AbstractFloat}
end

"""
parametricbootstrap(rng::AbstractRNG, nsamp::Integer, m::LinearMixedModel;
parametricbootstrap(rng::AbstractRNG, nsamp::Integer, m::MixedModel;
β = coef(m), σ = m.σ, θ = m.θ, use_threads=false)
parametricbootstrap(nsamp::Integer, m::LinearMixedModel;
parametricbootstrap(nsamp::Integer, m::MixedModel;
β = coef(m), σ = m.σ, θ = m.θ, use_threads=false)
Perform `nsamp` parametric bootstrap replication fits of `m`, returning a `MixedModelBootstrap`.
Expand All @@ -44,24 +44,29 @@ The default random number generator is `Random.GLOBAL_RNG`.
# Named Arguments
`β`, `σ`, and `θ` are the values of `m`'s parameters for simulating the responses.
`σ` is only valid for `LinearMixedModel` and `GeneralizedLinearMixedModel` for
families with a dispersion parameter.
`use_threads` determines whether or not to use thread-based parallelism.
Note that `use_threads=true` may not offer a performance boost and may even
Note that `use_threads=true` may not offer a performance boost and may even
decrease peformance if multithreaded linear algebra (BLAS) routines are available.
In this case, threads at the level of the linear algebra may already occupy all
In this case, threads at the level of the linear algebra may already occupy all
processors/processor cores. There are plans to provide better support in coordinating
Julia- and BLAS-level threads in the future.
Julia- and BLAS-level threads in the future.
"""
function parametricbootstrap(
rng::AbstractRNG,
n::Integer,
morig::LinearMixedModel{T};
morig::MixedModel{T};
β::AbstractVector=coef(morig),
σ=morig.σ,
θ::AbstractVector=morig.θ,
use_threads::Bool=false,
) where {T}
β, σ, θ = convert(Vector{T}, β), T(σ), convert(Vector{T}, θ)
if σ !== missing
σ = T(σ)
end
β, θ = convert(Vector{T}, β), convert(Vector{T}, θ)
βsc, θsc, p, k, m = similar(β), similar(θ), length(β), length(θ), deepcopy(morig)

β_names = (Symbol.(fixefnames(morig))..., )
Expand Down Expand Up @@ -108,7 +113,7 @@ end

function parametricbootstrap(
nsamp::Integer,
m::LinearMixedModel;
m::MixedModel;
β = m.β,
σ = m.σ,
θ = m.θ,
Expand All @@ -129,14 +134,14 @@ function allpars(bsamp::MixedModelBootstrap{T}) where {T}
nresrow = length(bstr) * npars
cols = (
sizehint!(Int[], nresrow),
sizehint!(String[], nresrow),
sizehint!(String[], nresrow),
sizehint!(Union{Missing,String}[], nresrow),
sizehint!(Union{Missing,String}[], nresrow),
sizehint!(T[], nresrow),
)
nrmdr = Vector{T}[] # normalized rows of λ
for (i, r) in enumerate(bstr)
σ = r.σ
σ = coalesce(r.σ, one(T))
for (nm, v) in pairs(r.β)
push!.(cols, (i, "β", missing, String(nm), v))
end
Expand All @@ -159,7 +164,7 @@ function allpars(bsamp::MixedModelBootstrap{T}) where {T}
end
end
end
push!.(cols, (i, "σ", "residual", missing, σ))
r.σ === missing || push!.(cols, (i, "σ", "residual", missing, r.σ))
end
(
iter=cols[1],
Expand All @@ -186,7 +191,16 @@ function Base.getproperty(bsamp::MixedModelBootstrap, s::Symbol)
end
end

issingular(bsamp::MixedModelBootstrap) = map-> any.≈ bsamp.lowerbd), bsamp.θ)
"""
issingular(bsamp::MixedModelBootstrap)
Test each bootstrap sample for singularity of the corresponding fit.
Equality comparisons are used b/c small non-negative θ values are replaced by 0 in `fit!`.
See also [`issingular(::MixedModel)`](@ref).
"""
issingular(bsamp::MixedModelBootstrap) = map-> any.== bsamp.lowerbd), bsamp.θ)

function Base.propertynames(bsamp::MixedModelBootstrap)
[:allpars, :objective, , , :se, :coefpvalues, , :σs, , :inds, :lowerbd, :bstr, :fcnames]
Expand Down Expand Up @@ -290,7 +304,7 @@ function tidyσs(bsamp::MixedModelBootstrap{T}) where {T}
)
for (iter, r) in enumerate(bstr)
setθ!(bsamp, iter) # install r.θ in λ
σ = r.σ
σ = coalesce(r.σ, one(T))
for (grp, ll) in zip(keys(fcnames), λ)
for (cn, col) in zip(getproperty(fcnames, grp), eachrow(ll))
push!(result, NamedTuple{colnms}((iter, grp, Symbol(cn), σ * norm(col))))
Expand Down
58 changes: 55 additions & 3 deletions src/generalizedlinearmixedmodel.jl
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,10 @@ StatsBase.deviance(m::GeneralizedLinearMixedModel) = deviance(m, m.optsum.nAGQ)

fixef(m::GeneralizedLinearMixedModel) = m.β

function fixef!(v::AbstractVector{T}, m::GeneralizedLinearMixedModel{T}) where T
copyto!(fill!(v, -zero(T)), m.β)
end

objective(m::GeneralizedLinearMixedModel) = deviance(m)

"""
Expand Down Expand Up @@ -406,6 +410,8 @@ function Base.getproperty(m::GeneralizedLinearMixedModel, s::Symbol)
coef(m)
elseif s == :beta
m.β
elseif s == :objective
objective(m)
elseif s (, :sigma)
sdest(m)
elseif s == :σs
Expand All @@ -423,6 +429,12 @@ function Base.getproperty(m::GeneralizedLinearMixedModel, s::Symbol)
end
end

# this copy behavior matches the implicit copy behavior
# for LinearMixedModel. So this is then different than m.θ,
# which returns a reference to the same array
getθ(m::GeneralizedLinearMixedModel) = copy(m.θ)
getθ!(v::AbstractVector{T}, m::GeneralizedLinearMixedModel{T}) where {T} = copyto!(v, m.θ)

function StatsBase.loglikelihood(m::GeneralizedLinearMixedModel{T}) where {T}
r = m.resp
D = Distribution(m.resp)
Expand Down Expand Up @@ -455,6 +467,7 @@ Base.propertynames(m::GeneralizedLinearMixedModel, private::Bool = false) = (
:X,
:y,
:lowerbd,
:objective,
:σρs,
:σs,
:corr,
Expand Down Expand Up @@ -612,7 +625,18 @@ function Base.setproperty!(m::GeneralizedLinearMixedModel, s::Symbol, y)
end
end

sdest(m::GeneralizedLinearMixedModel{T}) where {T} = convert(T, NaN)
"""
sdest(m::GeneralizedLinearMixedModel)
Return the estimate of the dispersion, i.e. the standard deviation of the per-observation noise.
For models with a dispersion parameter ϕ, this is simply ϕ. For models without a
dispersion parameter, this value is `missing`. This differs from `disperion`,
which returns `1` for models without a dispersion parameter.
For Gaussian models, this parameter is often called σ.
"""
sdest(m::GeneralizedLinearMixedModel{T}) where {T} = dispersion_parameter(m) ? dispersion(m, true) : missing

function Base.show(io::IO, m::GeneralizedLinearMixedModel)
if m.optsum.feval < 0
Expand Down Expand Up @@ -644,6 +668,23 @@ function Base.show(io::IO, m::GeneralizedLinearMixedModel)
show(io, coeftable(m))
end

function stderror!(v::AbstractVector{T}, m::GeneralizedLinearMixedModel{T}) where {T}
# initialize to appropriate NaN for rank-deficient case
fill!(v, zero(T) / zero(T))

# the inverse permutation is done here.
# if this is changed to access the permuted
# model matrix directly, then don't forget to add
# in the inverse permutation
vcovmat = vcov(m)

for idx in 1:size(vcovmat,1)
v[idx] = sqrt(vcovmat[idx,idx])
end

v
end

"""
updateη!(m::GeneralizedLinearMixedModel)
Expand All @@ -662,9 +703,20 @@ function updateη!(m::GeneralizedLinearMixedModel)
m
end

varest(m::GeneralizedLinearMixedModel{T}) where {T} = one(T)
"""
varest(m::GeneralizedLinearMixedModel)
Returns the estimate of ϕ², the variance of the conditional distribution of Y given B.
For models with a dispersion parameter ϕ, this is simply ϕ². For models without a
dispersion parameter, this value is `missing`. This differs from `disperion`,
which returns `1` for models without a dispersion parameter.
For Gaussian models, this parameter is often called σ².
"""
varest(m::GeneralizedLinearMixedModel{T}) where {T} = dispersion_parameter(m) ? dispersion(m, true) : missing

# delegate GLMM method to LMM field
# delegate GLMM method to LMM field
for f in (
:feL,
:fetrm,
Expand Down
11 changes: 1 addition & 10 deletions src/linearmixedmodel.jl
Original file line number Diff line number Diff line change
Expand Up @@ -377,7 +377,7 @@ end
StatsBase.fitted(m::LinearMixedModel{T}) where {T} = fitted!(Vector{T}(undef, nobs(m)), m)

"""
fixef!(v::Vector{T}, m::LinearMixedModel{T})
fixef!(v::Vector{T}, m::MixedModel{T})
Overwrite `v` with the pivoted fixed-effects coefficients of model `m`
Expand Down Expand Up @@ -495,15 +495,6 @@ function Base.getproperty(m::LinearMixedModel{T}, s::Symbol) where {T}
end
end

"""
issingular(m::LinearMixedModel, θ=m.θ)
Test whether the model `m` is singular if the parameter vector is `θ`.
Equality comparisons are used b/c small non-negative θ values are replaced by 0 in `fit!`.
"""
issingular(m::LinearMixedModel, θ=m.θ) = any(lowerbd(m) .== θ)

function StatsBase.leverage(m::LinearMixedModel{T}) where {T}
# This can be done more efficiently but reusing existing tools is easier.
# The i'th leverage value is obtained by replacing the response with the i'th
Expand Down
9 changes: 9 additions & 0 deletions src/mixedmodel.jl
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,15 @@ Return a vector of condition numbers of the λ matrices for the random-effects t
"""
LinearAlgebra.cond(m::MixedModel) = cond.(m.λ)

"""
issingular(m::MixedModel, θ=m.θ)
Test whether the model `m` is singular if the parameter vector is `θ`.
Equality comparisons are used b/c small non-negative θ values are replaced by 0 in `fit!`.
"""
issingular(m::MixedModel, θ=m.θ) = any(lowerbd(m) .== θ)

function retbl(mat, trm)
merge(
NamedTuple{(fname(trm),)}((trm.levels,)),
Expand Down
Loading

2 comments on commit 0bf61a2

@palday
Copy link
Member Author

@palday palday commented on 0bf61a2 Oct 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JuliaRegistrator register()

@JuliaRegistrator
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Registration pull request created: JuliaRegistries/General/23853

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v3.1.0 -m "<description of version>" 0bf61a234ca6a76fe55261f92793f5f7bffaca30
git push origin v3.1.0

Please sign in to comment.