Offset for Poisson models #453

Closed
vals opened this issue Dec 18, 2020 · 6 comments · Fixed by #482
Labels
bug docs Documentation Issues enhancement

Comments

vals commented Dec 18, 2020

Hi,

I might be missing something, but there doesn't seem to be any way to provide per-observation constant offsets for the GeneralizedLinearMixedModel?

The fit function supports an argument offset, but it doesn't appear to do anything. (Fitting with and without setting this gives the same result).

In the GLM.jl package the fit function for GeneralizedLinearModel has an offset argument that works.

How can I add an offset to a Poisson GLMM in MixedModels.jl?

Best,
/Valentine
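
For reference, a per-observation offset in a Poisson GLMM with a log link can be written as below. The symbols are notation chosen here, not taken from the thread: $o_i$ is the offset, $t_i$ a known exposure (for example a total count), $x_i$ and $z_i$ the fixed- and random-effects covariates, $\beta$ the fixed effects, and $b$ the random effects. The offset enters the linear predictor with its coefficient fixed at 1 rather than estimated.

```latex
\log \mathbb{E}[y_i \mid b] = o_i + x_i^\top \beta + z_i^\top b,
\qquad o_i = \log t_i
\quad\Longleftrightarrow\quad
\mathbb{E}[y_i \mid b] = t_i \exp\!\left(x_i^\top \beta + z_i^\top b\right)
```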

palday (Member) commented Dec 18, 2020

Offsets are currently not implemented. Somewhat confusingly, some of the constructors take an offset argument, but nothing is done with it, and the documentation for the GeneralizedLinearMixedModel and LinearMixedModel types doesn't mention it.

Can you point me to where you saw offsets in the documentation? I can't find it ... and we don't provide a docstring for fit with MixedModel beyond the one inherited from StatsBase. Can you point me to where you ran across this (so that I can also make sure to fix the docs when I implement offset)?

vals (Author) commented Dec 18, 2020

Thank you!

I actually didn't see it in the documentation (and I was confused by this). I tried based on the GLM.jl convention and it was accepted, but as you mention, it doesn't do anything.

Are there plans to implement an offset? I haven't compared with GLMM packages that do support offsets yet. But when I include the variable I would add as an offset I get coefficients slightly different from 1.0 depending on the dataset, usually between 0.8 and 1.2. I am worried this biases the estimates for the fixed effects I am interested in.

Best,
/Valentine

palday (Member) commented Dec 18, 2020

It's on my todo list now -- I think this is a relatively straightforward change once I figure out / remember where to make it ....

Offsets are interesting creatures. If you know for sure that the coefficient should be exactly one, then you definitely need an offset. If you aren't sure, then allowing the coefficient to vary freely may be a good idea. That comment applies equally well to GLMs and GLMMs, and without a lot more knowledge about your research question and model (and potentially a large amount of domain knowledge), I really don't know what would be best for your data.

Allowing it to vary freely will tend to impact the other coefficients. You can see this if you look at vcov, which gives you the variance-covariance matrix of the fixed effects. If the variable you want to be an offset covaries / is strongly correlated with other terms, then fixing it will definitely have an impact!
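
The difference between fixing the coefficient and letting it vary can be sketched numerically. The following is a minimal, self-contained illustration in plain Python, not MixedModels.jl or GLM.jl code. It fits a fixed-effects-only Poisson model by Newton's method, once with log-exposure as an offset and once with log-exposure as an ordinary covariate. The simulated data, the helper names `rpois`, `solve`, and `fit_poisson`, and the true values 0.5 and 0.3 are all assumptions made for the illustration.

```python
import math
import random


def rpois(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's method; fine for moderate lam)."""
    limit, k, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return k
        k += 1


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_poisson(X, y, offset, iters=50, tol=1e-10):
    """Newton's method for a log-link Poisson GLM; returns (beta, vcov)."""
    p = len(X[0])
    beta = [0.0] * p
    # Column 0 is assumed to be the intercept: start it at the log overall rate.
    beta[0] = math.log(sum(y) / sum(math.exp(o) for o in offset))
    for _ in range(iters):
        # Linear predictor: the offset is added with a coefficient fixed at 1.
        mu = [math.exp(o + sum(xj * bj for xj, bj in zip(row, beta)))
              for row, o in zip(X, offset)]
        grad = [sum(row[j] * (yi - mi) for row, yi, mi in zip(X, y, mu))
                for j in range(p)]
        hess = [[sum(row[j] * row[k] * mi for row, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        step = solve(hess, grad)
        if max(abs(s) for s in step) < tol:
            break
        beta = [bj + sj for bj, sj in zip(beta, step)]
    # Inverse Fisher information as the variance-covariance matrix.
    vcov = [solve(hess, [1.0 if i == j else 0.0 for i in range(p)])
            for j in range(p)]
    return beta, vcov


rng = random.Random(453)
n = 2000
x = [rng.gauss(0, 1) for _ in range(n)]
t = [rng.uniform(1, 10) for _ in range(n)]  # hypothetical exposure
log_t = [math.log(ti) for ti in t]
# Simulated counts with a true coefficient of exactly 1 on log-exposure.
y = [rpois(ti * math.exp(0.5 + 0.3 * xi), rng) for xi, ti in zip(x, t)]

# Fit 1: log-exposure as an offset (coefficient fixed at 1).
beta_off, vcov_off = fit_poisson([[1.0, xi] for xi in x], y, log_t)
# Fit 2: log-exposure as a free covariate, no offset.
beta_free, vcov_free = fit_poisson(
    [[1.0, xi, li] for xi, li in zip(x, log_t)], y, [0.0] * n)

print("offset fit (intercept, x):       ", beta_off)
print("free fit (intercept, x, log t):  ", beta_free)
print("cov(log t coef, other coefs):    ", vcov_free[2][:2])
```

In the free fit, the last row of `vcov_free` shows how strongly the log-exposure coefficient covaries with the intercept and with `x`. The larger those entries are relative to the variances, the more the other estimates can shift when the coefficient is fixed at 1 instead of estimated.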

vals (Author) commented Dec 21, 2020

Thanks Phillip!

In my context the constant 1.0 coefficient for the offset is an assumption when we make the experimental design. Briefly, we count molecules from a gene relative to molecules from other genes. Using control experiments it's been found that when there are no experimental conditions the 1.0 coefficient is appropriate.

Thanks for the idea of looking at the vcov! The offset covaries less with the other fixed effects than the fixed effects covary with each other. But the covariance is still non-zero.

Regards,
/Valentine

dmbates (Collaborator) commented Mar 14, 2021

@vals I think that after #482 is merged offsets should be handled correctly. Please see the last testset in test/pirls.jl for the syntax. Note that the data table name must be specified to extract a column as an offset.

One possible source of confusion is that the value of the deviance for Poisson GLMMs differs considerably from that reported for lme4 fits. I think the value here is correct and the value from lme4 is wrong.

vals (Author) commented Mar 16, 2021

Thank you! Adding the data table name should be perfectly fine for any application I can think of.
