Feature request: Support non-integer dependent variable for Poisson regression #494

Closed
yangkeunyun opened this issue Aug 5, 2022 · 0 comments

Hi,

I know that the Poisson distribution is technically defined only for non-negative integers. In many cases, though, treating a non-integer dependent variable as Poisson is a good approximation. Other statistical programs (e.g. R, Stata) seem to accept such input. I hope a similar approach is possible in Julia. Below is an MWE (including a comparison with R code), followed by its output and the stacktrace:

using DataFrames, GLM, Random, Distributions
using RCall

# Generate random dataset with Float64 type dependent variable
df = DataFrame(y  = rand(Uniform(0,10), 10),
               x1 = randn(10),
               x2 = randn(10))

# Poisson Regression in R
println("This is Poisson regression in R")
R"print(summary(glm(y ~ x1 + x2, family=poisson, data=$df)))"

# Poisson Regression in Julia
println("\n This is Poisson regression in Julia")
fit(GeneralizedLinearModel, @formula(y ~ x1 + x2), df, Poisson())     
This is Poisson regression in R

Call:
glm(formula = y ~ x1 + x2, family = poisson, data = `#JL`$df)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.5851  -1.3313   0.2815   1.0060   1.5319

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)   1.8127     0.1315  13.782   <2e-16 ***
x1            0.1248     0.1370   0.911    0.363
x2           -0.1083     0.1269  -0.853    0.394
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 22.002  on 9  degrees of freedom
Residual deviance: 20.690  on 7  degrees of freedom
AIC: Inf

Number of Fisher Scoring iterations: 5

┌ Warning: RCall.jl: Warning in dpois(y, mu, log = TRUE) : non-integer x = 9.620932
│ Warning in dpois(y, mu, log = TRUE) : non-integer x = 7.250401
│ Warning in dpois(y, mu, log = TRUE) : non-integer x = 1.245370
│ Warning in dpois(y, mu, log = TRUE) : non-integer x = 8.879959
│ Warning in dpois(y, mu, log = TRUE) : non-integer x = 4.531989
│ Warning in dpois(y, mu, log = TRUE) : non-integer x = 9.078459
│ Warning in dpois(y, mu, log = TRUE) : non-integer x = 9.979947
│ Warning in dpois(y, mu, log = TRUE) : non-integer x = 1.019928
│ Warning in dpois(y, mu, log = TRUE) : non-integer x = 7.847729
│ Warning in dpois(y, mu, log = TRUE) : non-integer x = 2.645772
└ @ RCall C:\Users\yangk\.julia\packages\RCall\6kphM\src\io.jl:172

 This is Poisson regression in Julia
ERROR: ArgumentError: y must be in the support of D
Stacktrace:
 [1] checky
   @ C:\Users\yangk\.julia\packages\GLM\P0Ris\src\glmfit.jl:642 [inlined]
 [2] GLM.GlmResp(y::Vector{Float64}, d::Poisson{Float64}, l::LogLink, η::Vector{Float64}, μ::Vector{Float64}, off::Vector{Float64}, wts::Vector{Float64})
   @ GLM C:\Users\yangk\.julia\packages\GLM\P0Ris\src\glmfit.jl:36
 [3] GLM.GlmResp(y::Vector{Float64}, d::Poisson{Float64}, l::LogLink, off::Vector{Float64}, wts::Vector{Float64})
   @ GLM C:\Users\yangk\.julia\packages\GLM\P0Ris\src\glmfit.jl:61
 [4] fit(::Type{GeneralizedLinearModel}, X::Matrix{Float64}, y::Vector{Float64}, d::Poisson{Float64}, l::LogLink; dofit::Bool, wts::Vector{Float64}, offset::Vector{Float64}, fitargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ GLM C:\Users\yangk\.julia\packages\GLM\P0Ris\src\glmfit.jl:488
 [5] fit (repeats 2 times)
   @ C:\Users\yangk\.julia\packages\GLM\P0Ris\src\glmfit.jl:484 [inlined]
 [6] fit(::Type{GeneralizedLinearModel}, f::FormulaTerm{Term, Tuple{Term, Term}}, data::DataFrame, args::Poisson{Float64}; contrasts::Dict{Symbol, Any}, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ StatsModels C:\Users\yangk\.julia\packages\StatsModels\qY2YS\src\statsmodel.jl:88
 [7] fit(::Type{GeneralizedLinearModel}, f::FormulaTerm{Term, Tuple{Term, Term}}, data::DataFrame, args::Poisson{Float64})
   @ StatsModels C:\Users\yangk\.julia\packages\StatsModels\qY2YS\src\statsmodel.jl:82
 [8] top-level scope
   @ c:\julialang\Question_poisson_glm.txt:15
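
To illustrate why the coefficient estimates are still well defined for a non-integer response: with a log link, the Poisson score equations are X'(y - μ) = 0 with μ = exp(Xβ). They involve y only linearly, so they can be solved for any non-negative y. Only the log(y!) term of the full log-likelihood needs integers, which is consistent with R reporting coefficients but `AIC: Inf` above. The following is a minimal sketch of that idea using plain iteratively reweighted least squares. It is not how GLM.jl is implemented; `poisson_irls` is a hypothetical helper written for this example, and the small data set is made up.

```julia
using LinearAlgebra

# Hypothetical helper (not part of GLM.jl): log-link Poisson quasi-likelihood
# fit by iteratively reweighted least squares. y may be any non-negative reals.
function poisson_irls(X::AbstractMatrix, y::AbstractVector;
                      maxiter::Int = 50, tol::Real = 1e-10)
    all(>=(0), y) || throw(ArgumentError("y must be non-negative"))
    β = zeros(size(X, 2))
    μ = y .+ 0.1                  # strictly positive starting means
    η = log.(μ)                   # linear predictor under the log link
    for _ in 1:maxiter
        z = η .+ (y .- μ) ./ μ    # working response
        w = μ                     # working weights for the log link
        βnew = (X' * (w .* X)) \ (X' * (w .* z))
        η = X * βnew
        μ = exp.(η)
        converged = norm(βnew - β) <= tol * (norm(β) + tol)
        β = βnew
        converged && break
    end
    return β
end

# Made-up example: intercept plus one covariate, non-integer response
X = [ones(5) [0.0, 1.0, 2.0, 3.0, 4.0]]
y = [1.2, 1.9, 3.4, 5.1, 9.3]

β = poisson_irls(X, y)
score = X' * (y .- exp.(X * β))   # should be numerically close to zero
println("coefficients: ", β)
println("score at the estimate: ", score)
```

The sketch only returns point estimates. Standard errors, deviance, and anything based on the full likelihood (such as AIC) would need separate treatment, since the Poisson probability mass function is not defined at non-integer values.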