OneWayANOVATest seems to differ with different group sizes #242

Open
andevellicus opened this issue Aug 2, 2021 · 1 comment

I'm a bit of a stats noob, so I'm not sure whether this is expected behavior, but GLM's ftest and HypothesisTests' OneWayANOVATest give different ANOVA results when the groups have different sizes (see below).

MWE:

using GLM, StatsKit
# This set of data results in the same F-score and p-value
#=
g1 = [51, 87, 50, 48, 79, 61, 53, 54]
g2 = [82, 91, 92, 80, 52, 79, 73, 74]
g3 = [79, 84, 74, 98, 63, 83, 85, 58]
g4 = [85, 80, 65, 71, 67, 51, 63, 93]
=#

# This set of data results in a different F-score and p-value
g1 = [51, 87, 50, 48, 79, 61, 53]
g2 = [82, 91, 92, 80, 52, 79, 73, 74]
g3 = [79, 84, 74, 98, 63, 83, 85, 58]
g4 = [85, 80, 65, 71, 67, 51]

groups = [g1, g2, g3, g4]
println(OneWayANOVATest(groups...))

df1 = DataFrame(method = 0, scores=g1)
df2 = DataFrame(method = 1, scores=g2)
df3 = DataFrame(method = 2, scores=g3)
df4 = DataFrame(method = 3, scores=g4)

df = vcat(df1, df2, df3, df4)
df.method = categorical(df.method)

nullModel = lm(@formula(scores ~ 1), df)
methodModel = lm(@formula(scores ~ method), df)

println(nullModel)
println(methodModel)
ftest(nullModel.model, methodModel.model)

wildart commented Aug 4, 2021

It's a bug. The overall sample mean is computed as the mean of the group means, which is correct only if all samples are the same size.

Z̄ᵢ = mean.(scores)
Z̄ = mean(Z̄ᵢ)

Fix:

Z̄ = sum(Iterators.flatten(groups))/sum(Nᵢ)
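
To see why this matters, here is a minimal sketch that uses only the Statistics standard library and the unequal-size groups from the report. It compares the unweighted mean of group means with the pooled mean, then plugs each into a hand-written F statistic. The helper `f_statistic` and the variable names are hypothetical, chosen for illustration; this is not the HypothesisTests implementation.

```julia
using Statistics

# Unequal-size groups from the report above.
g1 = [51, 87, 50, 48, 79, 61, 53]
g2 = [82, 91, 92, 80, 52, 79, 73, 74]
g3 = [79, 84, 74, 98, 63, 83, 85, 58]
g4 = [85, 80, 65, 71, 67, 51]
groups = [g1, g2, g3, g4]

sizes = length.(groups)       # group sizes (Nᵢ in the comment above)
group_means = mean.(groups)   # per-group means

# Unweighted mean of group means: only right when all sizes are equal.
mean_of_means = mean(group_means)
# Pooled mean over all observations, as in the proposed fix.
pooled_mean = sum(Iterators.flatten(groups)) / sum(sizes)

# Hypothetical helper: one-way ANOVA F statistic for a given overall mean.
function f_statistic(groups, center)
    sizes = length.(groups)
    group_means = mean.(groups)
    k, n = length(groups), sum(sizes)
    ss_between = sum(sizes .* (group_means .- center) .^ 2)
    ss_within = sum(sum((g .- m) .^ 2) for (g, m) in zip(groups, group_means))
    return (ss_between / (k - 1)) / (ss_within / (n - k))
end

f_buggy = f_statistic(groups, mean_of_means)
f_pooled = f_statistic(groups, pooled_mean)
println((mean_of_means, pooled_mean, f_buggy, f_pooled))
```

The size-weighted between-group sum of squares is smallest when centered on the pooled mean, so centering on the mean of means inflates the F statistic whenever the group sizes differ. With equal group sizes the two means coincide, which is consistent with the first data set giving matching results.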

tbenst added a commit to tbenst/HypothesisTests.jl that referenced this issue Sep 10, 2021
wildart added a commit to wildart/HypothesisTests.jl that referenced this issue May 28, 2022