anova doesn't flag different sample sizes and instead returns a p-value #158

Open
ngalanter opened this issue Nov 16, 2024 · 1 comment

The anova function for regress models does not raise an error when the full and reduced models are fit to different sample sizes, which can happen when a variable that appears in the full model but not in the reduced model has missing values. Instead, it returns a (very low) p-value. This differs from the behavior of the anova function and lmtest::lrtest for glm models, both of which report an error.

Here is an example:

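library(rigr)
library(lmtest)

set.seed(236)

# Four-level factor with the first 10 values set to missing
x1 <- as.factor(rep(1:4, 25))
x1[1:10] <- NA

x2 <- rnorm(100)

# Binary outcome that depends on x2 only
y <- rbinom(n = 100, size = 1, prob = 1 / (1 + exp(0.5 * x2)))

dat <- data.frame(x1 = x1, x2 = x2, y = y)

# regress models: the full model uses fewer rows because x1 has missing values
mod_full_regress <- regress("odds", y ~ x1 + x2, data = dat)
mod_reduced_regress <- regress("odds", y ~ x2, data = dat)

mod_full_regress

# Both calls return a p-value instead of an error
anova(mod_reduced_regress, mod_full_regress, test = "Wald")
anova(mod_reduced_regress, mod_full_regress, test = "LRT")

# Same models fit with glm
mod_full_glm <- glm(y ~ x1 + x2, data = dat, family = binomial)
mod_reduced_glm <- glm(y ~ x2, data = dat, family = binomial)

# Both calls report an error because the sample sizes differ
anova(mod_reduced_glm, mod_full_glm, test = "LRT")

lrtest(mod_reduced_glm, mod_full_glm)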
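
As a rough illustration of the kind of check being asked for, here is a minimal sketch using base R only. The helper check_same_nobs is hypothetical (it is not part of rigr, lmtest, or base R), and it is shown on glm fits because whether nobs() works on regress objects is an assumption that has not been verified here. The second half shows one possible workaround: fitting both models on the rows that are complete for every variable used, so the comparison is made on the same sample.

```r
# Hypothetical helper (not part of rigr or base R): stop if two fitted
# models were estimated on different numbers of observations.
check_same_nobs <- function(reduced, full) {
  n_reduced <- stats::nobs(reduced)
  n_full <- stats::nobs(full)
  if (n_reduced != n_full) {
    stop("models were fit to different sample sizes: ",
         n_reduced, " vs ", n_full, call. = FALSE)
  }
  invisible(TRUE)
}

# Same simulated data as in the example above
set.seed(236)
x1 <- as.factor(rep(1:4, 25))
x1[1:10] <- NA
x2 <- rnorm(100)
y <- rbinom(n = 100, size = 1, prob = 1 / (1 + exp(0.5 * x2)))
dat <- data.frame(x1 = x1, x2 = x2, y = y)

mod_full_glm <- glm(y ~ x1 + x2, data = dat, family = binomial)
mod_reduced_glm <- glm(y ~ x2, data = dat, family = binomial)

# The full model drops the 10 rows with missing x1, so the check fails;
# the error message is captured here rather than halting the script.
mismatch <- tryCatch(check_same_nobs(mod_reduced_glm, mod_full_glm),
                     error = function(e) conditionMessage(e))

# Workaround: restrict both fits to rows complete in every variable used
dat_cc <- dat[complete.cases(dat[, c("x1", "x2", "y")]), ]
mod_full_cc <- glm(y ~ x1 + x2, data = dat_cc, family = binomial)
mod_reduced_cc <- glm(y ~ x2, data = dat_cc, family = binomial)

check_same_nobs(mod_reduced_cc, mod_full_cc)
anova(mod_reduced_cc, mod_full_cc, test = "LRT")
```

A check along these lines could in principle run at the start of the anova method for regress models, but where it belongs in rigr is left open here.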

adw96 commented Nov 16, 2024

Thanks, @ngalanter. If you have a pull request that addresses this issue, I can review it, but I won't be able to address it myself for some time. Thanks for understanding.
