F statistic and r squared for lm_weightit() and glm_weightit()? #59

Open
jfordal opened this issue Apr 23, 2024 · 4 comments
jfordal commented Apr 23, 2024

I am a grad student using WeightIt for the first time, and I noticed that the summary output for the lm_weightit() and glm_weightit() functions does not include the F statistic, R-squared, or adjusted R-squared. I am using g-computation and reporting only those results (i.e., not interpreting the coefficients of the outcome model itself); however, I would still be interested to see these statistics for the models after weighting.

Thank you in advance, and also for developing such a great package.

@ngreifer (Owner)
I don't provide this info because I don't think it has any valid statistical use. I'm glad you're not reporting it, but I would need to be convinced of its utility at all to be included in WeightIt. I will tell you how to get that info, though, since it isn't reported.

To get the omnibus F-statistic, you can run a joint test on all the coefficients other than the intercept. The F-statistic tests whether all coefficients in the model are equal to 0. marginaleffects::hypotheses() can do this, as does car::lht(). You can also do this manually using the coefficients and asymptotic covariance matrix. A reason I don't think you should do this is that this involves testing whether covariate coefficients differ from 0, but that is not a relevant test (i.e., the covariate coefficients don't have any meaningful interpretation and their inclusion in the outcome model should not be based on significance).
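A minimal sketch of the joint test described above, using the manual route via the coefficients and asymptotic covariance matrix (the formula, `dat`, and `w.out` are placeholder names, not from the thread; this yields a Wald chi-square statistic, which is the asymptotic analogue of the omnibus F):

```r
library(WeightIt)

# Hypothetical outcome model fit after weighting; substitute your own
# formula, data, and weightit object.
fit <- lm_weightit(y ~ a + x1 + x2, data = dat, weightit = w.out)

# Joint Wald test that all non-intercept coefficients equal 0
b <- coef(fit)[-1]                  # drop the intercept
V <- vcov(fit)[-1, -1]              # matching covariance block
W <- drop(t(b) %*% solve(V) %*% b)  # Wald chi-square statistic
p <- pchisq(W, df = length(b), lower.tail = FALSE)

# Equivalent package-based routes mentioned above:
# marginaleffects::hypotheses(fit, joint = TRUE)
# car::lht(fit, names(coef(fit))[-1])
```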

To get the R2, just re-fit the model using lm() with weights = w.out$weights. None of the tests or standard errors should be interpreted, but the model R2 is the same as would be computed by lm_weightit(), since the latter calls lm() under the hood anyway.
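A sketch of that re-fit (again with placeholder names `dat` and `w.out`); per the caveat above, only the R-squared values from this model are usable, not its standard errors or tests:

```r
# Re-fit with base lm() using the estimated weights; lm_weightit()
# calls lm() under the hood, so the R2 values match.
refit <- lm(y ~ a + x1 + x2, data = dat, weights = w.out$weights)

summary(refit)$r.squared      # model R2
summary(refit)$adj.r.squared  # adjusted R2
```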

Thank you for the kind words!


jfordal commented Apr 30, 2024

Thank you for the detailed response! I am interested specifically in the adjusted R-squared for model fit. I ran two versions of the weighted model: one with interactions between my treatment/grouping variable and my covariates (as suggested in the WeightIt vignette), and one without interactions. The adjusted R-squared was actually slightly lower for my no-interaction model, which I am guessing is because my covariates are well balanced post-weighting (none of my interactions were significant in the model).

My study design is a pre-post with non-randomized (opt in) treatment/control groups, so I weighted using multiple demographic covariates that were previously unbalanced across the two groups, leaving out the pretest score. I added the pretest only into the final outcome model, per the recommendation from this book: https://us.sagepub.com/en-us/nam/using-propensity-scores-in-quasi-experimental-designs/book237237 "It is common practice that if pretest measures of the outcome are available, they are used as separate control variables. They may not be used as predictors of propensity scores…in addition, if the error terms of the pre-and post measures are correlated, then including the pretest as an estimator of the propensity score will bias the estimated causal effect" (p. 33). With the interaction model, my resulting ATT is slightly reduced but very close to the non-interaction version.

@ngreifer (Owner)

You should not compare models using adjusted R2 to decide how to model the outcome; doing so invalidates your statistical inference. Just use the fully interacted model. How well the covariates are balanced has nothing to do with the significance of covariates in the outcome model or their interaction with the treatment. The absence of interactions with the treatment suggests that the treatment effect does not vary across levels of the covariate. But it is critical to remember that the outcome model is completely uninterpretable; its sole purpose is to increase the precision of the effect estimate. If you want to test for the presence of effect modification by the covariates, you need to design a study and analysis specifically aimed at testing that.

I read the section in that textbook and I have to say it is complete nonsense. You should absolutely include the pre-test measure of the outcome both in the propensity score model and the outcome model. It is clearly one of the most important confounders to adjust for, and you should never include a variable in the outcome model that is not adjusted for in the propensity score model. The author does not cite any sources in making that claim and it reveals a fundamental misunderstanding of what covariate adjustment is and how it works.


jfordal commented Apr 30, 2024

I am only interested in the treatment effect for the purposes of this analysis, not effect modification by the covariates, so thank you for that advice on just using the fully interacted model.

I was also confused by the statement in my textbook, and I did a good bit of searching to see whether others had addressed how to handle the pretest score in this scenario. Most examples I found either propensity weighted/matched on other covariates with no pretest score available, or skipped propensity weighting/matching entirely and adjusted for the pretest using ANCOVA/regression. The guidance in the textbook seemed pretty definitive, so your feedback is helpful and validates my earlier concerns.
