Difficulty fitting weights to glm model #61

Open
kkwi5241 opened this issue May 2, 2024 · 3 comments

kkwi5241 commented May 2, 2024

Hi there,
Thank you for making this great package. I am an R novice keen on using IPTW to weight a logistic regression analysis, and I have made progress much more quickly than I thought I would with WeightIt.

I have had some difficulty fitting my weights into a glm. It works with just two confounding variables, but when I add a third I get the error below, which I have not been able to solve:

Warning: (from glm()) simpleWarning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Error in Xtreat %*% Btreat : non-conformable arguments

weights <- weightit(binary_treatment ~ factor_confounder1 + factor_confounder2 + numeric_confounder,
                    data = dataframe, estimand = "ATT", method = "glm")

weightit_model <- glm_weightit(binary_outcome ~ binary_treatment * (factor_confounder1 + factor_confounder2 + numeric_confounder),
                               data = dataframe, family = "binomial", weightit = weights)

Thanks for any help you can give.
BW,
Charlie

ngreifer (Owner) commented May 2, 2024

That's a problem!

The first warning means that you have near-perfect separation in your data, i.e., a variable or combination thereof perfectly predicts treatment. You should use a different method, such as bias-reduced logistic regression (link = "br.logit") or energy balancing (method = "energy"), which are more robust to lack of overlap. What you are seeing is a problem with logistic regression as implemented in glm() and not WeightIt (that's why the warning says "from glm()").
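A minimal base R sketch of what this warning looks like, using made-up toy data rather than the dataset discussed here (the names `toy`, `treat`, and `x` are illustrative assumptions). It fits a logistic regression where one covariate perfectly predicts treatment, collects the warnings from glm(), and cross-tabulates the data to show the cause:

```r
# Toy data (hypothetical): x > 5 perfectly predicts treatment
toy <- data.frame(
  treat = c(0, 0, 0, 0, 1, 1, 1, 1),
  x     = c(1, 2, 3, 4, 6, 7, 8, 9)
)

# Fit the treatment model, collecting warnings instead of printing them
msgs <- character(0)
fit <- withCallingHandlers(
  glm(treat ~ x, data = toy, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(msgs)

# The cross-tabulation shows the cause: the off-diagonal cells are empty
sep_table <- table(treat = toy$treat, x_above_5 = toy$x > 5)
print(sep_table)
```

With real data, the same kind of cross-tabulation of treatment against each factor covariate is a quick way to look for empty cells before fitting.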

The second error, which is more problematic, is a bug in WeightIt, and I would love to have access to your dataset so I can properly diagnose it. If it's sensitive data, you can rename all the columns, recode the variables (e.g., multiplying the numeric variables by a constant and recoding the factor levels to have meaningless labels), and just include the variables used in the analysis (i.e., not the entire dataset). I was planning on submitting an update to WeightIt today so I want to make sure I squash this bug ASAP!
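The anonymization steps described above could be sketched in base R roughly as follows. The data frame `dat` and all of its column names and values are hypothetical stand-ins, and the scaling constant is arbitrary:

```r
# Hypothetical stand-in for a sensitive dataset
dat <- data.frame(
  patient_id = 1:6,
  treated    = c(0, 1, 0, 1, 0, 1),
  site       = factor(c("north", "south", "north", "east", "south", "east")),
  age        = c(54, 61, 47, 70, 66, 58),
  died       = c(0, 0, 1, 0, 0, 1)
)

# 1. Keep only the variables used in the analysis
share <- dat[, c("treated", "site", "age", "died")]

# 2. Replace factor labels with meaningless ones
levels(share$site) <- paste0("L", seq_along(levels(share$site)))

# 3. Multiply numeric covariates by an arbitrary constant
share$age <- share$age * 3.7

# 4. Rename the columns
names(share) <- c("A", "X1", "X2", "Y")

str(share)
```

Rescaling a numeric covariate changes its coefficient but not the fitted values of a logistic regression, so a problem that appears with the original data should generally still appear with the disguised copy.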

Thanks, and sorry for the confusion it might have caused.

kkwi5241 (Author) commented May 2, 2024

Thanks very much for getting back to me so quickly. The warning does make sense given the data I am looking at: I am a UK surgical academic working with a relatively small real patient dataset. Would it be possible to liaise over email? I am contactable at [email protected]
BW,
Charlie

ngreifer (Owner) commented May 2, 2024

Hi Charlie,

Thank you so much for your assistance over email. Here is the problem.

There is nothing wrong with the weighting model, and I was incorrect in assuming the problem was due to perfect separation of the treatment by the covariates. That said, you have fundamental imbalance in your covariates that cannot be rectified using weighting. Your groups are too small and there is not enough overlap between them to make a valid inference on the ATT. One option is to change to a different estimand, like the ATO.
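To illustrate why poor overlap hurts the ATT, here is a rough base R sketch on simulated data (not the dataset from this thread). It computes standard propensity score weights by hand, assuming the usual formulas: ATT weights of 1 for treated units and ps / (1 - ps) for controls, and ATO (overlap) weights of 1 - ps for treated units and ps for controls. All object names are illustrative, and this is not how WeightIt computes weights internally:

```r
set.seed(123)
n <- 200
x <- rnorm(n)
# Strong dependence of treatment on x gives poor overlap
treat <- rbinom(n, 1, plogis(-1.5 + 2 * x))

# Estimated propensity scores from a logistic regression
ps <- fitted(glm(treat ~ x, family = binomial))

# ATT weights: 1 for treated, odds of treatment for controls
w_att <- ifelse(treat == 1, 1, ps / (1 - ps))
# ATO (overlap) weights: 1 - ps for treated, ps for controls
w_ato <- ifelse(treat == 1, 1 - ps, ps)

# Effective sample size: (sum of weights)^2 / sum of squared weights
ess <- function(w) sum(w)^2 / sum(w^2)

ess_control <- c(
  unweighted = sum(treat == 0),
  ATT = ess(w_att[treat == 0]),
  ATO = ess(w_ato[treat == 0])
)
print(round(ess_control, 1))
```

A weighted effective sample size far below the raw group size is a sign that a few units carry most of the weight, which is especially costly when the groups are small to begin with.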

The issue was with your outcome model. There was indeed a bug in WeightIt, which I fixed. However, even with the bug fixed, it is not possible to fit your outcome model. That is because in the treated group, there is only a single event. Your sample is too small to make any valid inference on, especially after weighting. In an upcoming version of WeightIt, I have produced a cleaner error message that provides advice on how to diagnose the error. In this case, the error is subtle and can only be solved by collecting more data. These simply are not small-sample methods. I would recommend consulting with a medical statistician on how to extract some useful information from your sample. It might be that your analysis must be purely descriptive, as the sample is not big enough to appropriately adjust for confounding. I think a simpler, regression-based analysis specifically using methods designed for small samples (e.g., Firth logistic regression) could be effective. But I would advise you to stay away from propensity score methods with this sample.
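A quick check for this situation before fitting an outcome model is to count events within each treatment group. The sketch below uses made-up toy data, and the threshold `min_events` is an arbitrary illustrative value, not a formal rule:

```r
# Toy data (hypothetical): only one event among the treated
toy <- data.frame(
  treat   = rep(c(0, 1), times = c(12, 8)),
  outcome = c(rep(c(0, 1), times = c(6, 6)),   # controls: 6 events
              rep(c(0, 1), times = c(7, 1)))   # treated: 1 event
)

# Events (outcome == 1) by treatment group
counts <- table(treat = toy$treat, outcome = toy$outcome)
print(counts)

events_by_group <- counts[, "1"]
min_events <- 5  # arbitrary illustrative threshold, not a formal rule
if (any(events_by_group < min_events)) {
  message("Too few events in at least one group for a model with ",
          "treatment-by-covariate interactions.")
}
```

An outcome model that interacts treatment with every covariate, like the one in the original post, in effect estimates a separate set of covariate coefficients within each treatment group, so a group with a single event leaves almost nothing to estimate them from.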

Noah
