Update quickstart notebook with corrected typos and WLS #522

Open: prteek wants to merge 2 commits into master

@prteek prteek commented Jun 20, 2024

The quickstart guide in the documentation has been updated. New imports, execution code, and a guide for WLS have been added to better illustrate library usage.


prteek (Author) commented Jun 20, 2024

I have created this PR for your review, @s3alfisc. I also mean to add a plot at the end comparing the predictions and prediction intervals of OLS and WLS, and wanted to check whether these can be generated using the library or will need to be computed manually.

codecov bot commented Jun 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

see 28 files with indirect coverage changes

s3alfisc (Member) commented

Thank you @prteek! I'm time-constrained tonight but will take a closer look tomorrow evening / Saturday morning. Did you manage to build the docs locally? For some reason, the GitHub Action seems to fail :/

@s3alfisc added the docs (Improvements or additions to documentation) and enhancement (New feature or request) labels and removed the enhancement label, Jun 20, 2024
@s3alfisc linked an issue that may be closed by this pull request, Jun 20, 2024
prteek (Author) commented Jun 21, 2024

Hi! I get the same error in the rendering step locally (the build step works fine). It seems to be related to quarto-dev/quarto-cli#9255, but I can't quite figure it out given my lack of experience with Quarto.

…wn cells were re written but are exactly same as earlier.
prteek (Author) commented Jun 21, 2024

@s3alfisc I am able to build and render the docs now. It is still not clear why this happened, but apparently something in the notebook metadata goes wrong when adding a markdown cell using PyCharm. I've rewritten those cells in JupyterLab and that seems to fix the issue.

prteek (Author) commented Jun 25, 2024

@s3alfisc this is largely done. If you could please review and approve the workflow, that'd be awesome. I also wanted to understand whether there is an easy way to get prediction intervals in the library.

"cell_type": "markdown",
"metadata": {},
"source": [
"## Weighted Lease Squares (WLS)\n",
s3alfisc (Member) commented Jun 25, 2024

> This requires that the weights that are passed in are proportional to the inverse of the error variance.

This is technically not true, and it is something the statsmodels docs make very confusing. The statsmodels docs discuss Feasible Weighted Least Squares as a way to correct for heteroskedasticity (because the relation between the dependent variable and the covariates is quadratic but we fit a linear model, we have heteroskedasticity by construction).

But one can simply fit a model with weights applied to the individual observations. That is, following Wikipedia, we can simply minimize the weighted least squares objective function:

[image: weighted least squares objective function]
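
For reference, the weighted least squares objective is usually written in a form like the one below. This is a reconstruction in standard notation, not necessarily the exact expression shown on Wikipedia; the symbols $w_i$ (observation weight), $x_i$ (covariate vector), and $\beta$ (coefficient vector) are assumptions.

$$
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{N} w_i \left( y_i - x_i^{\top} \beta \right)^2
$$

Setting $w_i = 1$ for all $i$ recovers ordinary least squares.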

Ideally, I would like us to start with a very brief section that essentially says "you can apply weights, here is how you do it using either frequency weights or precision weights".

Here we could show an example such as this:

import pyfixest as pf

# Simulated data of size N = 1000
data = pf.get_data(model="Fepois")[["X1", "Y", "f1"]]
data.head()
print(data.shape)

# Collapse identical rows into a single row plus a "count" column.
# The result has N << 1000 rows, which is useful for reducing memory.
data_agg = (
    data.groupby(["X1", "Y", "f1"])
    .size()
    .reset_index()
    .rename(columns={0: "count"})
)
data_agg.head()
print(data_agg.shape)

# OLS on the full data vs. WLS on the aggregated data with frequency weights
fit_ols = pf.feols("Y ~ X1 | f1", data=data, vcov="iid")
fit_wls = pf.feols(
    "Y ~ X1 | f1", data=data_agg, vcov="iid", weights="count", weights_type="fweights"
)
pf.etable([fit_ols, fit_wls])
# identical results and standard errors
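
The equivalence behind frequency weights can also be checked by hand. The sketch below is independent of pyfixest and makes simplifying assumptions: a single regressor, no fixed effects, and toy data invented for illustration. `weighted_simple_regression` is a hypothetical helper, not a library function, and the sketch compares coefficients only, not standard errors.

```python
from collections import Counter


def weighted_simple_regression(x, y, w):
    """Return (intercept, slope) minimizing sum_i w_i * (y_i - a - b * x_i) ** 2."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Toy data (made up for illustration) with many repeated (x, y) pairs
x_full = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2]
y_full = [1, 1, 2, 2, 2, 3, 3, 3, 4, 4]

# Aggregate identical rows and keep their frequency as a weight
counts = Counter(zip(x_full, y_full))
x_agg = [x for x, _ in counts]
y_agg = [y for _, y in counts]
w_agg = list(counts.values())

# Unweighted fit on the full data vs. frequency-weighted fit on the aggregated data
full_fit = weighted_simple_regression(x_full, y_full, [1] * len(x_full))
agg_fit = weighted_simple_regression(x_agg, y_agg, w_agg)

print(len(x_full), len(x_agg))
print(full_fit, agg_fit)
```

Both fits minimize the same sum of squares, because each aggregated row contributes its squared residual `count` times, so the coefficients agree up to floating point error.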

And also link to the "Causal Inference for the Brave and True" chapter on WLS.

We can then go on to motivate Feasible WLS a little better than statsmodels does. In this case, I would like to add a sentence below the plot explaining that the conditional variance $Var(Y|X)$ is increasing (or, more generally, changing) in X; in other words, we have heteroskedasticity. To tackle heteroskedasticity, we can either use robust standard errors or use our knowledge of the data generating process by weighting the regression model appropriately. We can then compare the WLS standard errors with the HC robust standard errors.
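
A rough sketch of such a comparison, written without pyfixest so that every step is visible. Everything here is an assumption made for illustration: the simulated data generating process (error standard deviation proportional to x), the choice of HC1 as the robust estimator, and the hypothetical helper `fit_line`. With one regressor, both the HC1 and the WLS slope standard errors have simple closed forms.

```python
import math
import random

random.seed(0)
n = 2000
x = [random.uniform(1.0, 10.0) for _ in range(n)]
# Assumed DGP: error standard deviation grows linearly in x,
# so Var(Y|X) is proportional to x ** 2 (heteroskedasticity).
y = [1.0 + 2.0 * xi + random.gauss(0.0, 0.5 * xi) for xi in x]


def fit_line(x, y, w):
    """Weighted simple regression; returns slope, residuals, weighted S_xx, weighted mean of x."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return slope, resid, s_xx, x_bar


# OLS slope with an HC1 (heteroskedasticity-robust) standard error
b_ols, e_ols, sxx_ols, xbar_ols = fit_line(x, y, [1.0] * n)
meat = sum(((xi - xbar_ols) * ei) ** 2 for xi, ei in zip(x, e_ols))
se_hc1 = math.sqrt(n / (n - 2) * meat) / sxx_ols

# WLS slope with precision weights 1 / x ** 2 and a conventional (iid) standard error
w = [1.0 / xi ** 2 for xi in x]
b_wls, e_wls, sxx_wls, _ = fit_line(x, y, w)
sigma2 = sum(wi * ei ** 2 for wi, ei in zip(w, e_wls)) / (n - 2)
se_wls = math.sqrt(sigma2 / sxx_wls)

print(f"OLS slope {b_ols:.3f}, HC1 SE {se_hc1:.4f}")
print(f"WLS slope {b_wls:.3f}, iid SE {se_wls:.4f}")
```

Under this assumed setup both estimators recover a slope close to 2, while WLS yields the smaller standard error because the weights match the true variance structure. In practice the variance function is usually unknown, which is where Feasible WLS or robust standard errors come in.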

s3alfisc (Member) commented

Hi @prteek, apologies for making you wait on a review! It's simply been a very busy few days for me. I added quite a few comments in the notebook. I hope that all comments are clear; if not, please ask, and if you disagree, complain! =)
Best, Alex

prteek (Author) commented Jun 27, 2024

No worries, and thanks for the tips there ;-). I'll give it another go; I just need a couple more days before I can start.

Labels
docs Improvements or additions to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Docs: Document support for WLS in quickstart.ipynb
3 participants