
Prediction error for non-fe model #573

Open · wants to merge 45 commits into master
Conversation

@greenguy33 (Contributor)
This pull request references issue #566

This is not a finished product, but I took a first pass at implementing the prediction standard error (stdp) within the predict function. As discussed, it is only implemented for models without fixed effects for now. I'm creating this pull request to solicit feedback on the approach. I confirmed that the results returned match statsmodels' get_prediction() function on a test dataset.

Note that CIs are not implemented, but the prediction error is returned. To implement CIs, we would need a z-score based on the desired coverage level and then some simple arithmetic. To keep things simple and avoid adding too many new arguments to the predict() function, I left that out for now.
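For concreteness, the arithmetic would be something like the sketch below (the `y_hat` and `stdp` arrays just stand in for whatever `predict(..., compute_stdp=True)` ends up returning; the exact return structure is still being discussed):

```python
import numpy as np
from scipy.stats import norm

# Stand-ins for the output of predict(..., compute_stdp=True)
y_hat = np.array([2.1, 3.4, 1.8])    # point predictions
stdp = np.array([0.15, 0.22, 0.19])  # prediction standard errors

alpha = 0.05
z = norm.ppf(1 - alpha / 2)  # ~1.96 for a 95% CI

ci_lower = y_hat - z * stdp
ci_upper = y_hat + z * stdp
```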

@s3alfisc (Member) commented Aug 1, 2024

Looks very good so far! 👍😄

I guess at some point we might want to clean up the different concepts of predicted values, i.e. `self._Y_hat`, `self._Y_untransformed`, and I think there is also a `self._Y_hat_link`?

I suppose the next steps would be to implement some basic tests (e.g. against statsmodels), add an example, and maybe update the docs? I am not sure the predict method is even mentioned in quickstart.ipynb 🤔
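A basic test could look something like the following sketch. Note that the return shape with `compute_stdp=True` (a DataFrame with a `"stdp"` column) is an assumption here, not the final API:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

import pyfixest as pf


def test_stdp_against_statsmodels():
    rng = np.random.default_rng(42)
    data = pd.DataFrame({"x": rng.normal(size=100)})
    data["y"] = 1 + 2 * data["x"] + rng.normal(size=100)

    # statsmodels reference: standard error of the mean prediction
    sm_se = smf.ols("y ~ x", data=data).fit().get_prediction(data).se_mean

    # assumed pyfixest API: predict(compute_stdp=True) exposing a "stdp" column
    prediction = pf.feols("y ~ x", data=data).predict(compute_stdp=True)

    np.testing.assert_allclose(prediction["stdp"], sm_se, rtol=1e-6)
```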

```diff
@@ -1436,6 +1436,7 @@ def predict(
         atol: float = 1e-6,
         btol: float = 1e-6,
         type: str = "link",
+        compute_stdp: bool = False
     ) -> np.ndarray:
```
@s3alfisc (Member) commented on the diff:

Btw, I think a few unit tests fail because the code assumes that predict returns a np.ndarray; if we change to a pd.DataFrame (which I think makes the most sense), we need to change these.

Alternatively, we could change the return type to `Union[np.ndarray, pd.DataFrame]` and simply return the numpy array if `compute_stdp=False`.
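As a standalone toy illustration of that dual return type (the `"stdp"` column name and the placeholder body are illustrative, not the actual pyfixest method):

```python
from typing import Union

import numpy as np
import pandas as pd


def predict(
    X: np.ndarray,
    beta: np.ndarray,
    compute_stdp: bool = False,
) -> Union[np.ndarray, pd.DataFrame]:
    y_hat = X @ beta
    if not compute_stdp:
        # unchanged behavior: existing tests expecting an ndarray keep passing
        return y_hat
    stdp = np.full_like(y_hat, np.nan)  # placeholder for the stdp computation
    return pd.DataFrame({"yhat": y_hat, "stdp": stdp})
```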

@greenguy33 (Contributor, Author)
Yep, I didn't touch the tests or docs yet, but I'm happy to do that as a next step!
I don't think predict is mentioned in the docs; when I started using the library, I had to go digging for it :) I'll make sure to add that when I have a chance to make the next update.
As a GitHub-related side note, do you know if there is a way to squash commits on my end when I make a pull request from my fork? It feels very dirty to submit a pull request with 25 commits when only 3-4 of them are relevant to the current project.

@s3alfisc (Member) commented Aug 1, 2024

> As a GitHub-related side note, do you know if there is a way to squash commits on my end when I make a pull request from my fork? It feels very dirty to submit a pull request with 25 commits when only 3-4 of them are relevant to the current project.

Yes, I can do a squash commit on my end when merging, so many, many commits are no problem at all; I have this all the time :D

> Yep, I didn't touch the tests or docs yet, but I'm happy to do that as a next step!
> I don't think predict is mentioned in the docs; when I started using the library, I had to go digging for it :) I'll make sure to add that when I have a chance to make the next update.

Sounds great!

codecov bot commented Aug 31, 2024

Codecov Report

Attention: Patch coverage is 81.13208% with 10 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pyfixest/estimation/feols_.py | 73.68% | 5 Missing ⚠️ |
| pyfixest/estimation/fepois_.py | 0.00% | 5 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| pyfixest/did/did2s.py | 90.00% <100.00%> (+1.11%) ⬆️ |
| tests/test_predict_resid_fixef.py | 92.80% <100.00%> (+0.49%) ⬆️ |
| tests/test_stdp.py | 100.00% <100.00%> (ø) |
| pyfixest/estimation/feols_.py | 86.95% <73.68%> (ø) |
| pyfixest/estimation/fepois_.py | 84.81% <0.00%> (ø) |

... and 24 files with indirect coverage changes

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

@s3alfisc (Member)

pre-commit.ci autofix
