Save *both* estimated effects #392

Open
effedp opened this issue Feb 14, 2022 · 3 comments

effedp commented Feb 14, 2022

Hey there,

I am running the textbook PanelOLS example with both entity and time FEs. I am trying to save the two estimated effects using linearmodels.panel.results.PanelEffectsResults.estimated_effects, but the output DataFrame has only one column, which does not seem to resemble either of the two FEs (I am double-checking everything against Stata's reghdfe).

import pandas as pd
import statsmodels.api as sm
from linearmodels import PanelOLS
from linearmodels.datasets import wage_panel

data = wage_panel.load()
year = pd.Categorical(data.year)
data = data.set_index(["nr", "year"])

exog_vars = ['expersq']
exog = sm.add_constant(data[exog_vars])

mod = PanelOLS(data.lwage, exog, entity_effects=True, time_effects=True)
result = mod.fit(cov_type='clustered')

result.estimated_effects

gives me the following output

nr     year  estimated_effects
13     1980          -0.993351
       1981          -0.835877
       1982          -0.728156
       1983          -0.620804
       1984          -0.479181
...     ...                ...
12548  1983          -0.212544
       1984          -0.070921
       1985           0.059622
       1986           0.212194
       1987           0.382053

4360 rows × 1 columns

How can I save both the estimated effects?
Am I missing something, or is this a bug?

Thank you for your help!

bashtage (Owner) commented

estimated_effects are the total combined effects included in the model. The method used to remove the FEs does not directly produce a separate estimate of each effect. In balanced panels it should be easy to get them by using the estimated effects as the LHS variable in an auxiliary model that includes only entity effects. The estimated effects from this auxiliary model will be the entity effects, and the residuals will be the time effects.
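
For readers who want to try this on a balanced panel, here is a minimal sketch of one reading of the suggestion. It assumes the combined effects are available as a pandas Series indexed by (entity, time), for example the `estimated_effects` column shown in the output above. Regressing that series on entity dummies alone is the same as taking per-entity means, so the sketch uses a groupby instead of fitting a second model. The function name `split_effects_balanced` is hypothetical and not part of linearmodels. The two effects are only identified up to a constant: here the time effects are normalized to average zero and the entity effects absorb the mean time effect.

```python
import numpy as np
import pandas as pd


def split_effects_balanced(combined: pd.Series) -> pd.DataFrame:
    """Split combined effects c_it = a_i + g_t on a BALANCED panel.

    `combined` must be indexed by (entity, time). The entity effect is the
    per-entity mean (the fit of a regression on entity dummies only) and
    the time effect is the residual from that fit.
    """
    entity_fe = combined.groupby(level=0).transform("mean")
    time_fe = combined - entity_fe
    return pd.DataFrame(
        {"combined": combined, "entity_fe": entity_fe, "time_fe": time_fe}
    )


# Synthetic balanced panel (illustrative data only, not from the wage panel).
rng = np.random.default_rng(0)
idx = pd.MultiIndex.from_product(
    [range(5), range(2000, 2004)], names=["nr", "year"]
)
a = rng.normal(size=5)   # true entity effects
g = rng.normal(size=4)   # true time effects
g -= g.mean()            # normalize so the time effects average zero
combined = pd.Series(a[idx.codes[0]] + g[idx.codes[1]], index=idx)

out = split_effects_balanced(combined)
print(out.head())
```

On an unbalanced panel the per-entity mean mixes in a different set of time effects for each entity, so this shortcut no longer separates the two components.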

effedp (Author) commented Feb 15, 2022

Hello Kevin and thanks for your reply. I am not sure it is that easy. If I got it correctly, your solution would imply that estimating a regression of a dependent variable on individual effects and then taking the residual as time effect is the same as a regression of the dependent variable on both time and individual effects, which is not the case.

Do you think we will see an option for a separate estimate of the effects soon? It would be handy for many applications, and there is no way to do it in Python to the best of my knowledge.

alistaircameron commented

Hello. As stated in his reply, bashtage's method works for balanced panels but not for unbalanced ones. Like the OP, I needed to extract both individual and time FEs in an unbalanced panel. Here is my quick and dirty solution; I hope it helps others who stumble across this.

import numpy as np
import pandas as pd
from linearmodels import PanelOLS

# Fit a two-way fixed-effects model with PanelOLS.
# df and exog are assumed to be indexed by (individual, time).
mod = PanelOLS(
    dependent=df['y'],
    exog=exog,
    entity_effects=True,
    time_effects=True,
)

twfe = mod.fit()

# Get the combined estimated effects as a flat table.
ees = twfe.estimated_effects.copy().reset_index()
ees.columns = ['individuals', 'time', 'estimated_effects']
ees = ees.drop_duplicates(subset=['individuals', 'time']).reset_index(drop=True)

# Sorted list of all periods. The first period is the base period,
# and its time fixed effect is normalized to zero.
time = np.sort(ees.time.unique())
period, time_fe, running_sum = [time[0]], [0.0], 0.0

for b, c in zip(time[:-1], time[1:]):
    # Find an individual with data recorded in period b AND in the next period c.
    in_b = ees[ees.time == b].set_index('individuals').estimated_effects
    in_c = ees[ees.time == c].set_index('individuals').estimated_effects
    common = in_b.index.intersection(in_c.index)
    if len(common) == 0:
        raise ValueError(f"No individual observed in both {b} and {c}. Try another method. Sorry.")
    ind = common[0]

    # The individual FE cancels WITHIN the individual, so the change in the
    # combined effect from b to c equals the change in the time FE.
    running_sum += in_c.loc[ind] - in_b.loc[ind]
    period.append(c)
    time_fe.append(running_sum)

# Merge the time FEs back in and back out the individual fixed effects.
df_time = pd.DataFrame({'time': period, 'time_fe': time_fe})

ees = ees.merge(df_time, how='left', on='time')
ees['individual_fe'] = ees['estimated_effects'] - ees['time_fe']

# And you've got time + individual FEs.
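
An alternative to chaining period-to-period differences, offered only as a sketch: treat the combined effects as exactly additive and solve for both sets of effects in one least-squares problem. The code below assumes a table shaped like `ees` above (columns `individuals`, `time`, `estimated_effects`) and a connected panel, meaning every period can be reached from every other through individuals observed in both. The function name `split_effects_lstsq` is hypothetical. It uses a dense dummy matrix, which is fine for a few thousand rows but would need a sparse solver for large panels. The normalization matches the approach above: the first period's time FE is set to zero.

```python
import numpy as np
import pandas as pd


def split_effects_lstsq(ees: pd.DataFrame) -> pd.DataFrame:
    """Split c_it = a_i + g_t by least squares on a (possibly unbalanced) panel."""
    ent_codes, ent_labels = pd.factorize(ees["individuals"], sort=True)
    time_codes, time_labels = pd.factorize(ees["time"], sort=True)
    n, n_ent, n_time = len(ees), len(ent_labels), len(time_labels)

    # One dummy column per individual, followed by one per period.
    X = np.zeros((n, n_ent + n_time))
    X[np.arange(n), ent_codes] = 1.0
    X[np.arange(n), n_ent + time_codes] = 1.0

    # X is rank deficient (a constant can move between the two sets of
    # dummies); lstsq returns the minimum-norm solution.
    y = ees["estimated_effects"].to_numpy()
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    ent_fe, time_fe = coef[:n_ent], coef[n_ent:]

    # Normalize: first period's time FE is zero, individuals absorb the shift.
    shift = time_fe[0]
    out = ees.copy()
    out["time_fe"] = (time_fe - shift)[time_codes]
    out["individual_fe"] = (ent_fe + shift)[ent_codes]
    return out


# Synthetic unbalanced panel (illustrative data only).
rng = np.random.default_rng(1)
missing = {(0, 3), (1, 0), (2, 1)}
rows = [(i, t) for i in range(6) for t in range(4) if (i, t) not in missing]
demo = pd.DataFrame(rows, columns=["individuals", "time"])
a = rng.normal(size=6)                 # true individual effects
g = np.array([0.0, 0.3, -0.2, 0.5])    # true time effects, base period = 0
demo["estimated_effects"] = (
    a[demo["individuals"].to_numpy()] + g[demo["time"].to_numpy()]
)

res = split_effects_lstsq(demo)
print(res.head())
```

Because it uses every observation rather than a single individual per pair of adjacent periods, this version does not require anyone to be observed in two consecutive periods, only that the panel is connected overall.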
