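I am having trouble using statsmodels' get_margeff method on a logit model with interaction terms. In a model with main effects only, the effects are computed correctly and match the Stata and R results, but this is no longer the case once an interaction term is involved. The effects are then wrong, and a marginal effect is also reported for the interaction term itself, which is meaningless. The following code illustrates this: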
import pandas as pd
import statsmodels.formula.api as sm
import statsmodels.api as sm2
df=sm2.datasets.heart.load_pandas().data
regression = sm.logit(formula='censors~survival+age', data=df).fit()
#only for illustration purposes; does not make real sense
print(regression.get_margeff().summary())
# the calculation of marginal effects here is correct and corresponds to Stata and R results
                dy/dx    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
survival      -0.0004   7.95e-05     -4.672      0.000      -0.001      -0.000
age            0.0148      0.005      3.262      0.001       0.006       0.024
==============================================================================
regression = sm.logit(formula='censors~survival+age+survival*age', …
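For reference, here is a minimal sketch of what the marginal effect of survival should look like when the model contains a survival*age interaction. It is not the statsmodels implementation and it does not use the heart dataset: the coefficients and the (survival, age) rows are hypothetical values picked only for illustration, and the function names are my own. With the interaction, the derivative of the predicted probability with respect to survival is p(1-p)(b_survival + b_interaction*age), so it depends on age. Treating the product column as an independent regressor amounts to dropping the second term, which I assume is the source of the discrepancy.

```python
import math

# Hypothetical logit coefficients (NOT estimated from the heart dataset).
B0, B_SURV, B_AGE, B_INTER = 1.0, -0.002, 0.03, -0.0001

# Hypothetical (survival, age) observations, for illustration only.
ROWS = [(15.0, 54.3), (3.0, 40.4), (624.0, 51.0), (46.0, 42.5), (1350.0, 48.0)]


def prob(survival, age):
    """Predicted probability of a logit model with a survival*age interaction."""
    z = B0 + B_SURV * survival + B_AGE * age + B_INTER * survival * age
    return 1.0 / (1.0 + math.exp(-z))


def ame_survival_analytic(rows):
    """Average marginal effect of survival using the chain rule:
    dp/dsurvival = p * (1 - p) * (b_survival + b_interaction * age)."""
    total = 0.0
    for survival, age in rows:
        p = prob(survival, age)
        total += p * (1.0 - p) * (B_SURV + B_INTER * age)
    return total / len(rows)


def ame_survival_naive(rows):
    """What you get if the product column is treated as a separate variable:
    only the main-effect coefficient enters the derivative."""
    total = 0.0
    for survival, age in rows:
        p = prob(survival, age)
        total += p * (1.0 - p) * B_SURV
    return total / len(rows)


def ame_survival_numeric(rows, h=1e-3):
    """Central finite difference of the predicted probability in survival,
    used here as an independent check on the analytic formula."""
    total = 0.0
    for survival, age in rows:
        total += (prob(survival + h, age) - prob(survival - h, age)) / (2.0 * h)
    return total / len(rows)


analytic = ame_survival_analytic(ROWS)
naive = ame_survival_naive(ROWS)
numeric = ame_survival_numeric(ROWS)

print("analytic AME of survival:", analytic)
print("finite-difference check: ", numeric)
print("naive (interaction ignored):", naive)
```

The analytic and finite-difference values should agree closely, while the naive value differs because it ignores the b_interaction*age term. The same finite-difference idea can be applied to a fitted model by predicting on two copies of the data with survival shifted slightly up and down, so that the formula recomputes the interaction column itself.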