I keep getting this error in my linear model:
Cast string to float is not supported
Specifically, the error occurs on this line:
results = m.evaluate(input_fn=lambda: input_fn(df_test), steps=1)
In case it helps, here is the stack trace:
File "tensorflowtest.py", line 164, in <module>
m.fit(input_fn=lambda: input_fn(df_train), steps=int(100))
File "/home/computer/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/linear.py", line 475, in fit
max_steps=max_steps)
File "/home/computer/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 333, in fit
max_steps=max_steps)
File "/home/computer/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 662, in _train_model
train_op, loss_op = self._get_train_ops(features, targets)
File "/home/computer/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 963, in _get_train_ops
_, loss, train_op = self._call_model_fn(features, targets, ModeKeys.TRAIN)
File "/home/computer/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 944, in _call_model_fn
return self._model_fn(features, targets, mode=mode, params=self.params)
File "/home/computer/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/linear.py", line 220, in _linear_classifier_model_fn
loss = loss_fn(logits, targets)
File "/home/computer/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/linear.py", line …
When I run the mean_acc() method in my program, I get a "% (min_groups, self.n_splits)), Warning)" error...
def mean_acc():
    models = [
        RandomForestClassifier(n_estimators=200, max_depth=3, random_state=0),
        LinearSVC(),
        MultinomialNB(),
        LogisticRegression(random_state=0)]
    CV = 6
    cv_df = pd.DataFrame(index=range(CV * len(models)))
    entries = []
    for model in models:
        model_name = model.__class__.__name__
        accuracies = cross_val_score(model, features, labels, scoring='accuracy', cv=CV)
        for fold_idx, accuracy in enumerate(accuracies):
            entries.append((model_name, fold_idx, accuracy))
    cv_df = pd.DataFrame(entries, columns=['model_name', 'fold_idx', 'accuracy'])
    print(cv_df.groupby('model_name').accuracy.mean())
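These are the errors shown when I run my program with the mean_acc() method. How can I resolve the errors below? Please help me find what in the code above causes them. Thank you!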
% (min_groups, self.n_splits)), Warning)
C:\Users\L31307\PycharmProjects\FYP\venv\lib\site-packages\sklearn\model_selection\_split.py:626: Warning: The least populated class in y has only 1 members, which is too few. The minimum number of members in any …
I am using the linearmodels package to estimate a Panel-OLS. For example, see:
import numpy as np
from statsmodels.datasets import grunfeld
data = grunfeld.load_pandas().data
data.year = data.year.astype(np.int64)
# MultiIndex, entity - time
data = data.set_index(['firm','year'])
from linearmodels import PanelOLS
mod = PanelOLS(data.invest, data[['value','capital']], entity_effect=True)
res = mod.fit(cov_type='clustered', cluster_entity=True)
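I would like to export the regression output to a .tex file. Is there a convenient way to format the output so that it shows confidence levels but leaves out extra information such as confidence intervals (CIs)? This question has been asked here for standard OLS, but that approach does not work for a 'PanelEffectsResults' object, because I get the following error: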
'PanelEffectsResults' object has no attribute 'bse'
Thanks in advance.
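One possible workaround, sketched here under assumptions rather than as the library's own export feature, is to build the table by hand from the coefficient, standard error, and p-value series. The helper names `stars` and `coef_table_to_latex`, the star thresholds (0.01, 0.05, 0.10), and the demo numbers are all hypothetical and only serve as illustration.

```python
def stars(p):
    """Return significance stars for a p-value (thresholds are an assumption)."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


def coef_table_to_latex(params, std_errors, pvalues, digits=3):
    """Build a minimal LaTeX tabular: starred coefficient, std. error below in parentheses."""
    lines = [r"\begin{tabular}{lc}", r"\hline"]
    for name, coef in params.items():
        label = str(name).replace("_", r"\_")  # escape underscores for LaTeX
        lines.append(f"{label} & {coef:.{digits}f}{stars(pvalues[name])} \\\\")
        lines.append(f" & ({std_errors[name]:.{digits}f}) \\\\")
    lines += [r"\hline", r"\end{tabular}"]
    return "\n".join(lines)


# Made-up numbers for illustration only, not estimates from the Grunfeld data
demo = coef_table_to_latex(
    {"value": 0.110, "capital": 0.310},
    {"value": 0.015, "capital": 0.053},
    {"value": 0.0001, "capital": 0.03},
)
print(demo)
```

With a fitted linearmodels result `res`, the three arguments would be `res.params`, `res.std_errors`, and `res.pvalues`. The results object exposes `std_errors` instead of the `bse` attribute used by statsmodels, which is why the OLS approach fails. The returned string can then be written to a .tex file with an ordinary `open(..., "w")`.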
I have the following panel stored in df:
| | state | district | year | y | constant | x1 | x2 | time |
|---|---|---|---|---|---|---|---|---|
| 0 | 01 | 01001 | 2009 | 12 | 1 | 0.956007 | 639673 | 1 |
| 1 | 01 | 01001 | 2010 | 20 | 1 | 0.972175 | 639673 | 2 |
| 2 | 01 | 01001 | 2011 | 22 | 1 | 0.988343 | 639673 | 3 |
| 3 | 01 | 01002 | 2009 | 0 | 1 | 0 | 33746 | 1 |
| 4 | 01 | 01002 | 2010 | 1 | 1 | 0.225071 | 33746 | 2 |
| 5 | 01 | 01002 | 2011 | 5 | 1 | 0.450142 | 33746 | 3 |
| 6 | 01 | 01003 | 2009 | 0 | 1 | 0 | 45196 | 1 |
| 7 | 01 | 01003 | 2010 | 5 | 1 … |
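I am running a PanelOLS from the linearmodels package.
As is often the case, some observations are missing. When I run the equivalent command in R (I believe it is plm), I get the following: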
Unbalanced Panel: n=11, T=17-61, N=531
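So the panel is unbalanced: some individuals have complete data for only 17 time periods, while others have more. The regression still runs, though.
The equivalent Python command is: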
import linearmodels.panel as pnl
model = pnl.PanelOLS.from_formula(formula, data=src)
This gives me a warning:
Inputs contain missing values. Dropping rows with missing observations.
And an error:
MyPythonInstallation\lib\site-packages\linearmodels\panel\model.py in _validate_data(self)
207
208 if matrix_rank(x) < x.shape[1]:
--> 209 raise ValueError('exog does not have full column rank.')
210 self._constant, self._constant_index = has_constant(x)
211
ValueError: exog does not have full column rank.
How can I proceed with my regression?
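A rank-deficient exog means at least one column is an exact linear combination of the others. This can happen, for instance, when a column is duplicated, or when it becomes constant once rows with missing values are dropped while a constant is also in the model. As a rough diagnostic sketch, one can add columns one at a time and note which ones fail to raise the rank. The helper name `redundant_columns` and the toy matrix are hypothetical, and the sketch says nothing about what the formula in the question actually contains.

```python
import numpy as np


def redundant_columns(x, names):
    """Greedily keep columns that raise the rank; return the names of the rest."""
    x = np.asarray(x, dtype=float)
    kept, redundant = [], []
    for j, name in enumerate(names):
        candidate = x[:, kept + [j]]
        if np.linalg.matrix_rank(candidate) > len(kept):
            kept.append(j)          # column adds new information
        else:
            redundant.append(name)  # column is a combination of kept columns
    return redundant


# Toy design matrix (made-up data): "x3" is exactly 2 * "x1"
toy = np.array([
    [1.0, 0.5, 3.0, 1.0],
    [1.0, 1.5, 1.0, 3.0],
    [1.0, 2.5, 4.0, 5.0],
    [1.0, 4.0, 2.0, 8.0],
])
flagged = redundant_columns(toy, ["const", "x1", "x2", "x3"])
print(flagged)
```

Which column gets flagged depends on the column order, since the first of a collinear set is kept. To apply the check to real data, pass the regressor columns after calling `dropna()` on them, so that it sees the same rows the estimator would use.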
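I am trying to understand the results of the mixed linear model provided by the Python statsmodels package. I want to avoid pitfalls in my data analysis and interpretation. The questions come after the data loading/output code block.
Load the data and fit the model: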
import statsmodels.api as sm
import statsmodels.formula.api as smf
data = sm.datasets.get_rdataset("dietox", "geepack").data
md = smf.mixedlm("Weight ~ Time", data, groups=data["Pig"])
mdf = md.fit()
print(mdf.summary())
         Mixed Linear Model Regression Results
========================================================
Model:            MixedLM Dependent Variable: Weight
No. Observations: 861     Method:             REML
No. Groups:       72      Scale:              11.3669
Min. group size:  11      Likelihood:         -2404.7753
Max. group size:  12      Converged:          Yes
Mean group size:  12.0
--------------------------------------------------------
             Coef.  Std.Err.   z    P>|z| [0.025 0.975]
--------------------------------------------------------
Intercept    15.724    0.788 19.952 0.000 14.179 …
I am running a fixed-effects panel regression using the PanelOLS() function in linearmodels 4.5.
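When I try to add 'entity_effects=True' and 'time_effects=True' to the model estimation, it returns an 'AbsorbingEffectError':
The model cannot be estimated. The included effects have fully absorbed one or more of the variables. This occurs when one or more of the dependent variables is perfectly explained by the effects included in the model.
How can I fix the 'AbsorbingEffectError'?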
panel = panel.set_index(['firm', 'Date'])
exog_vars = panel[['ex_mkt', 'MV', 'ROA', 'BTM','leverage','2nd']]
exog = sm.add_constant(exog_vars)
y = panel[['ex_firm']]
model = PanelOLS(y, exog_vars,entity_effects=True).fit(cov_type='clustered', cluster_entity=True)
I followed exactly the same steps as the fixed-effects model example in the documentation: https://bashtage.github.io/linearmodels/doc/panel/examples/examples.html#
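A variable is absorbed when the included effects explain it perfectly. Two simple cases are a regressor that never changes within a firm (absorbed by entity effects) and one that takes the same value for every firm on a given date (absorbed by time effects). The sketch below checks only these two cases. The helper name `absorbed_by_effects` and the toy panel are hypothetical, and a variable that is an exact sum of a firm-specific and a date-specific component would also be absorbed without being caught here.

```python
import pandas as pd


def absorbed_by_effects(df, entity, time, columns):
    """Flag regressors with no variation within entities or within time periods."""
    report = {}
    for col in columns:
        reasons = []
        for key, label in ((entity, "entity effects"), (time, "time effects")):
            grouped = df.groupby(key)[col]
            # Largest max-min spread over all groups; zero means constant in every group
            spread = (grouped.max() - grouped.min()).max()
            if spread == 0:
                reasons.append(label)
        if reasons:
            report[col] = reasons
    return report


# Toy panel (made-up data): "mkt" varies only over time, "size" only across firms
toy = pd.DataFrame({
    "firm": ["A", "A", "A", "B", "B", "B"],
    "date": [1, 2, 3, 1, 2, 3],
    "mkt": [0.1, -0.2, 0.3, 0.1, -0.2, 0.3],
    "size": [5.0, 5.0, 5.0, 7.0, 7.0, 7.0],
    "roa": [0.02, 0.05, 0.01, 0.04, 0.03, 0.06],
})
report = absorbed_by_effects(toy, "firm", "date", ["mkt", "size", "roa"])
print(report)
```

If the frame has already been indexed by entity and time, calling `reset_index()` first turns the keys back into ordinary columns for the check. Any flagged regressor would need to be dropped, or the corresponding effect left out, before the model can be estimated.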
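I need linearmodels for two-way clustering, which is not implemented properly in statsmodels. I would like to know whether the stargazer Python library can be used with the linearmodels package instead of statsmodels. When I pass in a model from linearmodels, it throws an error: Please use trained OLS models as inputs
Example: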
from linearmodels.panel import PanelOLS
import pandas as pd
df.set_index(['entity', 'time'], inplace = True)
X = df[["Exog1","Exog2","Exog3"]]
y = df["Dep"]
model = PanelOLS(y, X, entity_effects=True, time_effects=True).fit(cov_type='clustered', cluster_entity=True, cluster_time=True)
print(model)
This prints the model as expected. However, when I pass it to stargazer, it raises the following error:
stargazer = Stargazer([model])
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-149-75027b8621a2> in <module>
----> 1 stargazer = Stargazer([model])
~\AppData\Local\Continuum\anaconda3\lib\site-packages\stargazer\stargazer.py in __init__(self, models)
29 self.models = models
30 self.num_models = len(models)
---> 31 self.extract_data() …
I am trying to write a loop to see how the accuracy scores on the training and test sets of the Boston housing dataset change when a Ridge regression model is fitted.
Here is the for loop:
for i in range(1,20):
    Ridge(alpha = 1/(10**i)).fit(X_train,y_train)
It shows a warning from i=13 onwards.
The warning is:
LinAlgWarning: Ill-conditioned matrix (rcond=6.45912e-17): result may not be accurate.
overwrite_a=True).T
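What does this warning mean? Is it possible to get rid of it?
I also tried running the fit on its own, without the loop, and that did not help either.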
#importing libraries and packages
import mglearn
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge
#importing boston housing dataset from mglearn
X,y = mglearn.datasets.load_extended_boston()
#Splitting the dataset
X_train,X_test,y_train,y_test = train_test_split(X,y,random_state=0)
#Fitting the training data using Ridge model with alpha = 1/(10**13)
rd = Ridge(alpha = 1/(10**13)).fit(X_train,y_train)
The warning above should not be displayed for any value of i.
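For background on the warning: ridge regression solves a linear system built from X^T X + alpha * I, and the alpha term is what keeps that matrix well conditioned when columns of X are strongly correlated. As alpha approaches zero, the system approaches plain least squares and the condition number can explode, which is what an "ill-conditioned matrix" warning from the linear solver reports. The sketch below illustrates the effect on made-up data with two nearly identical columns. It is not the Boston dataset, and the helper name `ridge_condition_number` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy design matrix (made-up data) with two nearly identical columns
base = rng.normal(size=(50, 1))
X = np.hstack([
    base,
    base + 1e-9 * rng.normal(size=(50, 1)),
    rng.normal(size=(50, 1)),
])


def ridge_condition_number(X, alpha):
    """Condition number of X^T X + alpha * I, the matrix a ridge solve works with."""
    gram = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.cond(gram)


for alpha in (1.0, 1e-3, 1e-13):
    print(f"alpha={alpha:g}  cond={ridge_condition_number(X, alpha):.3e}")
```

On this reading, the warning is a signal that alpha = 1e-13 is too small to regularize the problem. Likely remedies are a larger alpha or rescaled features, though whether either is appropriate depends on the goal of the experiment.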