小编use*_*731的帖子

Python中的logit回归和奇异Matrix错误

我正在尝试对德国信用数据运行logit回归(www4.stat.ncsu.edu/~boos/var.select/german.credit.html).为了测试代码,我只使用了数值变量,并尝试使用以下代码对结果进行回归.

import pandas as pd
import statsmodels.api as sm
import pylab as pl
import numpy as np

df = pd.read_csv("germandata.txt",delimiter=' ')
df.columns = ["chk_acc","duration","history","purpose","amount","savings_acc","employ_since","install_rate","pers_status","debtors","residence_since","property","age","other_plans","housing","existing_credit","job","no_people_liab","telephone","foreign_worker","admit"]

#pls note that I am only retaining numeric variables
cols_to_keep = ['admit','duration', 'amount', 'install_rate','residence_since','age','existing_credit','no_people_liab']

# rank of cols_to_keep is 8
print np.linalg.matrix_rank(df[cols_to_keep].values)
data = df[cols_to_keep]

data['intercept'] = 1.0

train_cols = data.columns[1:]

#to check the rank of train_cols, which in this case is 8
print np.linalg.matrix_rank(data[train_cols].values)

#fit logit model
logit = sm.Logit(data['admit'], data[train_cols])
result = logit.fit()
Run Code Online (Sandbox Code Playgroud)

检查数据时,所有8.0列都是独立的.尽管如此,我得到奇异矩阵错误.你能帮忙吗? …

regression python-2.7 statsmodels

11
推荐指数
2
解决办法
2万
查看次数

标签 统计

python-2.7 ×1

regression ×1

statsmodels ×1