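I am trying to build a binary classifier with logistic regression, and I am currently working out feature importance. I have already preprocessed the data (one-hot encoding and sampling) and run it through XGBoost and RandomForestClassifier without any problems.
However, fitting a LogisticRegression model fails. Here is the code from my notebook: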
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
# Logistic Regression
# define the model
model = LogisticRegression()
# fit the model (X_over and y_over come from the earlier preprocessing step)
model.fit(np.array(X_over), np.array(y_over))
# get importance: one coefficient per feature for the binary case
importance = model.coef_[0]
# summarize feature importance
df_imp = pd.DataFrame({'feature': list(X_over.columns), 'importance': importance})
display(df_imp.sort_values('importance', ascending=False).head(20))
# plot feature importance
plt.bar(list(X_over.columns), importance)
plt.show()
It gives this error:
...
~\AppData\Local\Continuum\anaconda3\lib\site-packages\joblib\parallel.py in <listcomp>(.0)
223 with parallel_backend(self._backend, n_jobs=self._n_jobs):
224 return [func(*args, **kwargs)
--> 225 for func, args, kwargs in self.items]
226
227 def __len__(self):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\sklearn\linear_model\_logistic.py in _logistic_regression_path(X, …

I am having trouble displaying OpenCV images in matplotlib subplots.
#Read random images from multiple directories
import os
import random
animals = os.listdir('signs/train')
sample_images = []
for a in animals:
    dirname = 'signs/train/' + a
    # pick 5 random files from this class directory
    files = random.sample(os.listdir(dirname), 5)
    files = [dirname + '/' + im for im in files]
    sample_images.extend(files)
del files, dirname, animals
print(sample_images)
# Output: ['signs/train/rooster/00000327.jpg', 'signs/train/rooster/00000329.jpg', 'signs/train/rooster/00000168.jpg', ..., 'signs/train/rooster/00000235.jpg', 'signs/train/rooster/00000138.jpg']
#Read using OpenCV and show in matplotlib's subplots
fig, ax = plt.subplots(12, 5,figsize=(15,15), sharex=True)
for idx, si in enumerate(sample_images):
    i = idx % …
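
The loop above is cut off, so the original indexing is not visible. As a rough sketch only, and not the asker's code, one common way to place a flat list of images on a row-by-row grid is `divmod(idx, ncols)`. The helper `grid_position` and the placeholder paths below are hypothetical names used for illustration. The 12 x 5 grid matches the `plt.subplots(12, 5, ...)` call, and it assumes 12 class directories with 5 samples each. Also note that `cv2.imread` returns pixels in BGR order while matplotlib's `imshow` expects RGB, so a conversion is usually needed before plotting.

```python
def grid_position(idx, ncols):
    """Map a flat index to (row, col) in a grid that is filled row by row."""
    return divmod(idx, ncols)

# Hypothetical stand-ins for the real file list: 12 classes x 5 images each.
nrows, ncols = 12, 5
sample_paths = [f"class_{c}/img_{k}.jpg" for c in range(nrows) for k in range(ncols)]

positions = [grid_position(idx, ncols) for idx, _ in enumerate(sample_paths)]

# With matplotlib and OpenCV available, each image would then be drawn as:
#   img = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)  # OpenCV loads BGR
#   ax[row, col].imshow(img)
print(positions[0], positions[5], positions[-1])
```

With five columns, index 5 starts the second row, so each class directory lands on its own row, provided the list really holds five paths per class.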