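There is a problem with my code: I want to see the feature importances of the vectors from my word2vec model, but I can't, because the model is a pipeline. Can someone help me find a solution?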
## Import the random forest model and the pipeline class.
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
## This line instantiates the model.
rf = Pipeline([
    ("word2vec vectorizer", MeanEmbeddingVectorizer(w2v)),
    ("Random_forest", RandomForestClassifier(n_estimators=100, max_depth=6, random_state=0)),
])
## Fit the model on your training data.
rf.fit(X_train, y_train)
## And score it on your testing data.
rf.score(X_test, y_test)
X = model.wv.syn0
X = X.astype(int)
def plot_feat_imp(model, X):
    Feature_Imp = pd.DataFrame([X, rand_w2v_tfidf.feature_importances_]).transpose(
    ).sort_values(1, ascending=False)
    plt.figure(figsize=(14, 7))
    sns.barplot(y=Feature_Imp.loc[:, 0], x=Feature_Imp.loc[:, 1], data=Feature_Imp, orient='h')
    plt.title("Importance des variables (qu'est ce qui explique le mieux la satisfaction)", fontsize=21)
    plt.show()
    return
## My problem is here: this call raises the error below.
plot_feat_imp(gbc_w2v, X)
AttributeError: 'Pipeline' object has no attribute 'feature_importances_'
小智 2
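This may not be the answer you are looking for, but a Pipeline object has no feature_importances_ of its own, so you first have to reach the fitted classifier inside it. If the pipeline was wrapped in a search object such as GridSearchCV, go through best_estimator_ before that.
This can be done as follows:
rf_fit = rf.fit(X_train, y_train)
## For a plain Pipeline, look the fitted classifier up by its step name.
feature_importances = rf_fit.named_steps["Random_forest"].feature_importances_
## For a fitted search object, use: search.best_estimator_.named_steps["Random_forest"].feature_importances_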
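To make the lookup concrete, here is a minimal end-to-end sketch. It is not the asker's actual setup: the `MeanEmbeddingVectorizer` below is a hypothetical stand-in (the original class is not shown in the question), and the word vectors and training data are made-up toy values. Note that the importances refer to the dimensions of the averaged embedding, not to individual words, so the labels `dim_0`, `dim_1`, ... are an assumption chosen for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline


class MeanEmbeddingVectorizer(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in: averages the word vectors of each tokenized document."""

    def __init__(self, word2vec):
        self.word2vec = word2vec  # mapping: word -> 1-D numpy vector

    def fit(self, X, y=None):
        # Infer the embedding size from any vector in the mapping.
        self.dim_ = len(next(iter(self.word2vec.values())))
        return self

    def transform(self, X):
        # Documents with no known word fall back to a zero vector.
        return np.array([
            np.mean([self.word2vec[w] for w in doc if w in self.word2vec]
                    or [np.zeros(self.dim_)], axis=0)
            for doc in X
        ])


# Toy 4-dimensional "word2vec" vectors (assumption, for illustration only).
rng = np.random.default_rng(0)
vocab = ["good", "great", "bad", "awful", "service", "food"]
w2v = {word: rng.normal(size=4) for word in vocab}

X_train = [["good", "food"], ["great", "service"],
           ["bad", "food"], ["awful", "service"]]
y_train = [1, 1, 0, 0]

rf = Pipeline([
    ("word2vec vectorizer", MeanEmbeddingVectorizer(w2v)),
    ("Random_forest", RandomForestClassifier(n_estimators=100, max_depth=6, random_state=0)),
])
rf.fit(X_train, y_train)

# The Pipeline itself has no feature_importances_; the fitted forest does.
forest = rf.named_steps["Random_forest"]
importances = forest.feature_importances_

# One importance per embedding dimension, sorted from most to least important.
dim_labels = [f"dim_{i}" for i in range(len(importances))]
ranked = sorted(zip(dim_labels, importances), key=lambda pair: pair[1], reverse=True)
for label, score in ranked:
    print(f"{label}: {score:.3f}")
```

The `ranked` list can be passed to a plotting function in place of the pipeline object, which avoids the AttributeError from the question.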
Hope this helps.