AttributeError: 'CountVectorizer' object has no attribute 'get_feature_names'

DeV*_*r12 1 python machine-learning scikit-learn

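This code used to run without any errors. It is part of a sentiment-analysis machine learning project. The code fits a logistic regression model on word counts: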

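import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

c = CountVectorizer(stop_words = 'english')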

def text_fit(X, y, model,clf_model,coef_show=1):
    
    X_c = model.fit_transform(X)
    print('# features: {}'.format(X_c.shape[1]))
    X_train, X_test, y_train, y_test = train_test_split(X_c, y, random_state=0)
    print('# train records: {}'.format(X_train.shape[0]))
    print('# test records: {}'.format(X_test.shape[0]))
    clf = clf_model.fit(X_train, y_train)
    acc = clf.score(X_test, y_test)
    print ('Model Accuracy: {}'.format(acc))
    
    if coef_show == 1: 
        w = model.get_feature_names()  # the AttributeError is raised here
        coef = clf.coef_.tolist()[0]
        coeff_df = pd.DataFrame({'Word' : w, 'Coefficient' : coef})
        coeff_df = coeff_df.sort_values(['Coefficient', 'Word'], ascending=[0, 1])
        print('')
        print('-Top 20 positive-')
        print(coeff_df.head(20).to_string(index=False))
        print('')
        print('-Top 20 negative-')        
        print(coeff_df.tail(20).to_string(index=False))
    
text_fit(X, y, c, LogisticRegression())

I deleted the project and created a new one, and the code worked. A few days later, though, it started raising the same error again.

小智 5

According to the documentation, the method is named `get_feature_names_out`. Try changing the offending line to:

w = model.get_feature_names_out()

  • For clarity, this [changed in 1.0](https://scikit-learn.org/stable/auto_examples/release_highlights/plot_release_highlights_1_0_0.html#feature-names-support) (2 upvotes)
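
If the same code has to run in environments with different scikit-learn versions (which may be why the error came and went here), one option is a small fallback. The sketch below is only an illustration: `feature_names` is a hypothetical helper, not part of scikit-learn, and the two-document corpus is made up. It relies on `get_feature_names_out` being available from 1.0 onward and on the old `get_feature_names` having been removed in 1.2.

```python
import sklearn
from sklearn.feature_extraction.text import CountVectorizer


def feature_names(vectorizer):
    """Hypothetical helper: return the fitted vocabulary as a plain list."""
    if hasattr(vectorizer, "get_feature_names_out"):
        # scikit-learn >= 1.0: returns a NumPy array of strings
        return vectorizer.get_feature_names_out().tolist()
    # Older releases only provide get_feature_names(), which returns a list
    return list(vectorizer.get_feature_names())


# Made-up toy corpus, just to exercise the helper
vec = CountVectorizer()
vec.fit(["great film", "awful film"])

print(sklearn.__version__)   # shows which release this environment uses
print(feature_names(vec))    # ['awful', 'film', 'great']
```

In the question's `text_fit`, the line `w = model.get_feature_names()` could then become `w = feature_names(model)`. Printing `sklearn.__version__` in both the working and the failing environment should show whether a version difference is the cause.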