ValueError: Iterable over raw text documents expected, string object received. Using tfidf and feature selection to predict new test data

Sdt*_*dtv 1 python machine-learning pandas scikit-learn tensorflow

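I built a model with the sklearn Naive Bayes classifier. I need to know how to predict the class of a sentence that is typed in as input.

When I hard-code the sentence, it works fine. It looks like this: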

new_sentence = ['its so broken']
# Convert the sentence to a TF-IDF matrix using the vectorizer fitted on the training data
new_testdata_tfidf = tfidf.transform(new_sentence)
# Apply the chi2 feature selection that I fitted after TF-IDF, to see which features are kept
fit_feature_selection = selection.transform(new_testdata_tfidf)
# Predict the class; the output is -1, which is the correct answer
predicted = classifier.predict(fit_feature_selection)

I need to enter the text by hand as input, so I changed it to this:

new_sentence = input('')
# I type the same sentence: its so broken
new_testdata_tfidf = tfidf.transform(new_sentence)
# Convert the sentence to a TF-IDF matrix using the vectorizer fitted on the training data
fit_feature_selection = selection.transform(new_testdata_tfidf)
# Apply the chi2 feature selection that I fitted after TF-IDF, to see which features are kept
predicted = classifier.predict(fit_feature_selection)

But it gives me this output:

  File "C:\Users\Myfile\OneDrive\Desktop\model.py", line 170, in <module>
   new_testdata_tfidf= tfidf.transform(new_sentence) 

  File "E:\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py", line 1898, in transform
    X = super().transform(raw_documents)

  File "E:\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py", line 1265, in transform
    "Iterable over raw text documents expected, "

ValueError: Iterable over raw text documents expected, string object received.

How can I solve this? Any help is really appreciated.

Dav*_*bii 7

Have you tried passing the new sentence as a list (array)? I.e.:

new_testdata_tfidf= tfidf.transform([new_sentence])

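In your first example you pass a list containing one string element; in the second you pass just a bare string, which is what transform rejects.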
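If the text can arrive either as one typed sentence or as several sentences, a small wrapper keeps the call to transform uniform. This is only a sketch: `as_documents` is a hypothetical helper name, not part of scikit-learn, and `tfidf` in the last comment is assumed to be the fitted vectorizer from the question.

```python
def as_documents(text_or_texts):
    """Return a list of documents suitable for a vectorizer's transform().

    A bare str is wrapped in a one-element list; any other iterable of
    strings is converted to a list unchanged.
    """
    if isinstance(text_or_texts, str):
        return [text_or_texts]
    return list(text_or_texts)


# A bare string is itself iterable, but it iterates over characters,
# so it cannot stand in for a collection of documents.
sentence = "its so broken"        # what input() would return
print(list(sentence)[:3])         # ['i', 't', 's']
print(as_documents(sentence))     # ['its so broken']
print(as_documents(["a", "b"]))   # ['a', 'b']

# Usage with the fitted objects from the question (not defined here):
# new_testdata_tfidf = tfidf.transform(as_documents(input()))
```

For a single sentence from input(), writing tfidf.transform([new_sentence]) directly is just as good; the helper only pays off when the same code path handles both a single string and a list of strings.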