ValueError: Iterable over raw text documents expected, string object received. Predicting new test data using tfidf and feature selection

Question

Sdt*_*dtv 1 python machine-learning pandas scikit-learn tensorflow

I built a model with the sklearn naive Bayes classifier. I need to know how to predict the class of a sentence that is typed in as input.

When I just hard-code the sentence, it works fine. It looks like this:

new_sentence = ['its so broken']
# Transform the sentence into a TF-IDF matrix using the vectorizer fitted on the training data
new_testdata_tfidf = tfidf.transform(new_sentence)
# Apply the chi2 feature selection used after TF-IDF, so the same features are kept or removed
fit_feature_selection = selection.transform(new_testdata_tfidf)
# Predict the class; the output is -1, which is the correct answer
predicted = classifier.predict(fit_feature_selection)

I need to type the text in by hand as input, so I use this:

new_sentence = input('')
# I type in the same sentence: its so broken
# Transform the sentence into a TF-IDF matrix using the vectorizer fitted on the training data
new_testdata_tfidf = tfidf.transform(new_sentence)
# Apply the chi2 feature selection used after TF-IDF, so the same features are kept or removed
fit_feature_selection = selection.transform(new_testdata_tfidf)
predicted = classifier.predict(fit_feature_selection)

But it gives me this output:

  File "C:\Users\Myfile\OneDrive\Desktop\model.py", line 170, in <module>
   new_testdata_tfidf= tfidf.transform(new_sentence) 

  File "E:\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py", line 1898, in transform
    X = super().transform(raw_documents)

  File "E:\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py", line 1265, in transform
    "Iterable over raw text documents expected, "

ValueError: Iterable over raw text documents expected, string object received.

How can I solve this? Any help is really appreciated.

Answer 1

Dav*_*bii 7

Have you tried passing the new sentence as a list? That is:

new_testdata_tfidf= tfidf.transform([new_sentence])

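Your first example passes a list with one string element, while the second passes just a bare string. As the error message says, transform expects an iterable of raw text documents, so a bare string is rejected.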
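To make the difference concrete, here is a minimal sketch of the whole flow. It is not the asker's actual model: the training texts, labels, and k=3 are invented for illustration, and as_documents is a hypothetical helper, not part of scikit-learn. It assumes the pipeline is TfidfVectorizer, then SelectKBest with chi2, then MultinomialNB, which matches the description in the question but is not confirmed by it.

```python
def as_documents(text):
    """Return a list of documents suitable for a vectorizer's transform().

    Hypothetical helper (not part of scikit-learn): a single string is
    wrapped in a one-element list; any other iterable is copied to a list.
    """
    if isinstance(text, str):
        return [text]
    return list(text)


def demo():
    # Imported here so the helper above can be used without scikit-learn.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.feature_selection import SelectKBest, chi2
    from sklearn.naive_bayes import MultinomialNB

    # Toy training data, invented for illustration only.
    train_texts = [
        "its so broken",
        "totally broken and useless",
        "works great",
        "really great product",
    ]
    train_labels = [-1, -1, 1, 1]

    # Fit each stage on the training data, in the order described in the question.
    tfidf = TfidfVectorizer()
    train_tfidf = tfidf.fit_transform(train_texts)
    selection = SelectKBest(chi2, k=3)  # k=3 is an arbitrary choice for this toy set
    train_selected = selection.fit_transform(train_tfidf, train_labels)
    classifier = MultinomialNB().fit(train_selected, train_labels)

    # Stands in for: new_sentence = input('')
    new_sentence = "its so broken"

    # Wrap the bare string before transforming; passing the string itself
    # is what raises the ValueError shown in the question.
    new_tfidf = tfidf.transform(as_documents(new_sentence))
    new_selected = selection.transform(new_tfidf)
    return classifier.predict(new_selected)


if __name__ == "__main__":
    print(demo())
```

Wrapping the value inline as [new_sentence] is just as good for a one-off call; the helper only saves you from double-wrapping when the same code path sometimes receives a list already.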
