import nltk
import random
from nltk.corpus import movie_reviews

documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]
random.shuffle(documents)
#print(documents[1])

all_words = []
for w in movie_reviews.words():
    all_words.append(w.lower())
all_words = nltk.FreqDist(all_words)
word_features = list(all_words.keys())[:3000]

def find_features(document):
    words = set(document)
    features=[]
    for w in word_features:
        features[w]= (w in words)
    return features

print((find_features(movie_reviews.words('neg/cv000_29416.txt'))))
featuresets = [(find_features(rev), category) for (rev, category) in documents]
After running it, I get this error:
features[w]= (w in words)
TypeError: list indices must be integers, not str
Please help me resolve this.
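The error message says that `features` is a list, and a list can only be indexed by integers, while `features[w]` uses a string as the index. A minimal sketch of the usual fix is to make `features` a dict. The `word_features` and `sample_review` values below are hypothetical stand-ins, so the example runs without downloading the corpus:

```python
# Hypothetical stand-in for the 3000 words taken from the FreqDist in the question.
word_features = ["plot", "great", "boring", "actor"]

def find_features(document):
    """Map each feature word to True/False depending on whether it occurs in the document."""
    words = set(document)
    features = {}  # a dict accepts string keys; a list does not
    for w in word_features:
        features[w] = (w in words)
    return features

# Hypothetical tokenised review, standing in for movie_reviews.words(...).
sample_review = ["a", "great", "plot", "and", "a", "great", "cast"]
features = find_features(sample_review)
print(features)
```

In the original script the only change needed is `features = {}` in place of `features=[]`; the rest of `find_features` can stay as it is.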
I am trying to convert a PDF file to image files on an Ubuntu server I set up.
My code:
from pdf2image import convert_from_path, convert_from_bytes
images = convert_from_path("/home/user/pdf_file.pdf")
# OR
with open("/home/user/pdf_file.pdf") as pdf:
    images = convert_from_bytes(pdf.read())
Output
When I use the function convert_from_path:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator
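The paths in the traceback show the code running under Python 2.7. There, `next(obj)` looks for a method named `next`, whereas Python 3 looks for `__next__`, so a wrapper class that defines only `__next__` is most likely what triggers "object is not an iterator". The sketch below illustrates that difference with a hypothetical class; it is not pdf2image's actual source, and the names `ThreadSafeIterator` and `names` are assumptions made for the example:

```python
import threading

class ThreadSafeIterator(object):
    """Hypothetical wrapper that serialises access to an underlying iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may advance the wrapped iterator.
        with self._lock:
            return next(self._it)

    # Python 2 calls .next(); without this alias, next(obj) raises
    # "TypeError: ... object is not an iterator" under Python 2.
    next = __next__

names = ThreadSafeIterator("page-%d" % i for i in range(3))
first = next(names)
rest = list(names)
print(first, rest)
```

If this is indeed the cause, running the script under Python 3 is probably the simplest way out, since the wrapper class itself lives inside the installed library.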
When I use the function convert_from_bytes:
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 268, in convert_from_bytes
    paths_only=paths_only,
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in …

The output of my list of most common words is as follows:
[('film', 904), ('movie', 561), ('one', 379), ('like', 292)]
I want a matplotlib graph that plots each word against its count.
Please help me.
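One way to draw this is a bar chart with the words on the x-axis and the counts as bar heights. The sketch below uses sample pairs in the same shape as the list above; the labels `film` and `movie`, the variable name `most_common`, and the output file name are assumptions made for the example:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt

# Sample (word, count) pairs in the shape of the list from the question.
most_common = [("film", 904), ("movie", 561), ("one", 379), ("like", 292)]

# Split the pairs into one tuple of labels and one tuple of counts.
words, counts = zip(*most_common)

fig, ax = plt.subplots()
bars = ax.bar(words, counts)  # one bar per word, height equal to its count
ax.set_xlabel("Word")
ax.set_ylabel("Count")
ax.set_title("Most common words")
fig.savefig("most_common_words.png")
```

For a longer list, `ax.barh` (horizontal bars) may keep the word labels readable.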