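I am trying to compute word frequencies for a 1.2 GB text file containing roughly 203 million words. I am using the Python code below, but it gives me a memory error. Is there any solution for this?
Here is my code: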
import re
# this one in honor of 4th July, or pick text file you have!!!!!!!
filename = 'inputfile.txt'
# create list of lower case words, \s+ --> match any whitespace(s)
# you can replace file(filename).read() with given string
word_list = re.split(r'\s+', file(filename).read().lower())
print 'Words in text:', len(word_list)
# create dictionary of word:frequency pairs
freq_dic = {}
# punctuation marks to be removed
punctuation = re.compile(r'[.?!,":;]')
for word in word_list:
    # remove punctuation marks
    word = punctuation.sub("", …

I am trying to create an API that takes an image URL as input and returns a color palette in JSON format as output.
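It should work something like this: http://lokeshdhakar.com/projects/color-thief/
But it should be in Python. I have looked into PIL (Python Imaging Library) but did not get what I wanted. Can someone point me in the right direction?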
Input: Image URL
Output: List of Colors as a palette
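
One possible direction for the Input/Output pair above, as a rough sketch only: snap every pixel to a coarse color bucket, count the buckets, and return the most common ones as JSON. This is plain histogram bucketing, not necessarily the algorithm Color Thief uses. The function names, the `step` bucket width and the `{"palette": ...}` JSON shape are all assumptions made for illustration, and the loader assumes Python 3 with Pillow (the maintained fork of PIL) installed.

```python
import json
from collections import Counter

def palette_from_pixels(pixels, num_colors=5, step=32):
    """Return the most common colors after coarse quantization.

    pixels is an iterable of (r, g, b) tuples with values 0-255.
    Each channel is snapped to the centre of a bucket `step` wide,
    so near-identical shades are counted together.
    """
    buckets = Counter()
    for r, g, b in pixels:
        key = tuple((c // step) * step + step // 2 for c in (r, g, b))
        buckets[key] += 1
    return [list(color) for color, _ in buckets.most_common(num_colors)]

def load_pixels_from_url(url, max_size=(100, 100)):
    """Download an image and return its pixels as (r, g, b) tuples."""
    # Local imports: palette_from_pixels stays usable without Pillow.
    from io import BytesIO
    from urllib.request import urlopen
    from PIL import Image

    with urlopen(url) as response:
        image = Image.open(BytesIO(response.read())).convert("RGB")
    image.thumbnail(max_size)  # shrink in place: fewer pixels to count
    raw = image.tobytes()      # RGB mode: 3 bytes per pixel
    return [tuple(raw[i:i + 3]) for i in range(0, len(raw), 3)]

def palette_json(pixels, num_colors=5):
    """Hypothetical output format: {"palette": [[r, g, b], ...]}."""
    return json.dumps({"palette": palette_from_pixels(pixels, num_colors)})

# Small in-memory demo, no network or Pillow needed.
demo_pixels = [(250, 10, 10)] * 3 + [(10, 10, 250)] * 2 + [(12, 12, 12)]
demo_result = palette_json(demo_pixels, num_colors=2)
print(demo_result)  # {"palette": [[240, 16, 16], [16, 16, 240]]}
```

With a real image the call would look like `palette_json(load_pixels_from_url(url))`. A smaller `step` keeps more distinct shades apart; a larger one merges them more aggressively.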
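
For the word-frequency question at the top of this page, the memory error most likely comes from reading the whole 1.2 GB file into one string and then holding a list of about 203 million words on top of it. A minimal sketch of one way around that, assuming Python 3 and a file that contains line breaks: iterate over the file line by line and update a `collections.Counter` as you go. The function names and the `encoding` default are assumptions; the punctuation pattern is the one from the question.

```python
import re
from collections import Counter

# Same punctuation set as in the question's code.
PUNCTUATION = re.compile(r'[.?!,":;]')

def count_words(lines):
    """Count word frequencies from any iterable of text lines."""
    freq = Counter()
    for line in lines:
        # str.split() with no argument splits on runs of whitespace.
        for word in line.lower().split():
            word = PUNCTUATION.sub("", word)
            if word:  # skip tokens that were only punctuation
                freq[word] += 1
    return freq

def count_words_in_file(filename, encoding="utf-8"):
    """Stream the file so only one line is held in memory at a time."""
    with open(filename, encoding=encoding) as handle:
        return count_words(handle)

# Small in-memory demo; a real run would call
# count_words_in_file('inputfile.txt') instead.
sample = ["The cat sat.", "The dog, the cat!"]
freq = count_words(sample)
print("Words in text:", sum(freq.values()))  # Words in text: 7
print(freq.most_common(2))  # [('the', 3), ('cat', 2)]
```

Memory use then grows with the number of distinct words rather than the total word count. If the file has no line breaks at all, a single "line" would still be the whole file, and reading fixed-size chunks would be needed instead.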