use*_*375 3 python numpy genfromtxt
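I don't understand why numpy.genfromtxt fails to split the following string correctly with delimiter=",", even though it works for most of the other strings in my chunk.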
chunk[12968]
Out[143]: '2901869281,3279442095,2012-12-15T23:00:00.003Z,Sacramento,CA,R#3817874,United States,38.583,-121.498,11, 8, 6, 5, 1, 0, 2, 3, 3, 5, 3, 3, 2, 2, 6, 6, 1, 2, 3, 0, 1, 1, 0, 0, 2, 2, 2, 2, 1, 0, 0, 2, 1, 0, 1, 1, 2, 0, 3, 1, 1, 1, 1, 0, 0, 4, 0, 0, 0, 1, 3, 1, 0, 2, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 2, 0, 9, 0, 0, 0, 2, 3, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0,130\n'
I expected an array of shape (110,), but got the following:
genfromtxt([chunk[12968]],delimiter=",",dtype=np.int64)
Out[142]:
array([2901869281, 3279442095, -1, -1, -1,
-1], dtype=int64)
Note that I use izip_longest from itertools to read the csv in large chunks, like this:
from itertools import izip_longest  # Python 2 name; zip_longest in Python 3

with open('events.csv', 'r') as f:
    for chunk in izip_longest(*[f] * 50000):
        ...
Thanks for the help.
The comments parameter of genfromtxt() defaults to '#', so everything in the input after the # is ignored:
2901869281,3279442095,2012-12-15T23:00:00.003Z,Sacramento,CA,R#3817874,United States,...
                                                              ^ start of comment
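One way to work around this is to turn comment handling off by passing comments=None. The following is a minimal sketch, not a drop-in fix: the shortened sample line and the choice of dtype=str (used here only so that every field stays visible) are assumptions made for illustration.

```python
import numpy as np

# Shortened, illustrative version of the problematic row; it keeps the
# 'R#3817874' field that contains the '#' character.
line = "2901869281,3279442095,R#3817874,38.583,-121.498,11, 8, 6\n"

# Default behaviour: comments='#', so the row is cut off right after 'R'.
default = np.genfromtxt([line], delimiter=",", dtype=str)

# comments=None disables comment detection, so the whole row is split.
no_comments = np.genfromtxt([line], delimiter=",", dtype=str, comments=None)

print(default.size)      # number of fields found with the default setting
print(no_comments.size)  # number of fields found with comments disabled
```

With comment handling disabled, the non-numeric columns still cannot be parsed as np.int64, so selecting only the numeric columns (for example via the usecols parameter) is probably still needed to get the expected integer array.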