Tha*_*hav 8 python csv numpy genfromtxt
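I am trying to process data saved to CSV that may have missing values in an unknown number of columns (up to about 30). I am trying to set those missing values to '0' using genfromtxt's filling_values argument. Here is a minimal working example for numpy 1.6.2 running on ActiveState ActivePython 2.7 32-bit on Win 7.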
import numpy

# Write a small CSV whose second data row has an empty field in column "b".
text = "a,b,c,d\n1,2,3,4\n5,,7,8"
with open('test.txt','w') as f:
    f.write(text)

# Baseline: without filling_values the empty field is read as nan.
a = numpy.genfromtxt('test.txt',delimiter=',',names=True)
print "plain",a
a = numpy.genfromtxt('test.txt',delimiter=',',names=True,filling_values=0)
print "filling_values=0",a
a = numpy.genfromtxt('test.txt',delimiter=',',names=True,filling_values={1:0})
print "filling_values={1:0}",a
a = numpy.genfromtxt('test.txt',delimiter=',',names=True,filling_values={0:0})
print "filling_values={0:0}",a
a = numpy.genfromtxt('test.txt',delimiter=',',names=True,filling_values={None:0})
print "filling_values={None:0}",a
The results are as follows:
plain [(1.0, 2.0, 3.0, 4.0) (5.0, nan, 7.0, 8.0)]
filling_values=0 [(1.0, 2.0, 3.0, 4.0) (5.0, nan, 7.0, 8.0)]
filling_values={1:0} [(1.0, 2.0, 3.0, 4.0) (5.0, 0.0, 7.0, 8.0)]
filling_values={0:0} [(1.0, 2.0, 3.0, 4.0) (5.0, nan, 7.0, 8.0)]
Traceback (most recent call last):
File "C:\Users\tolivo.EE\Documents\active\eng\python\sizer\testGenfromtxt.py", line 20, in <module>
a = numpy.genfromtxt('test.txt',delimiter=',',names=True,filling_values={None:0})
File "C:\Users\tolivo.EE\AppData\Roaming\Python\Python27\site-packages\numpy\lib\npyio.py", line 1451, in genfromtxt
filling_values[key] = val
TypeError: list indices must be integers, not NoneType
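From the NumPy user guide I would expect filling_values=0 and filling_values={None:0} to work, but the first has no effect and the second raises an error. Specifying the correct column (filling_values={1:0}) does work, but since I have a large, unknown number of columns before the user makes a selection, I am looking for a way to set the filling value automatically, as the user guide hints is possible.
I suppose I could count the columns ahead of time and build a dict to pass to filling_values, but is there a better way?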
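The column-counting workaround mentioned above could look roughly like the sketch below. It is only an illustration: the helper name `fill_dict_for` is hypothetical, the data is read from an in-memory `io.StringIO` buffer rather than `test.txt`, and it is written for Python 3 with a recent NumPy rather than the Python 2.7 / NumPy 1.6.2 setup in the question.

```python
import io

import numpy as np


def fill_dict_for(header_line, delimiter=",", fill=0):
    """Hypothetical helper: map every column index in the header to `fill`."""
    n_cols = len(header_line.strip().split(delimiter))
    return {i: fill for i in range(n_cols)}


# Same sample data as in the question, kept in memory for the sketch.
text = "a,b,c,d\n1,2,3,4\n5,,7,8\n"

# Count the columns from the header line and build {0: 0, 1: 0, ...}.
header = text.splitlines()[0]
fills = fill_dict_for(header)

# Pass the per-column dict, which is the form that worked in the question.
data = np.genfromtxt(io.StringIO(text), delimiter=",", names=True,
                     filling_values=fills)

print(fills)
print(data["b"])
```

The dict is keyed by column index because that is the form the question reports as working (`filling_values={1:0}`); the cost is reading the header line once before calling `genfromtxt`.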
It is not obvious from the documentation, but filling_values="0" works.
In [19]: !cat test.txt
a,b,c,d
1,2,3,4
5,,7,8
9,10,,12
In [20]: a = numpy.genfromtxt('test.txt', delimiter=',', names=True, filling_values="0")
In [21]: print a
[(1.0, 2.0, 3.0, 4.0) (5.0, 0.0, 7.0, 8.0) (9.0, 10.0, 0.0, 12.0)]