My CSV looks like this (MQM Q.csv):
Date-Time,Value,Grade,Approval,Interpolation Code
31/08/2012 12:15:00,,41,1,1
31/08/2012 12:30:00,,41,1,1
31/08/2012 12:45:00,,41,1,1
31/08/2012 13:00:00,,41,1,1
31/08/2012 13:15:00,,41,1,1
31/08/2012 13:30:00,,41,1,1
31/08/2012 13:45:00,,41,1,1
31/08/2012 14:00:00,,41,1,1
31/08/2012 14:15:00,,41,1,1
The first few rows have no "Value" entry, but values start later in the file.
Here is my code:
import pandas as pd
from StringIO import StringIO
Q = pd.read_csv(StringIO("""/cygdrive/c/temp/MQM Q.csv"""), header=0, usecols=["Date-Time", "Value"], parse_dates=True, dayfirst=True, index_col=0)
I get the following error:
Traceback (most recent call last):
File "daily.py", line 4, in <module>
Q = pd.read_csv(StringIO("""/cygdrive/c/temp/MQM Q.csv"""), header=0, usecols=["Date-Time", "Value"], parse_dates=True, dayfirst=True, index_col=0)
File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 443, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 228, in _read
parser = TextFileReader(filepath_or_buffer, **kwds)
File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 533, in __init__
self._make_engine(self.engine)
File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 670, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/usr/lib/python2.7/site-packages/pandas-0.14.0-py2.7-cygwin-1.7.30-x86_64.egg/pandas/io/parsers.py", line 1067, in __init__
col_indices.append(self.names.index(u))
ValueError: 'Value' is not in list
This looks like a bug in the CSV parser. To start with, this works:
df = pd.read_csv('MQM Q.csv')
This works too:
df = pd.read_csv('MQM Q.csv', usecols=['Value'])
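But if I also ask for Date-Time, it fails with the same error message you got.
I then noticed the file was UTF-8 encoded, so I converted it to ANSI with Notepad++ and it worked. I then tried UTF-8 without a BOM, and that worked too.
Finally I converted it back to UTF-8 (presumably now with a BOM) and it failed with the same error as before, so I don't think you are imagining this; it looks like a bug.
I'm using Python 3.3, pandas 0.14 and numpy 1.8.1.
To work around it, select the columns by position instead of by name: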
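To illustrate why a BOM could break a lookup by column name, here is a minimal sketch that uses only the standard library. The `header_names` helper is hypothetical and the two-line sample is copied from the question. It shows the mechanism I suspect, not pandas' actual internals: the BOM stays attached to the first header name unless the bytes are decoded as `utf-8-sig`.

```python
import codecs
import csv
import io

# Two lines of the CSV from the question, encoded as UTF-8 with a leading BOM.
raw = codecs.BOM_UTF8 + (
    "Date-Time,Value,Grade,Approval,Interpolation Code\n"
    "31/08/2012 12:15:00,,41,1,1\n"
).encode("utf-8")


def header_names(data, encoding):
    """Decode the bytes and return the column names from the first CSV row."""
    text = io.StringIO(data.decode(encoding), newline="")
    return next(csv.reader(text))


with_bom = header_names(raw, "utf-8")         # BOM stays glued to the first name
without_bom = header_names(raw, "utf-8-sig")  # BOM is stripped while decoding

print(repr(with_bom[0]))
print("Date-Time" in with_bom)     # False
print("Date-Time" in without_bom)  # True
```

The other column names are unaffected, which would fit the observation that `usecols=['Value']` on its own works.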
df = pd.read_csv('MQM Q.csv', usecols=[0,1], parse_dates=True, dayfirst=True, index_col=0)
This sets your index to the Date-Time column, which is correctly converted to a DatetimeIndex.
In [40]:
df.index
Out[40]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2012-08-31 12:15:00, ..., 2013-11-28 10:45:00]
Length: 43577, Freq: None, Timezone: None
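If you would rather keep selecting columns by name, another option may be to have the BOM stripped while decoding by passing `encoding="utf-8-sig"`. The sketch below rebuilds a small BOM-prefixed sample in memory rather than reading the real file. I have not tried it against pandas 0.14, so treat it as an assumption that it helps on that version; newer releases may handle the BOM on their own.

```python
import codecs
import io

import pandas as pd

# In-memory stand-in for the file: UTF-8 bytes with a leading BOM.
raw = codecs.BOM_UTF8 + (
    "Date-Time,Value,Grade,Approval,Interpolation Code\n"
    "31/08/2012 12:15:00,,41,1,1\n"
    "31/08/2012 12:30:00,,41,1,1\n"
).encode("utf-8")

df = pd.read_csv(
    io.BytesIO(raw),
    encoding="utf-8-sig",            # drop the BOM before the header is parsed
    usecols=["Date-Time", "Value"],  # select by name, as in the question
    index_col=0,                     # Date-Time becomes the index
    parse_dates=True,                # parse the index as datetimes
    dayfirst=True,                   # 31/08/2012 is day/month/year
)

print(df.index)
```

With the real file you would pass its path instead of the `io.BytesIO` buffer.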