Posts by use*_*286

Is it possible to read only specific rows with read_csv?

I have a csv file that looks like this:

TEST  
2012-05-01 00:00:00.203 ON 1  
2012-05-01 00:00:11.203 OFF 0  
2012-05-01 00:00:22.203 ON 1  
2012-05-01 00:00:33.203 OFF 0  
2012-05-01 00:00:44.203 OFF 0  
TEST  
2012-05-02 00:00:00.203 OFF 0  
2012-05-02 00:00:11.203 OFF 0  
2012-05-02 00:00:22.203 OFF 0  
2012-05-02 00:00:33.203 OFF 0  
2012-05-02 00:00:44.203 ON 1  
2012-05-02 00:00:55.203 OFF 0  

and I cannot get rid of the "TEST" strings.

Is it possible to check whether a line starts with a date and read only those lines?
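
One way to approach this is to filter the lines before pandas ever sees them. The sketch below is only an illustration under stated assumptions: the helper name `keep_date_lines` and the `DATE_PREFIX` pattern are made up for this example, and the pattern assumes every data row starts with a `YYYY-MM-DD` date followed by a space.

```python
import io
import re

# Assumption: data rows begin with an ISO-style date such as "2012-05-01 ".
DATE_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2} ")


def keep_date_lines(lines):
    """Return an in-memory text buffer holding only lines that start with a date.

    Trailing whitespace is stripped so that a space separator does not
    produce empty extra columns later on.
    """
    kept = [line.rstrip() + "\n" for line in lines if DATE_PREFIX.match(line)]
    return io.StringIO("".join(kept))


# A few rows copied from the sample file, including the unwanted "TEST" lines.
sample = [
    "TEST  \n",
    "2012-05-01 00:00:00.203 ON 1  \n",
    "2012-05-01 00:00:11.203 OFF 0  \n",
    "TEST  \n",
    "2012-05-02 00:00:00.203 OFF 0  \n",
]

buf = keep_date_lines(sample)
filtered_text = buf.getvalue()
```

The returned buffer can then be handed to `pandas.read_csv(buf, sep=' ', header=None)`. Any iterable of lines works as input, so an open file object can be passed instead of the `sample` list.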

python csv pandas

7
votes
1
answer
7375
views

Python Pandas: what is the fastest way to create a datetime index?

My data looks like this:

TEST
2012-05-01 00:00:00.203 OFF 0
2012-05-01 00:00:11.203 OFF 0
2012-05-01 00:00:22.203 ON 1
2012-05-01 00:00:33.203 ON 1
2012-05-01 00:00:44.203 OFF 0
TEST
2012-05-02 00:00:00.203 OFF 0
2012-05-02 00:00:11.203 OFF 0
2012-05-02 00:00:22.203 OFF 0
2012-05-02 00:00:33.203 ON 1
2012-05-02 00:00:44.203 ON 1
2012-05-02 00:00:55.203 OFF 0

I am using pandas read_table to read a pre-parsed string (with the "TEST" lines already removed), as follows:

df = pandas.read_table(buf, sep=' ', header=None, parse_dates=[[0, 1]], date_parser=dateParser, index_col=[0])

So far I have tried several date parsers; the one left uncommented is the fastest.

def dateParser(s):
    #return datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f")
    return datetime(int(s[0:4]), int(s[5:7]), int(s[8:10]), int(s[11:13]), int(s[14:16]), int(s[17:19]), int(s[20:23])*1000)
    #return np.datetime64(s)
    #return pandas.Timestamp(s, …
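
An alternative worth measuring is to skip the per-row `date_parser` entirely: read the date and time as plain string columns, then convert the whole column in one call to `pandas.to_datetime` with an explicit format. The following is a minimal sketch, not the asker's code; the column names `date`, `time`, `state` and `value` are assumptions chosen for illustration, and whether this beats the hand-written parser would need to be timed on the real data.

```python
import io

import pandas as pd

# A few rows from the sample data, with the "TEST" lines already removed.
raw = (
    "2012-05-01 00:00:00.203 OFF 0\n"
    "2012-05-01 00:00:11.203 OFF 0\n"
    "2012-05-01 00:00:22.203 ON 1\n"
)

# Read date and time as ordinary string columns; no Python-level parser
# is invoked per row. The column names are hypothetical.
df = pd.read_csv(
    io.StringIO(raw),
    sep=" ",
    header=None,
    names=["date", "time", "state", "value"],
)

# Join the two string columns and convert them in a single vectorized call.
stamps = pd.to_datetime(
    df["date"] + " " + df["time"],
    format="%Y-%m-%d %H:%M:%S.%f",
)

# Replace the string columns with a proper DatetimeIndex.
df = df.drop(columns=["date", "time"])
df.index = pd.DatetimeIndex(stamps, name="timestamp")
```

Passing `format` explicitly spares pandas from inferring the layout of each string, which is usually where the time goes when parsing timestamps row by row.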

python performance datetime parsing pandas

4
votes
1
answer
6264
views

Tag statistics

pandas ×2

python ×2

csv ×1

datetime ×1

parsing ×1

performance ×1