sec*_*ind 2 python linux csv awk matplotlib
I have a continuously growing CSV file that looks like this:
143100, 2012-05-21 09:52:54.165852
125820, 2012-05-21 09:53:54.666780
109260, 2012-05-21 09:54:55.144712
116340, 2012-05-21 09:55:55.642197
125640, 2012-05-21 09:56:56.094999
122820, 2012-05-21 09:57:56.546567
124770, 2012-05-21 09:58:57.046050
103830, 2012-05-21 09:59:57.497299
114120, 2012-05-21 10:00:58.000978
-31549410, 2012-05-21 10:01:58.063470
90390, 2012-05-21 10:02:58.108794
81690, 2012-05-21 10:03:58.161329
80940, 2012-05-21 10:04:58.227664
102180, 2012-05-21 10:05:58.289882
99750, 2012-05-21 10:06:58.322063
87000, 2012-05-21 10:07:58.391256
92160, 2012-05-21 10:08:58.442438
80130, 2012-05-21 10:09:58.506494
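The negative numbers appear whenever the service that generates the file has an API connection failure. I already use matplotlib to plot the data, but the spurious negative values squash the rest of the graph badly. I would like to find all the negative entries and remove the corresponding lines. The negative numbers never represent any real data.
In Bash I would do something like: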
awk '{print $1}' original.csv | sed '/-/d' > new.csv
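But that is clumsy and tends to be slow, and I would really rather not embed bash commands in my Python plotting script if I can help it.
Can anyone point me in the right direction?
Edit:
Here is the code I use to read and plot the data: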
import matplotlib
matplotlib.use('Agg')
from matplotlib.mlab import csv2rec
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from pylab import *
output_image_name='tpm.png'
data = csv2rec('counter.log', names=['packets', 'time'])
rcParams['figure.figsize'] = 10, 5
rcParams['font.size'] = 8
fig = plt.figure()
ax = fig.add_subplot(111)
# Plot packets against time; the field names match the `names` list passed to csv2rec
ax.plot(data['time'], data['packets'])
hours = mdates.HourLocator()
fmt = mdates.DateFormatter('%D - %H:%M')
ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(fmt)
ax.grid()
plt.ylabel("packets")
plt.title("Packet Log: Packets Per Minute")
fig.autofmt_xdate(bottom=0.2, rotation=90, ha='left')
plt.savefig(output_image_name)
The Python idiom is to use a generator expression to filter the lines:
import sys
sys.stdout.writelines(line for line in sys.stdin if not line.startswith('-'))
Or, in a processing context:
filtered = (line for line in sys.stdin if not line.startswith('-'))
for line in filtered:
    # ...
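As a rough sketch of how the same generator-expression filter might feed the plotting script directly, the lines could be filtered and then parsed with the standard `csv` and `datetime` modules. The helper name `read_valid_rows` is hypothetical, and the timestamp format is an assumption based on the sample rows in the question:

```python
import csv
from datetime import datetime


def read_valid_rows(lines):
    """Yield (packets, timestamp) pairs, skipping rows with a negative count.

    `read_valid_rows` is a hypothetical helper name, not part of any library.
    """
    # Same idea as the generator expression above: drop lines starting with '-'.
    filtered = (line for line in lines if not line.lstrip().startswith('-'))
    # skipinitialspace=True drops the space that follows each comma.
    for row in csv.reader(filtered, skipinitialspace=True):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        packets = int(row[0])
        # Assumed format, taken from the sample data in the question.
        when = datetime.strptime(row[1].strip(), '%Y-%m-%d %H:%M:%S.%f')
        yield packets, when


if __name__ == '__main__':
    sample = [
        "114120, 2012-05-21 10:00:58.000978\n",
        "-31549410, 2012-05-21 10:01:58.063470\n",
        "90390, 2012-05-21 10:02:58.108794\n",
    ]
    for packets, when in read_valid_rows(sample):
        print(packets, when)
```

With the real file, something like `with open('counter.log') as f: rows = list(read_valid_rows(f))` should yield pairs that can be split into two lists and passed to `ax.plot(times, packets)`, so no cleaned copy of the file has to be written to disk.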