Luc*_*Lin 5 python logging parsing python-3.x
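I have a log file in which each line contains an IP address, an access time, and the URL that was visited. I want to count the number of accesses per hour.
The access-time data looks like this: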
[01/Jan/2017:14:15:45 +1000]
[01/Jan/2017:14:15:45 +1000]
[01/Jan/2017:15:16:05 +1000]
[01/Jan/2017:16:16:05 +1000]
How can I improve this so that I don't need a separate variable and if statement for each hour?
twoPM = 0
thrPM = 0
fouPM = 0
timeStamp = line.split('[')[1].split(']')[0]
formated_timeStamp = datetime.datetime.strptime(timeStamp,'%d/%b/%Y:%H:%M:%S %z').strftime('%H')
if formated_timeStamp == '14':
    twoPM +=1
if formated_timeStamp == '15':
    thrPM +=1
if formated_timeStamp == '16':
    fouPM +=1
You can include the brackets in the strptime format string:
datetime.datetime.strptime(line.strip(), '[%d/%b/%Y:%H:%M:%S %z]')
You can extract the hour from any datetime.datetime object with its .hour attribute:
timestamp = datetime.datetime.strptime(…)
hour = timestamp.hour
You can count elements with collections.Counter:
import datetime
from collections import Counter


def read_logs(filename):
    # Yield the hour of each timestamp, one per line of the log file.
    with open(filename) as log_file:
        for line in log_file:
            timestamp = datetime.datetime.strptime(
                line.strip(),
                '[%d/%b/%Y:%H:%M:%S %z]')
            yield timestamp.hour


def count_access(log_filename):
    # Counter maps each hour to the number of times it was seen.
    return Counter(read_logs(log_filename))


if __name__ == '__main__':
    print(count_access('/path/to/logs/'))
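The code above parses lines that consist only of the bracketed timestamp. If each line also carries the IP address and URL, as the question describes, something like the following might work. It is only a sketch: the line layout, the `TIMESTAMP_RE` pattern, the `count_hits_per_hour` name, and the sample IP addresses and URLs are all assumptions for illustration, not taken from the question.

```python
import datetime
import re
from collections import Counter

# Assumed layout: the timestamp is the first [...] group on the line,
# e.g. [01/Jan/2017:14:15:45 +1000]
TIMESTAMP_RE = re.compile(r'\[([^\]]+)\]')


def count_hits_per_hour(lines):
    """Return a Counter mapping hour of day (0-23) to the number of lines."""
    hits = Counter()
    for line in lines:
        match = TIMESTAMP_RE.search(line)
        if match is None:
            continue  # skip lines without a bracketed timestamp
        timestamp = datetime.datetime.strptime(
            match.group(1), '%d/%b/%Y:%H:%M:%S %z')
        hits[timestamp.hour] += 1
    return hits


# Hypothetical sample lines; only the timestamps come from the question.
sample = [
    '192.0.2.1 [01/Jan/2017:14:15:45 +1000] /index.html',
    '192.0.2.2 [01/Jan/2017:14:15:45 +1000] /about.html',
    '192.0.2.1 [01/Jan/2017:15:16:05 +1000] /index.html',
    '192.0.2.3 [01/Jan/2017:16:16:05 +1000] /contact.html',
]

for hour, count in sorted(count_hits_per_hour(sample).items()):
    print(f'{hour:02d}:00 {count}')
```

Because the function accepts any iterable of strings, an open file object can be passed in directly in place of the sample list. Note that `%b` matches month names according to the current locale, so `Jan` is only recognized under an English or C locale.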