How do I count the number of hits per hour in log file entries?

Luc*_*Lin 5 python logging parsing python-3.x

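I have a log file in which each line contains an IP address, an access time, and the URL that was accessed. I want to count the number of hits per hour.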

The access time data looks like this:

[01/Jan/2017:14:15:45 +1000]
[01/Jan/2017:14:15:45 +1000]
[01/Jan/2017:15:16:05 +1000]
[01/Jan/2017:16:16:05 +1000] 

How can I improve this so that I don't need a separate variable and if statement for every hour?

twoPM = 0
thrPM = 0
fouPM = 0
timeStamp = line.split('[')[1].split(']')[0]
formated_timeStamp = datetime.datetime.strptime(timeStamp,'%d/%b/%Y:%H:%M:%S %z').strftime('%H')
if formated_timeStamp == '14':
    twoPM +=1
if formated_timeStamp == '15':
    thrPM +=1
if formated_timeStamp == '16':
    fouPM +=1

409*_*ict 3

  1. You can include the brackets in the strptime format string:

    datetime.datetime.strptime(line.strip(), '[%d/%b/%Y:%H:%M:%S %z]')

  2. You can extract the hour from any datetime.datetime object with its .hour attribute:

    timestamp = datetime.datetime.strptime(…)
    hour = timestamp.hour

  3. You can count the elements with collections.Counter:

    import datetime
    from collections import Counter


    def read_logs(filename):
        with open(filename) as log_file:
            for line in log_file:
                timestamp = datetime.datetime.strptime(
                        line.strip(),
                        '[%d/%b/%Y:%H:%M:%S %z]')
                yield timestamp.hour


    def count_access(log_filename):
        return Counter(read_logs(log_filename))


    if __name__ == '__main__':
        print(count_access('/path/to/logs/'))
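
The question says each line also carries an IP address and a URL, in which case passing the whole stripped line to strptime would raise a ValueError. Below is a minimal sketch of one way to handle that, assuming every line contains a single bracketed timestamp in the format shown in the question. The regular expression, the name count_hits_per_hour, and the sample lines (IP addresses and URLs) are illustrative assumptions and not part of the original post.

```python
import datetime
import re
from collections import Counter

# Assumption: the timestamp is the first [...] group on the line.
TIMESTAMP_RE = re.compile(r'\[([^\]]+)\]')
TIMESTAMP_FORMAT = '%d/%b/%Y:%H:%M:%S %z'


def count_hits_per_hour(lines):
    """Return a Counter mapping hour of day (0-23) to the number of lines seen."""
    hits = Counter()
    for line in lines:
        match = TIMESTAMP_RE.search(line)
        if match is None:
            continue  # skip lines without a bracketed timestamp
        try:
            timestamp = datetime.datetime.strptime(match.group(1), TIMESTAMP_FORMAT)
        except ValueError:
            continue  # skip timestamps that do not match the expected format
        hits[timestamp.hour] += 1
    return hits


# Hypothetical sample lines; the IP addresses and URLs are made up.
sample = [
    '10.0.0.1 [01/Jan/2017:14:15:45 +1000] /index.html',
    '10.0.0.2 [01/Jan/2017:14:15:45 +1000] /about.html',
    '10.0.0.1 [01/Jan/2017:15:16:05 +1000] /index.html',
    '10.0.0.3 [01/Jan/2017:16:16:05 +1000] /contact.html',
]

hits = count_hits_per_hour(sample)
for hour in sorted(hits):
    print(f'{hour:02d}:00 {hits[hour]}')
```

Because the function accepts any iterable of strings, an open file object can be passed to it directly. The hour is taken as written in the log (that is, in the logged UTC offset), not converted to another time zone.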