Comparing a list of datetimes with a dict of datetimes

tky*_*ass 14 python datetime dictionary

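My task is to create sets of dates based on a given condition. For example, if "greater than 2" is passed, I need to create the set of all dates in the current month whose day is > 2. I am also given a start time and a stop time, e.g. 10 am to 6 pm; in that case I create the set of all dates > 2, where each day has a time range starting at 10 am and ending at 6 pm. Here is an example: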

greater > 2 less < 9
start time: 10 am
stop time: 6 pm
month: july
date1: 2016-07-03 10:00, 2016-07-03 18:00
date2: 2016-07-04 10:00, 2016-07-04 18:00
date3: 2016-07-05 10:00, 2016-07-05 18:00
.
.
.
date6: 2016-07-08 10:00, 2016-07-08 18:00

I decided to store these dates in a dictionary like the following:

dictD = {'dates_between_2_9':[[2016-07-03 10:00, 2016-07-03 18:00], [2016-07-04 10:00, 2016-07-04 18:00], ....., [2016-07-08 10:00, 2016-07-08 18:00]]}
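For readers who want to reproduce the structure above, here is a minimal sketch of how such a dict could be built. The helper name `build_intervals` and its parameters are hypothetical (they do not appear in the question), and the stop time is taken to be 18:00, i.e. the stated 6 pm.

```python
from datetime import date, datetime, time

def build_intervals(year, month, first_day, last_day, start, stop):
    """Return [start_dt, stop_dt] pairs for each day first_day..last_day (inclusive).

    Hypothetical helper, not part of the original question.
    """
    intervals = []
    for day in range(first_day, last_day + 1):
        d = date(year, month, day)
        intervals.append([datetime.combine(d, start), datetime.combine(d, stop)])
    return intervals

# "greater > 2, less < 9" in July 2016 is read here as days 3 through 8.
dictD = {
    "dates_between_2_9": build_intervals(2016, 7, 3, 8, time(10, 0), time(18, 0)),
}
```

Each additional condition would simply add another key holding its own list of pairs.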

I used a dict because I will have multiple conditions to create date sets for, so there will be other keys besides dates_between_2_9.

On the other hand, I get another request, also based on a condition, to create dates that have only a start time, as follows:

greater > 1 less than 12
start time : 2pm
    date1: 2016-07-02 14:00
    date2: 2016-07-03 14:00
    date3: 2016-07-04 14:00
    .
    .
    .
    date10: 2016-07-11 14:00

I decided to store these dates in a list:

listL = [2016-07-02 14:00,2016-07-03 14:00,2016-07-04 14:00 ... 2016-07-11 14:00]
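As a rough sketch, the list above could be produced with real `datetime` objects like this, assuming the month is July 2016 and that "greater > 1, less than 12" means days 2 through 11:

```python
from datetime import datetime

# One entry per day from the 2nd to the 11th, each at the 14:00 start time.
listL = [datetime(2016, 7, day, 14, 0) for day in range(2, 12)]

print(len(listL))   # number of generated start datetimes
print(listL[0], listL[-1])
```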

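After that, I compare each date in ListL with the list of dates for each key in DictD. If a date from ListL falls within a start/stop time range, I should remove it from the list and return only the dates in ListL that do not overlap with the dates in DictD. My logic is as follows: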

def remove_overlapping(ListL, DictD):
    # Items cannot be removed from a list while iterating over it,
    # so collect the overlapping items in a set and filter afterwards.
    overlapped = set()
    for L in ListL:
        for key in DictD:
            for item in DictD[key]:
                # item is a [start, stop] pair; check whether L falls inside it.
                if item[0] < L < item[1]:
                    overlapped.add(L)
    return [L for L in ListL if L not in overlapped]

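My question is: am I using efficient data structures for my requirements? I can see that my logic is not efficient, so I was wondering whether there is a better way to approach this problem.

Any help would be highly appreciated. Thanks in advance!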

Pet*_*ain 5

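It sounds like you are trying to optimize your algorithm. To be honest, with data of this size it is probably not necessary. However, if you are interested, the general rule of thumb is that sets are faster than lists in Python when checking for membership.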
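To illustrate that rule of thumb in isolation, here is a small sketch that times the same membership test against a list and a set. The names `minutes_list`, `minutes_set` and `probe`, and the collection size, are made up for the example; actual timings depend on the machine.

```python
from datetime import datetime, timedelta
from timeit import timeit

base = datetime(2016, 7, 1)
# 10,000 consecutive minutes, stored once as a list and once as a set.
minutes_list = [base + timedelta(minutes=i) for i in range(10000)]
minutes_set = set(minutes_list)

# The last element is the worst case for a list, which is scanned front to back.
probe = minutes_list[-1]

list_time = timeit(lambda: probe in minutes_list, number=200)
set_time = timeit(lambda: probe in minutes_set, number=200)
print("list:", list_time, "set:", set_time)
```

The list lookup has to compare elements one by one, while the set lookup hashes the value and jumps to it directly, which is why the gap grows with the size of the collection.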

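In this case, it is not clear what your sets might be. I have assumed that you need at most minute-level granularity, but you could go finer (at the cost of more memory) or improve occupancy and performance by choosing a coarser granularity, e.g. hours. This code shows that even relatively large sets can be at least 5 times faster (and look a little simpler when comparing your data sets):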

from copy import copy
from datetime import datetime, timedelta
from timeit import timeit

def make_range(start, open_offset, close_offset, days):
    """Return a list of [open, close] datetime pairs, one per consecutive day."""
    result = []
    base_start = start + open_offset
    base_close = start + close_offset
    while days > 0:
        result.append([base_start, base_close])
        base_start += timedelta(days=1)
        base_close += timedelta(days=1)
        days -= 1
    return result

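def make_range2(start, open_offset, close_offset, days):
    """Return a set holding every minute between open and close on each day."""
    result = set()
    base_start = start + open_offset
    base_close = start + close_offset
    while days > 0:
        now = base_start
        # Enumerate each minute of the day's window, both ends included.
        while now <= base_close:
            result.add(now)
            now += timedelta(minutes=1)
        base_start += timedelta(days=1)
        base_close += timedelta(days=1)
        days -= 1
    return result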

dateRange = {
    'range1': make_range(datetime(2016, 7, 3, 0, 0),
                         timedelta(hours=10),
                         timedelta(hours=18),
                         6),
}

dateRange2 = {
    'range1': make_range2(datetime(2016, 7, 3, 0, 0),
                          timedelta(hours=10),
                          timedelta(hours=18),
                          6),
}

dateList = [
    datetime(2016, 7, 2, 14, 0),
    datetime(2016, 7, 3, 14, 0),
    datetime(2016, 7, 4, 14, 0),
    datetime(2016, 7, 5, 14, 0),
    datetime(2016, 7, 6, 14, 0),
    datetime(2016, 7, 7, 14, 0),
    datetime(2016, 7, 8, 14, 0),
    datetime(2016, 7, 9, 14, 0),
    datetime(2016, 7, 10, 14, 0),
    datetime(2016, 7, 11, 14, 0)
]

dateSet = set(dateList)

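def f1():
    # List approach: test every date against every [start, stop] pair.
    # Note: remove() would raise ValueError if a date fell inside more
    # than one interval; the sample data has no overlapping intervals.
    result = copy(dateList)
    for a in dateList:
        for b in dateRange:
            for i in dateRange[b]:
                if i[0] <= a <= i[1]:
                    result.remove(a)
    return result

def f2():
    # Set approach: subtract each precomputed set of minutes in one step.
    result = copy(dateSet)
    for b in dateRange2:
        result = result.difference(dateRange2[b])
    return result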

print(f1())
print(timeit("f1()", "from __main__ import f1", number=100000))

print(f2())
print(timeit("f2()", "from __main__ import f2", number=100000))

For the record, the results are as follows:

[datetime.datetime(2016, 7, 2, 14, 0), datetime.datetime(2016, 7, 9, 14, 0), datetime.datetime(2016, 7, 10, 14, 0), datetime.datetime(2016, 7, 11, 14, 0)]
1.922587754837455

{datetime.datetime(2016, 7, 2, 14, 0), datetime.datetime(2016, 7, 9, 14, 0), datetime.datetime(2016, 7, 10, 14, 0), datetime.datetime(2016, 7, 11, 14, 0)}
0.30558400587733225
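The granularity trade-off mentioned earlier can be sketched as follows. This variant stores one entry per hour instead of one per minute, so the set is much smaller; it only gives correct answers if every date being checked falls exactly on a whole hour, which is an assumption that happens to hold for the sample data. The function name `make_hourly_set` is hypothetical.

```python
from datetime import datetime, timedelta

def make_hourly_set(first_day, open_offset, close_offset, days):
    """Set of every whole hour in [open, close] on each of `days` consecutive days."""
    result = set()
    for d in range(days):
        now = first_day + timedelta(days=d) + open_offset
        close = first_day + timedelta(days=d) + close_offset
        while now <= close:
            result.add(now)
            now += timedelta(hours=1)
    return result

hourly = make_hourly_set(datetime(2016, 7, 3), timedelta(hours=10),
                         timedelta(hours=18), 6)
candidates = {datetime(2016, 7, day, 14, 0) for day in range(2, 12)}

# Set difference keeps only the candidates outside every opening window.
remaining = sorted(candidates - hourly)
print(len(hourly), remaining)
```

With six days of 10:00 to 18:00 windows, the hourly set holds 54 entries, compared with 2,886 for the minute-level version.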

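You could also convert the dict dateRange to a list, but with only 1 or 2 members this is unlikely to make any real difference in performance. However, it makes more logical sense, because you are not actually using the dict to look up any specific key; you are just iterating through all the values.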
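A minimal sketch of that list-based layout, assuming the same July 2016 sample data; the names `date_ranges`, `date_list` and `outside_all_ranges` are illustrative and not taken from the code above:

```python
from datetime import datetime

# Flat list of (start, stop) pairs instead of a dict keyed by condition name.
date_ranges = [
    (datetime(2016, 7, day, 10, 0), datetime(2016, 7, day, 18, 0))
    for day in range(3, 9)
]
date_list = [datetime(2016, 7, day, 14, 0) for day in range(2, 12)]

def outside_all_ranges(moment, ranges):
    """True if `moment` does not fall inside any (start, stop) pair."""
    return not any(start <= moment <= stop for start, stop in ranges)

# Build a new list rather than removing items while iterating.
kept = [d for d in date_list if outside_all_ranges(d, date_ranges)]
print(kept)
```

Because `any()` stops at the first matching interval, a date that overlaps several ranges is still handled only once.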


小智 1

Frankly, I'm not sure I understand what your problem is, but I tried something like this:

for date in dateList:
    for everyrange in dateRange:
        find=False
        for i in dateRange[everyrange]:
            #print('date={date} ,key={everyrange},i={i}'.format(date=date, everyrange=everyrange,i=i))
            if i[0] <= date <= i[1]:
                print(date)
                find=True
                break
            else:
                print(0)
        if find:
            break