Rounding time to the nearest second - Python

Question

Jet*_*man 9 python datetime python-2.7 pandas

I have a large dataset with more than 500,000 date and time stamps that look like this:

date        time
2017-06-25 00:31:53.993
2017-06-25 00:32:31.224
2017-06-25 00:33:11.223
2017-06-25 00:33:53.876
2017-06-25 00:34:31.219
2017-06-25 00:35:12.634

How can I round these timestamps to the nearest second?

My code looks like this:

readcsv = pd.read_csv(filename)
log_date = readcsv.date
log_time = readcsv.time

readcsv['date'] = pd.to_datetime(readcsv['date']).dt.date
readcsv['time'] = pd.to_datetime(readcsv['time']).dt.time
timestamp = [datetime.datetime.combine(log_date[i],log_time[i]) for i in range(len(log_date))]

So now I have combined the dates and times into a list of datetime.datetime objects that looks like this:

datetime.datetime(2017,6,25,00,31,53,993000)
datetime.datetime(2017,6,25,00,32,31,224000)
datetime.datetime(2017,6,25,00,33,11,223000)
datetime.datetime(2017,6,25,00,33,53,876000)
datetime.datetime(2017,6,25,00,34,31,219000)
datetime.datetime(2017,6,25,00,35,12,634000)

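Where do I go from here? The df.timestamp.dt.round('1s') function does not seem to work. Also, when using .split() I ran into problems with seconds and minutes going above 59.

Many thanks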

Answer 1

mik*_*ent 11

The question does not say how you want to round. Rounding down is often appropriate for time functions. This is not statistics.

rounded_down_datetime = raw_datetime.replace(microsecond=0)
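
To make the difference concrete, here is a minimal sketch contrasting truncation with rounding to the nearest second. The names `raw`, `truncated`, and `nearest` are illustrative and not from the answer, and the half-second offset is just one common way to round, assumed here for comparison.

```python
import datetime

raw = datetime.datetime(2017, 6, 25, 0, 31, 53, 993000)

# Truncation: drop the fractional part, always rounding down.
truncated = raw.replace(microsecond=0)

# Nearest second: shift by half a second first, then truncate.
nearest = (raw + datetime.timedelta(milliseconds=500)).replace(microsecond=0)

print(truncated)  # 2017-06-25 00:31:53
print(nearest)    # 2017-06-25 00:31:54
```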

Answer 2

ele*_*vir 10

Without any extra packages, a datetime object can be rounded to the nearest second with the following simple function:

import datetime

def roundSeconds(dateTimeObject):
    newDateTime = dateTimeObject

    if newDateTime.microsecond >= 500000:
        newDateTime = newDateTime + datetime.timedelta(seconds=1)

    return newDateTime.replace(microsecond=0)

A small improvement here: to follow Python naming conventions, rename "roundSeconds" to "round_seconds", and likewise all other camelCase names. Other than that, great answer! (3 upvotes)
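
As a sketch of what the comment suggests, the same function with snake_case names might look like this. The parameter name `dt` is my own choice. The two calls show that adding a timedelta carries correctly across minute and day boundaries, which is where string-based approaches tend to break.

```python
import datetime

def round_seconds(dt):
    """Round a datetime to the nearest second (exact halves round up)."""
    if dt.microsecond >= 500000:
        dt = dt + datetime.timedelta(seconds=1)
    return dt.replace(microsecond=0)

# timedelta arithmetic handles the carry into the next minute or day.
print(round_seconds(datetime.datetime(2017, 6, 25, 0, 31, 59, 993000)))   # 2017-06-25 00:32:00
print(round_seconds(datetime.datetime(2017, 6, 25, 23, 59, 59, 500000)))  # 2017-06-26 00:00:00
```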

Answer 3

cs9*_*s95 8

If you are using pandas, you can round the data to the nearest second with dt.round:

df

                timestamp
0 2017-06-25 00:31:53.993
1 2017-06-25 00:32:31.224
2 2017-06-25 00:33:11.223
3 2017-06-25 00:33:53.876
4 2017-06-25 00:34:31.219
5 2017-06-25 00:35:12.634

df.timestamp.dt.round('1s')

0   2017-06-25 00:31:54
1   2017-06-25 00:32:31
2   2017-06-25 00:33:11
3   2017-06-25 00:33:54
4   2017-06-25 00:34:31
5   2017-06-25 00:35:13
Name: timestamp, dtype: datetime64[ns]

If timestamp is not a datetime column, convert it first using pd.to_datetime:

df.timestamp = pd.to_datetime(df.timestamp)

After that, dt.round should work.
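
For the layout in the question, where date and time sit in separate columns, one possible end-to-end sketch is shown below. It assumes both columns hold plain strings as read from the CSV. The small DataFrame and the `rounded` column name are made up for illustration.

```python
import pandas as pd

# Assumed shape of the CSV data: two string columns, as in the question.
df = pd.DataFrame({
    'date': ['2017-06-25', '2017-06-25'],
    'time': ['00:31:53.993', '00:32:31.224'],
})

# Join the strings, parse once into a datetime64 column, then round.
df['timestamp'] = pd.to_datetime(df['date'] + ' ' + df['time'])
df['rounded'] = df['timestamp'].dt.round('1s')

print(df['rounded'])
```

Parsing the combined string in one step avoids building a Python list of datetime.datetime objects, which matters with several hundred thousand rows.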

Answer 4

小智 6

If someone wants to round a single datetime item to the nearest second, this will do:

pandas.to_datetime(your_datetime_item).round('1s')
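
Note that the expression above yields a pandas Timestamp rather than a plain datetime.datetime. If a standard-library object is needed afterwards, a sketch like the following should work. The variable names are illustrative.

```python
import datetime
import pandas as pd

item = datetime.datetime(2017, 6, 25, 0, 33, 53, 876000)

rounded = pd.to_datetime(item).round('1s')  # pandas Timestamp
as_datetime = rounded.to_pydatetime()       # back to datetime.datetime

print(as_datetime)  # 2017-06-25 00:33:54
```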

Answer 5

ger*_*rdw 5

An alternative version of @electricvir's solution:

import datetime

def roundSeconds(dateTimeObject):
    newDateTime = dateTimeObject + datetime.timedelta(seconds=.5)
    return newDateTime.replace(microsecond=0)
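
A quick way to check that the add-half-a-second variant agrees with the branching version is to run both on values around the half-second boundary. The function names `round_by_branch` and `round_by_offset` are hypothetical labels for the two answers, used only in this sketch.

```python
import datetime

def round_by_branch(dt):
    # Branching version: bump by one second when the fraction is >= 0.5.
    if dt.microsecond >= 500000:
        dt = dt + datetime.timedelta(seconds=1)
    return dt.replace(microsecond=0)

def round_by_offset(dt):
    # Offset version: add half a second, then drop the fraction.
    return (dt + datetime.timedelta(seconds=0.5)).replace(microsecond=0)

base = datetime.datetime(2017, 6, 25, 0, 31, 53)
samples = [base.replace(microsecond=us) for us in (0, 499999, 500000, 993000)]

for dt in samples:
    assert round_by_branch(dt) == round_by_offset(dt)
    print(dt.time(), '->', round_by_offset(dt).time())
```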

Answer 6

Sri*_*ila 2

Using a for loop and str.split():

dts = ['2017-06-25 00:31:53.993',
       '2017-06-25 00:32:31.224',
       '2017-06-25 00:33:11.223',
       '2017-06-25 00:33:53.876',
       '2017-06-25 00:34:31.219',
       '2017-06-25 00:35:12.634']

for item in dts:
    date = item.split()[0]
    h, m, s = [item.split()[1].split(':')[0],
               item.split()[1].split(':')[1],
               str(round(float(item.split()[1].split(':')[-1])))]

    print(date + ' ' + h + ':' + m + ':' + s)

2017-06-25 00:31:54
2017-06-25 00:32:31
2017-06-25 00:33:11
2017-06-25 00:33:54
2017-06-25 00:34:31
2017-06-25 00:35:13

You can turn this into a function:

def round_seconds(dts):
    result = []
    for item in dts:
        date = item.split()[0]
        h, m, s = [item.split()[1].split(':')[0],
                   item.split()[1].split(':')[1],
                   str(round(float(item.split()[1].split(':')[-1])))]
        result.append(date + ' ' + h + ':' + m + ':' + s)

    return result

Testing the function:

dts = ['2017-06-25 00:31:53.993',
       '2017-06-25 00:32:31.224',
       '2017-06-25 00:33:11.223',
       '2017-06-25 00:33:53.876',
       '2017-06-25 00:34:31.219',
       '2017-06-25 00:35:12.634']

from pprint import pprint

pprint(round_seconds(dts))

['2017-06-25 00:31:54',
 '2017-06-25 00:32:31',
 '2017-06-25 00:33:11',
 '2017-06-25 00:33:54',
 '2017-06-25 00:34:31',
 '2017-06-25 00:35:13']

Since you appear to be using Python 2.7, to strip any trailing zeros you may need to change:

str(round(float(item.split()[1].split(':')[-1])))

to

str(round(float(item.split()[1].split(':')[-1]))).rstrip('0').rstrip('.')

I just tried the function with Python 2.7 on repl.it and it ran as expected.

I believe this fails for the edge case "2017-06-25 00:31:59.993". (2 upvotes)
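
The edge case in the comment arises because rounding the seconds field on its own can produce 60 without carrying into the minutes. One way around it, sketched here under the assumption that every string follows the 'YYYY-MM-DD HH:MM:SS.fff' layout from the question, is to parse into a datetime, round there, and format back. The name `round_seconds_str` and the two format constants are my own.

```python
import datetime

FMT_IN = '%Y-%m-%d %H:%M:%S.%f'   # assumed input layout
FMT_OUT = '%Y-%m-%d %H:%M:%S'

def round_seconds_str(items):
    """Round timestamp strings to the nearest second, carrying overflow."""
    result = []
    for item in items:
        parsed = datetime.datetime.strptime(item, FMT_IN)
        rounded = (parsed + datetime.timedelta(seconds=0.5)).replace(microsecond=0)
        result.append(rounded.strftime(FMT_OUT))
    return result

print(round_seconds_str(['2017-06-25 00:31:59.993']))  # ['2017-06-25 00:32:00']
```

Formatting with strftime also keeps the seconds zero-padded, which plain string concatenation does not.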

Archived:	8 years, 2 months ago
Views:	6,085
Last activity:	6 years, 3 months ago