Hourly datetime histogram using Pandas

Question

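Suppose I have a pandas.DataFrame with a timestamp column of type datetime. The timestamps have, for example, a resolution of seconds. I would like to bin them into 10-minute [1] buckets. I know I can convert the datetime values to integer timestamps and then use a histogram. Is there a simpler way? Something built into pandas?

[1] 10 minutes is only an example. Ultimately, I would like to use different resolutions.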
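
For reference, the integer-timestamp workaround mentioned in the question might look roughly like the sketch below. The helper name `histogram_via_integers` and the sample timestamps are made up for illustration; this is only one way to do the conversion, not necessarily what the asker had in mind.

```python
import numpy as np
import pandas as pd

def histogram_via_integers(timestamps, bin_seconds):
    """Count timestamps per bin by histogramming integer epoch seconds."""
    # Whole seconds since the Unix epoch, as plain integers
    secs = ((timestamps - pd.Timestamp('1970-01-01')) // pd.Timedelta('1s')).to_numpy()
    # Bin edges aligned to multiples of bin_seconds, covering all the data
    start = secs.min() // bin_seconds * bin_seconds
    stop = (secs.max() // bin_seconds + 1) * bin_seconds
    edges = np.arange(start, stop + 1, bin_seconds)
    counts, _ = np.histogram(secs, bins=edges)
    # Label each bin with the timestamp of its left edge
    return pd.Series(counts, index=pd.to_datetime(edges[:-1], unit='s'))

# Hypothetical sample data, chosen only for illustration
stamps = pd.Series(pd.to_datetime([
    '2015-01-14 00:00:05',
    '2015-01-14 00:03:00',
    '2015-01-14 00:12:00',
    '2015-01-14 00:29:59',
]))
result = histogram_via_integers(stamps, 600)  # 600 seconds = 10 minutes
```

It works, but the bin edges and labels have to be managed by hand, which is what the built-in approaches in the answer avoid.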

Answer 1

Rom*_*ain 20

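To use a custom frequency such as "10Min", you have to use a TimeGrouper, as suggested by @johnchase, which operates on the index. In current pandas, pd.TimeGrouper has been removed; pd.Grouper(freq=...) is its replacement.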

import numpy as np
import pandas as pd

# Generate 10000 one-second timestamps and randomly draw 500 of them
df = pd.DataFrame(np.random.choice(pd.date_range(start='2015-01-14', periods=10000, freq='s'), 500), columns=['date'])
# Use the date as the index, since the grouper bins on the index; keep the column (drop=False) so there is something to count
df.set_index('date', drop=False, inplace=True)
# Plot the histogram; pd.Grouper(freq=...) replaces pd.TimeGrouper, which was removed from pandas
df.groupby(pd.Grouper(freq='10min')).count().plot(kind='bar')
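
Since the question asks for different resolutions, the bin width can be turned into a parameter. The following is a minimal sketch, assuming a datetime column named `date`; the helper `count_per_bin` and the sample values are hypothetical. It passes `key=` to `pd.Grouper`, so the column does not need to be set as the index first.

```python
import pandas as pd

def count_per_bin(df, column, freq):
    """Count rows of df per time bin of width freq (e.g. '10min', '1h')."""
    # pd.Grouper with key= bins a datetime column directly, so the
    # column does not have to be moved into the index first.
    return df.groupby(pd.Grouper(key=column, freq=freq)).size()

# Small deterministic sample (values chosen for illustration only)
sample = pd.DataFrame({'date': pd.to_datetime([
    '2015-01-14 00:00:05',
    '2015-01-14 00:03:00',
    '2015-01-14 00:12:00',
    '2015-01-14 00:29:59',
])})

per_10min = count_per_bin(sample, 'date', '10min')
per_hour = count_per_bin(sample, 'date', '1h')
```

Appending `.plot(kind='bar')` to either result should give the same kind of bar chart as above.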


Using `to_period`

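It is also possible to use the to_period method, but as far as I know it does not work with custom periods such as "10Min". This example uses an additional column to simulate the category of each item.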

import numpy as np
import pandas as pd

# Number of samples
nb_sample = 500
# Generate one-second timestamps and randomly draw a subset, with a random category for each row
df = pd.DataFrame({'date': np.random.choice(pd.date_range(start='2015-01-14', periods=nb_sample * 30, freq='s'), nb_sample),
                   'type': np.random.choice(['foo', 'bar', 'xxx'], nb_sample)})

# Group by hour and type
df = df.groupby([df['date'].dt.to_period('h'), 'type']).count().unstack()
# Drop the unnecessary column level
df.columns = df.columns.droplevel()
df.plot(kind='bar')
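
If 10-minute bins per category are needed without touching the index, one possible alternative (not part of the original answer) is to round each timestamp down with `Series.dt.floor` and group on the result. A minimal sketch, using made-up sample data:

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration
sample = pd.DataFrame({
    'date': pd.to_datetime([
        '2015-01-14 00:00:05',
        '2015-01-14 00:03:00',
        '2015-01-14 00:12:00',
        '2015-01-14 00:29:59',
    ]),
    'type': ['foo', 'bar', 'foo', 'foo'],
})

# Round every timestamp down to the start of its 10-minute bin
bins = sample['date'].dt.floor('10min')
# Count rows per (bin, type); absent combinations become 0
counts = sample.groupby([bins, 'type']).size().unstack(fill_value=0)
```

Unlike grouping with a frequency-based grouper, this only produces bins that actually contain data, so empty intervals will be missing from the plot.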


