Jia*_* Li 4 python matplotlib datetimeoffset pandas
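I am looking for a way to build a custom class on top of pandas.tseries.offsets that ticks at 1-second frequency over trading hours. The main requirement is that the offset object is smart enough to know that the next second after '2015-06-18 16:00:00' is '2015-06-19 09:30:00' (or '09:30:01'), and that the time delta computed from those two timestamps is exactly 1 second (a custom 1s offset, analogous to BDay(1) for business-day frequency) rather than the duration of the closed period.
The reason is that when I plot a pd.Series of intraday data spanning several trading days (see the simulated example below), there are many "step lines" (linear interpolation) between each close and the next day's open, representing the duration of the closed hours. Is there a way to get rid of them? I looked through the source of pandas.tseries.offsets and found pd.tseries.offsets.BusinessHour and pd.tseries.offsets.BusinessMixin, which might help, but I don't know how to use them.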
import pandas as pd
import numpy as np
import datetime as dt  # needed for dt.datetime.combine below
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
# set as 'constant' object shared by all code in this script
BDAY_US = CustomBusinessDay(calendar=USFederalHolidayCalendar())
sample_freq = '5min'
dates = pd.date_range(start='2015-01-01', end='2015-01-31', freq=BDAY_US).date
# exclude 09:30:00, as it is included in the first time bucket
times = pd.date_range(start='09:30:00', end='16:00:00', freq=sample_freq).time[1:]
time_stamps = [dt.datetime.combine(date, time) for date in dates for time in times]
s = pd.Series(np.random.randn(len(time_stamps)).cumsum() + 100, index=time_stamps)
s.plot()

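Another way I can think of to partially solve this is to first call reset_index() to get a default consecutive integer index for each row, and then treat the difference between consecutive integer indices as the elapsed time (in seconds). Plot the integer index on the x-axis and then relabel the ticks with the appropriate time labels. Could someone show me how to do this with matplotlib?
Thanks to Jeff for the comment. I just checked the online documentation for BusinessHour() and found it may be useful in my case. A follow-up question: BusinessHour works at hourly frequency; is there a way to make it work at 1s frequency? Also, how can it be combined with a CustomBusinessDay object?
Using BusinessHour()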
import pandas as pd
from pandas.tseries.offsets import BusinessHour, Minute, Second
bhour = BusinessHour(start='09:30', end='16:00')
time = pd.Timestamp('2015-06-18 15:00:00')
print(time)
2015-06-18 15:00:00
# hourly increment works nicely
print(time + bhour * 1)
2015-06-19 09:30:00
# but not at minute or second frequency
print(time + Minute(61))
2015-06-18 16:01:00
print(time + Second(60*60 + 1))
2015-06-18 16:00:01
Many thanks; any help is highly appreciated.
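On the follow-up about combining BusinessHour with a holiday calendar: pandas also ships a CustomBusinessHour offset that accepts both session times and a calendar. The sketch below is only an illustration under assumed inputs (the 09:30 to 16:00 session from the question and the US federal holiday calendar); it still steps in whole business hours, so it does not by itself give a 1-second frequency.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessHour

# Business-hour offset that is also aware of holidays (assumed calendar)
cbh = CustomBusinessHour(start="09:30", end="16:00",
                         calendar=USFederalHolidayCalendar())

# Thursday afternoon before the observed Independence Day holiday (Fri 2015-07-03)
ts = pd.Timestamp("2015-07-02 15:30:00")

# 30 minutes remain on Thursday; the other 30 minutes roll over to the
# next business session, which skips the holiday and the weekend
nxt = ts + cbh
print(nxt)
```

The same offset can be passed as freq to pd.date_range if an hourly grid of trading times is enough.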
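As I mentioned in the comments, you may have two different problems.
The solution I give addresses problem 1, since that seems to be your immediate issue. If you need 2, or both, let us know in the comments.
Most plots in matplotlib can have an index formatter applied to an axis via the ticker API. I will adapt this example to your case: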
import pandas as pd
import numpy as np
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
# set as 'constant' object shared by all codes in this script
BDAY_US = CustomBusinessDay(calendar=USFederalHolidayCalendar())
sample_freq = '5min'
dates = pd.date_range(start='2015-01-01', end='2015-01-31', freq=BDAY_US).date
# exclude 09:30:00, as it is included in the first time bucket
times = pd.date_range(start='09:30:00', end='16:00:00', freq=sample_freq).time[1:]
time_stamps = [dt.datetime.combine(date, time) for date in dates for time in times]
s = pd.Series(np.random.randn(len(time_stamps)).cumsum() + 100, index=time_stamps)
data_length = len(s)
s.index.name = 'date_time_index'
s.name='stock_price'
s_new = s.reset_index()
ax = s_new.plot(y='stock_price') #plot the data against the new linearised index...
def format_date(x, pos=None):
    # map the tick position back to the nearest row, clipped to the valid range
    thisind = np.clip(int(x + 0.5), 0, data_length - 1)
    return s_new.date_time_index[thisind].strftime('%Y-%m-%d %H:%M:%S')
ax.xaxis.set_major_formatter(ticker.FuncFormatter(format_date))
fig = plt.gcf()
fig.autofmt_xdate()
plt.show()
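This gives the output below: the first plot is zoomed out at natural scale, and the second is zoomed in so that you can see the transition between Friday 16:00 and Monday 09:00.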
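For the other requirement in the question (a delta of exactly 1 second between a close and the next open), one possible approach is to enumerate every trading second in a DatetimeIndex and measure elapsed trading time as a difference of positions. This is a minimal sketch, not a true pandas offset class: the helper name trading_seconds_index is hypothetical, and the 09:30 to 16:00 session and the US federal holiday calendar are assumptions carried over from the question.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay, Second


def trading_seconds_index(start, end, open_time="09:30:00", close_time="16:00:00"):
    """Hypothetical helper: every trading second on business days in [start, end]."""
    bday = CustomBusinessDay(calendar=USFederalHolidayCalendar())
    days = pd.date_range(start=start, end=end, freq=bday)
    # one 1-second range per session, open and close both included
    sessions = [
        pd.date_range(f"{d.date()} {open_time}", f"{d.date()} {close_time}",
                      freq=Second(1))
        for d in days
    ]
    return sessions[0].append(sessions[1:])


idx = trading_seconds_index("2015-06-18", "2015-06-19")

# positions of the close and of the next open in the trading-time index
close_pos = idx.get_loc(pd.Timestamp("2015-06-18 16:00:00"))
open_pos = idx.get_loc(pd.Timestamp("2015-06-19 09:30:00"))

# elapsed trading seconds is the difference of positions, not of wall-clock times
elapsed = open_pos - close_pos
print(elapsed)
```

Because the positions are plain integers, they can also serve as the linearised x-axis used in the plot above.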

