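I am trying to remove the weekend gaps from this time series plot. The x-axis is the data timestamp. I have tried the code from this site but could not get it to work. See the sample file used.
The data looks like this: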
+-----------------------+---------------------+-------------+-------------+
| asof | INSERTED_TIME | DATA_SOURCE | PRICE |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17 00:00:00 | 2020-06-17 12:00:15 | DB | 170.4261757 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17 00:00:00 | 2020-06-17 12:06:10 | DB | 168.9348656 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17 00:00:00 | 2020-06-17 12:06:29 | DB | 168.8412129 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17 00:00:00 | 2020-06-17 12:07:27 | DB | 169.878796 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17 00:00:00 | 2020-06-17 12:10:28 | DB | 169.3685879 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17 00:00:00 | 2020-06-17 12:12:14 | DB | 169.0787045 |
+-----------------------+---------------------+-------------+-------------+
| 2020-06-17 00:00:00 | 2020-06-17 12:12:33 | DB | 169.7561092 |
+-----------------------+---------------------+-------------+-------------+
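Plot including weekend breaks
With px.line I get the plot below, where a straight line runs from the end of Friday to Monday morning. With px.scatter I don't get that line, but I still get the gap.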
Attempt without weekend breaks
import plotly.express as px
import pandas as pd
sampledf = pd.read_excel('sample.xlsx')
fig_sample = px.line(sampledf, x = 'INSERTED_TIME', y= 'PRICE', color = 'DATA_SOURCE')
fig_sample.show()
Using rangebreaks produces a blank plot.
Any help is appreciated. Thanks.
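There is a 1000-row limit when using rangebreaks. With the default render_mode='auto', Plotly Express switches to WebGL rendering for larger datasets, and rangebreaks do not work with WebGL traces.
When working with more than 1000 rows, add the parameter render_mode='svg'.
In the code below I used the scatter function, and as you can see, the large weekend gaps are gone. I also excluded the hours between 11 PM and 11 AM.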
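import pandas as pd
import plotly.express as px

sampledf = pd.read_excel('sample.xlsx')
# render_mode='svg' keeps SVG rendering even with more than 1000 rows
fig_sample = px.scatter(sampledf, x='INSERTED_TIME', y='PRICE', color='DATA_SOURCE', render_mode='svg')
fig_sample.update_xaxes(
    rangebreaks=[
        {'pattern': 'day of week', 'bounds': [6, 1]},  # hide Saturday up to Monday
        {'pattern': 'hour', 'bounds': [23, 11]},  # hide 11 PM to 11 AM
    ]
)
fig_sample.show()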
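The values in the figure differ from the original dataset, but the approach works for the data in the original post. I found help here.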
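It looks like the x-axis of the blank plot does not even have the right range, since it starts in a different year. Without seeing the exact input data it is hard to explain this behavior, but you can start from a simpler dataset that works and look for the differences (try plotting a filtered version of the data with a few selected points, check the DataFrame's dtypes, and so on).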
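The checks suggested above could look roughly like this. It is only a sketch: the `raw` frame is hypothetical and its values are made up, and it assumes the timestamp column may have been loaded as strings rather than datetimes, which is one possible cause and not a confirmed diagnosis.

```python
import pandas as pd

# Hypothetical frame whose timestamps arrived as strings (made-up values).
raw = pd.DataFrame({
    "INSERTED_TIME": ["2020-06-19 12:00:15",   # Friday
                      "2020-06-20 12:06:10",   # Saturday
                      "2020-06-22 12:06:29"],  # Monday
    "PRICE": [170.0, 169.0, 168.5],
})

# Step 1: inspect the column types; a date axis needs a datetime column.
print(raw.dtypes)

# Step 2: convert the column explicitly so the x-axis is treated as dates.
raw["INSERTED_TIME"] = pd.to_datetime(raw["INSERTED_TIME"])

# Step 3: build a filtered version (Monday=0 ... Friday=4) to plot and compare.
weekdays_only = raw[raw["INSERTED_TIME"].dt.dayofweek < 5]
print(weekdays_only)
```

Plotting `weekdays_only` next to the full frame makes it easier to tell whether the problem lies in the data or in the axis settings.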
With a simpler dataset you will see the expected behavior:
import plotly.express as px
import pandas as pd
from datetime import datetime
# One row per day, 1-29 May 2020 (1 May 2020 was a Friday)
days = range(1, 30)
data = {'col1': [datetime(2020, 5, day) for day in days],
        # value is the day of the month on weekdays and 0 on weekends
        'col2': [day if (day + 3) % 7 not in (5, 6) else 0 for day in days]}
df = pd.DataFrame(data=data)
# Same data with the weekend rows dropped
df_weekdays = df[df['col1'].dt.dayofweek.isin([0, 1, 2, 3, 4])]
f = px.line(df, x='col1', y='col2')
f.update_xaxes(
    rangebreaks=[
        dict(bounds=["sat", "mon"]),  # hide weekends
    ]
)
f.show()
For the DataFrame without weekends, df_weekdays, the picture is similar.