How to plot a regression line on a time series line plot

Raj*_*air 4 python time-series matplotlib pandas scikit-learn

I have a question about the slope value (in degrees) calculated below:

import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
import datetime as dt
import numpy as np

df = yf.download('aapl', '2015-01-01', '2021-01-01')
df.rename(columns = {'Adj Close' : 'Adj_close'}, inplace= True)

x1 = pd.Timestamp('2019-01-02')
x2 = df.index[-1]
y1 = df[df.index == x1].Adj_close[0]
y2 = df[df.index == x2].Adj_close[0]

slope = (y2 - y1)/ (x2 - x1).days
angle = round(np.rad2deg(np.arctan2(y2 - y1, (x2 - x1).days)), 1)

fig, ax1 = plt.subplots(figsize= (15, 6))
ax1.grid(True, linestyle= ':')
ax1.set_zorder(1)
ax1.set_frame_on(False)
ax1.plot(df.index, df.Adj_close, c= 'k', lw= 0.8)
ax1.plot([x1, x2], [y1, y2], c= 'k')


ax1.set_xlim(df.index[0], df.index[-1])
plt.show()

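It returns a slope angle of 7.3 degrees. Judging from the chart, that does not look right: [image]

It looks closer to 45 degrees. What is wrong here?

This is the line whose angle I need to calculate: [image]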

Tre*_*ney 9

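  • The implementation in the OP is not the correct way to determine or plot a linear model. As such, the question about determining the angle of the plotted line is bypassed, and a more rigorous approach to plotting the regression line is shown.
  • A regression line can be added by converting the datetime dates to ordinals. The model can be calculated with sklearn, or added to the plot with seaborn.regplot, as shown below.
  • Plot the full data with pandas.DataFrame.plot.
  • Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2, sklearn 0.24.2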
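As a minimal sketch of the ordinal conversion mentioned in the list above, the following round-trips a small datetime index through `pd.Timestamp.toordinal` and `pd.Timestamp.fromordinal`. The three dates are illustrative stand-ins for the downloaded index, not data from yfinance.

```python
import pandas as pd

# a small datetime index standing in for the price data index (dates are illustrative)
idx = pd.DatetimeIndex(["2015-01-02", "2015-01-05", "2019-01-02"])

# datetime -> integer ordinal (days since 0001-01-01), usable as a numeric regressor
ordinals = idx.map(pd.Timestamp.toordinal)

# integer ordinal -> datetime, e.g. for relabelling the x-axis afterwards
restored = [pd.Timestamp.fromordinal(int(n)) for n in ordinals]

print(list(ordinals))
print(restored)
```

Because ordinals are whole days, any intraday time component would be lost in the round trip; for daily prices that should not matter.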

Imports and Data

import yfinance as yf
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# download the data
df = yf.download('aapl', '2015-01-01', '2021-01-01')

# convert the datetime index to ordinal values, which can be used to plot a regression line
df.index = df.index.map(pd.Timestamp.toordinal)

# display(df.iloc[:5, [4]])
        Adj Close
Date             
735600  24.782110
735603  24.083958
735604  24.086227
735605  24.423975
735606  25.362394

# convert the regression line start date to ordinal
x1 = pd.to_datetime('2019-01-02').toordinal()

# data slice for the regression line
data=df.loc[x1:].reset_index()

Plot a Regression Line with seaborn

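  • With seaborn.regplot, no calculations are needed to add the regression line to the line plot of the data.
  • Convert the x-axis labels back to datetime format.
  • Adjust the xticks and labels if you need the endpoints adjusted.
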
# plot the Adj Close data
ax1 = df.plot(y='Adj Close', c='k', figsize=(15, 6), grid=True, legend=False,
              title='Adjusted Close with Regression Line from 2019-01-02')

# add a regression line
sns.regplot(data=data, x='Date', y='Adj Close', ax=ax1, color='magenta', scatter_kws={'s': 7}, label='Linear Model', scatter=False)

ax1.set_xlim(df.index[0], df.index[-1])

# convert the axis back to datetime
xticks = ax1.get_xticks()
labels = [pd.Timestamp.fromordinal(int(label)).date() for label in xticks]
ax1.set_xticks(xticks)
ax1.set_xticklabels(labels)

ax1.legend()

plt.show()

[image]
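If the tick endpoints need adjusting, one possible approach (a sketch, not part of the original answer) is to build the tick positions from a date range and convert them to ordinals. The yearly frequency and the standalone `ax` below are assumptions for illustration; with the plot above, the same calls would be made on `ax1`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# yearly tick dates spanning the downloaded range (frequency is an assumption)
tick_dates = pd.date_range("2015-01-01", "2021-01-01", freq="YS")

# positions on the ordinal x-axis and the matching date labels
xticks = [d.toordinal() for d in tick_dates]
labels = [d.strftime("%Y-%m-%d") for d in tick_dates]

# stand-in axes; replace with ax1 from the plot above
fig, ax = plt.subplots(figsize=(15, 6))
ax.set_xlim(xticks[0], xticks[-1])
ax.set_xticks(xticks)
ax.set_xticklabels(labels)
```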

Calculate the Linear Model

# create the model
model = LinearRegression()

# extract x and y from dataframe data
x = data[['Date']]
y = data[['Adj Close']]

# fit the model
model.fit(x, y)

# print the slope and intercept if desired
print('intercept:', model.intercept_)
print('slope:', model.coef_)

intercept: [-90078.45713565]
slope: [[0.1222514]]

# calculate y1, given x1
y1 = model.predict(np.array([[x1]]))

print(y1)
[[28.27904095]]

# calculate y2, given the last date in data
x2 = data.Date.iloc[-1]
y2 = model.predict(np.array([[x2]]))

print(y2)
[[117.40030862]]

# this can be added to `ax1` with
ax1 = df.plot(y='Adj Close', c='k', figsize=(15, 6), grid=True, legend=False,
              title='Adjusted Close with Regression Line from 2019-01-02')
ax1.plot([x1, x2], [y1[0][0], y2[0][0]], label='Linear Model', c='magenta')
ax1.legend()

[image]

Angle of the Slope

  • This is an artifact of the aspect of the axes, which is not equal for x and y. When the aspect is equal, the slope is 7.0 degrees.
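The aspect effect on the slope angle can be checked with a short sketch. It reuses the fitted endpoints printed by the linear model code (x as ordinals for 2019-01-02 and 2020-12-31) and compares the angle in data units with the angle the line has on screen. The y-limits are an assumption chosen to roughly resemble the plots, and measuring the on-screen angle through `ax.transData` is an illustrative technique rather than something the original answer shows.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# endpoints of the fitted line, copied from the model output
x1, y1 = 737061, 28.27904095    # 2019-01-02 as an ordinal
x2, y2 = 737790, 117.40030862   # 2020-12-31 as an ordinal


def screen_angle(ax, p1, p2):
    """Angle in degrees of the segment p1 -> p2 measured in display (pixel) space."""
    ax.figure.canvas.draw()  # make sure the axes layout and aspect are applied
    (px1, py1), (px2, py2) = ax.transData.transform([p1, p2])
    return np.rad2deg(np.arctan2(py2 - py1, px2 - px1))


# angle in data units: one x unit is one day, one y unit is one price unit
data_angle = np.rad2deg(np.arctan2(y2 - y1, x2 - x1))

fig, ax = plt.subplots(figsize=(15, 6))
ax.plot([x1, x2], [y1, y2], c="magenta")
ax.set_xlim(735600, x2)  # 2015-01-02 through the last date
ax.set_ylim(0, 140)      # assumed y range, for illustration only

# with the default (auto) aspect, the axes stretch y relative to x
auto_angle = screen_angle(ax, (x1, y1), (x2, y2))

# with an equal aspect, one x unit and one y unit cover the same number of pixels
ax.set_aspect("equal", adjustable="box")
equal_angle = screen_angle(ax, (x1, y1), (x2, y2))

print(round(data_angle, 1), round(auto_angle, 1), round(equal_angle, 1))
```

Under these assumptions the on-screen angle with the default aspect comes out much steeper than the angle in data units, while the equal-aspect angle matches it. That is why a line of about 7 degrees in data units can look close to 45 degrees on the chart.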