Raj*_*air 4 python time-series matplotlib pandas scikit-learn
I have a question about the slope value (in degrees) calculated below:
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
import datetime as dt
import numpy as np
df = yf.download('aapl', '2015-01-01', '2021-01-01')
df.rename(columns = {'Adj Close' : 'Adj_close'}, inplace= True)
x1 = pd.Timestamp('2019-01-02')
x2 = df.index[-1]
y1 = df[df.index == x1].Adj_close[0]
y2 = df[df.index == x2].Adj_close[0]
slope = (y2 - y1)/ (x2 - x1).days
angle = round(np.rad2deg(np.arctan2(y2 - y1, (x2 - x1).days)), 1)
fig, ax1 = plt.subplots(figsize= (15, 6))
ax1.grid(True, linestyle= ':')
ax1.set_zorder(1)
ax1.set_frame_on(False)
ax1.plot(df.index, df.Adj_close, c= 'k', lw= 0.8)
ax1.plot([x1, x2], [y1, y2], c= 'k')
ax1.set_xlim(df.index[0], df.index[-1])
plt.show()
It returns a slope angle of 7.3 degrees. Judging from the chart, that does not look right:

It looks close to 45 degrees. What is wrong here?
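The model can be calculated with sklearn, or added to the plot with seaborn.regplot, as shown below. The full data is plotted with pandas.DataFrame.plot. Tested with python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2, sklearn 0.24.2.
import yfinance as yf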
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression
# download the data
df = yf.download('aapl', '2015-01-01', '2021-01-01')
# convert the datetime index to ordinal values, which can be used to plot a regression line
df.index = df.index.map(pd.Timestamp.toordinal)
# display(df.iloc[:5, [4]])
#         Adj Close
# Date
# 735600  24.782110
# 735603  24.083958
# 735604  24.086227
# 735605  24.423975
# 735606  25.362394
# convert the regression line start date to ordinal
x1 = pd.to_datetime('2019-01-02').toordinal()
# data slice for the regression line
data=df.loc[x1:].reset_index()
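The ordinal conversion used above can be checked in isolation. The following is a minimal sketch on three hand-picked dates rather than the downloaded data; the names `dates`, `ordinals`, and `restored` are illustrative only. It shows that `pd.Timestamp.toordinal` turns each date into an integer day count and that `pd.Timestamp.fromordinal` reverses it.

```python
import pandas as pd

# three sample dates (not downloaded data); the first two match the
# first two index values shown in the output above
dates = pd.to_datetime(['2015-01-02', '2015-01-05', '2019-01-02'])

# datetime -> integer day count, the same mapping applied to df.index
ordinals = dates.map(pd.Timestamp.toordinal)

# integer day count -> datetime, as used later for the tick labels
restored = [pd.Timestamp.fromordinal(int(o)) for o in ordinals]

print(list(ordinals))
print(restored)
```

Because the ordinals are plain day counts, a regression slope fitted on them is expressed in price units per calendar day.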
With seaborn.regplot, no calculations are needed to add the regression line to the line plot of the data.
# plot the Adj Close data
ax1 = df.plot(y='Adj Close', c='k', figsize=(15, 6), grid=True, legend=False,
title='Adjusted Close with Regression Line from 2019-01-02')
# add a regression line
sns.regplot(data=data, x='Date', y='Adj Close', ax=ax1, color='magenta', scatter_kws={'s': 7}, label='Linear Model', scatter=False)
ax1.set_xlim(df.index[0], df.index[-1])
# convert the axis back to datetime
xticks = ax1.get_xticks()
labels = [pd.Timestamp.fromordinal(int(label)).date() for label in xticks]
ax1.set_xticks(xticks)
ax1.set_xticklabels(labels)
ax1.legend()
plt.show()
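As an alternative to rewriting the tick labels by hand, the ordinal axis can be formatted on the fly. This is only a sketch under assumptions: it uses synthetic data instead of the AAPL frame, and `ordinal_to_date` is a hypothetical helper name. `matplotlib.ticker.FuncFormatter` calls the helper for every tick, so the labels stay correct if the axis limits change later.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import FuncFormatter


def ordinal_to_date(value, pos=None):
    """Render an ordinal tick position as an ISO date string."""
    return pd.Timestamp.fromordinal(int(value)).strftime('%Y-%m-%d')


# synthetic stand-in for the ordinal-indexed price data
dates = pd.date_range('2019-01-02', periods=10, freq='D')
ordinals = [d.toordinal() for d in dates]
values = list(range(10))

fig, ax = plt.subplots()
ax.plot(ordinals, values)

# every tick value is passed through ordinal_to_date when the axis is drawn
ax.xaxis.set_major_formatter(FuncFormatter(ordinal_to_date))
```

The trade-off is that the formatter truncates fractional tick positions to whole days, which is usually fine for daily data.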
Use sklearn.linear_model.LinearRegression to calculate any desired points from the linear model, then plot the corresponding line with matplotlib.pyplot.plot. The line can be extended with y1 and y2, given x1 and x2.
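As for the angle in the question: the steep appearance is an artifact of the aspect of the axes, which is not equal for x and y. When the aspect is equal, the slope is 7.0 degrees.
# create the model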
model = LinearRegression()
# extract x and y from dataframe data
x = data[['Date']]
y = data[['Adj Close']]
# fit the model
model.fit(x, y)
# print the slope and intercept if desired
print('intercept:', model.intercept_)
print('slope:', model.coef_)
# intercept: [-90078.45713565]
# slope: [[0.1222514]]
# calculate y1, given x1
y1 = model.predict(np.array([[x1]]))
print(y1)
# [[28.27904095]]
# calculate y2, given the last date in data
x2 = data.Date.iloc[-1]
y2 = model.predict(np.array([[x2]]))
print(y2)
# [[117.40030862]]
# this can be added to `ax1` with
ax1 = df.plot(y='Adj Close', c='k', figsize=(15, 6), grid=True, legend=False,
title='Adjusted Close with Regression Line from 2019-01-02')
ax1.plot([x1, x2], [y1[0][0], y2[0][0]], label='Linear Model', c='magenta')
ax1.legend()
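The aspect-ratio explanation can be checked numerically. The sketch below is an illustration under assumptions, not the code from the answer: it uses synthetic endpoints of roughly the same size as the fitted line instead of the AAPL data, and `data_angle` and `display_angle` are hypothetical helper names. It compares the angle in data units (days versus price) with the angle of the same segment as drawn on screen, obtained through `ax.transData`.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend; drop this line in an interactive session
import matplotlib.pyplot as plt
import numpy as np


def data_angle(x1, y1, x2, y2):
    """Angle in degrees of the segment measured in data units."""
    return np.rad2deg(np.arctan2(y2 - y1, x2 - x1))


def display_angle(ax, x1, y1, x2, y2):
    """Angle in degrees of the same segment measured in screen pixels."""
    (px1, py1), (px2, py2) = ax.transData.transform([(x1, y1), (x2, y2)])
    return np.rad2deg(np.arctan2(py2 - py1, px2 - px1))


# synthetic endpoints: about 730 days on x, about 90 price units on y
x1, y1, x2, y2 = 0.0, 28.0, 730.0, 118.0

fig, ax = plt.subplots(figsize=(15, 6))
ax.plot([x1, x2], [y1, y2])
ax.set_xlim(-1460, 730)
ax.set_ylim(0, 140)
fig.canvas.draw()

angle_data = data_angle(x1, y1, x2, y2)
angle_screen = display_angle(ax, x1, y1, x2, y2)

# with an equal aspect, one x unit and one y unit cover the same number of pixels
ax.set_aspect('equal')
fig.canvas.draw()
angle_equal = display_angle(ax, x1, y1, x2, y2)

print(f'data units: {angle_data:.1f}, as drawn: {angle_screen:.1f}, '
      f'equal aspect: {angle_equal:.1f}')
```

The data-unit angle stays near 7 degrees regardless of how the figure is sized, while the drawn angle depends on the figure size and axis limits. Only with an equal aspect do the two agree, which is why the plot in the question looks much steeper than the computed value.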