What is the difference between partial fit and warm start?

Sha*_*Sha 13 python machine-learning python-2.7 scikit-learn

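Context:

I am using the Passive Aggressor from the scikit library and am confused about whether to use warm start or partial fit.

Efforts so far:

  1. Referred to this thread discussion:

https://github.com/scikit-learn/scikit-learn/issues/1585

  2. Went through the scikit code for _fit and _partial_fit.

My observations:

  1. _fit in turn calls _partial_fit.

  2. When warm_start is set, _fit calls _partial_fit with self.coef_.

  3. When _partial_fit is called without the coef_init parameter and self.coef_ is set, it continues to use self.coef_.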
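The call structure described in these observations can be sketched with a toy estimator. This is a simplified illustration, not scikit-learn's actual code: the class name ToySGD, its parameters, and the plain per-sample gradient update are all assumptions made for the example.

```python
import numpy as np


class ToySGD:
    """Toy linear regressor (hypothetical); it only mirrors the call structure above."""

    def __init__(self, eta=0.01, max_iter=100, warm_start=False):
        self.eta = eta
        self.max_iter = max_iter
        self.warm_start = warm_start
        self.coef_ = None
        self.intercept_ = None

    def _partial_fit(self, X, y, n_epochs, coef_init=None, intercept_init=None):
        if coef_init is not None:
            # Explicit starting point handed over by fit() under warm_start.
            self.coef_ = np.array(coef_init, dtype=float)
            self.intercept_ = float(intercept_init)
        elif self.coef_ is None:
            # Nothing learned yet: start from zeros.
            self.coef_ = np.zeros(X.shape[1])
            self.intercept_ = 0.0
        # Otherwise keep using the existing self.coef_ (observation 3).
        for _ in range(n_epochs):
            for xi, yi in zip(X, y):
                err = xi @ self.coef_ + self.intercept_ - yi
                self.coef_ -= self.eta * err * xi
                self.intercept_ -= self.eta * err
        return self

    def partial_fit(self, X, y):
        # Always a single pass that continues from the current state.
        return self._partial_fit(X, y, n_epochs=1)

    def fit(self, X, y):
        if self.warm_start and self.coef_ is not None:
            # Observation 2: reuse the previous solution as the starting point.
            coef_init, intercept_init = self.coef_, self.intercept_
        else:
            # Cold start: forget everything learned so far.
            coef_init, intercept_init = None, None
            self.coef_, self.intercept_ = None, None
        # Observation 1: fit delegates to _partial_fit, but for many epochs.
        return self._partial_fit(X, y, self.max_iter, coef_init, intercept_init)


X = np.linspace(-1, 1, num=50).reshape(-1, 1)
y = (X * 1.5 + 2).ravel()

model = ToySGD(max_iter=200).fit(X, y)
print(model.coef_, model.intercept_)   # close to the true slope and intercept

model.partial_fit(X, 0 * y)            # one pass towards the new targets
print(model.coef_, model.intercept_)   # only partly moved towards zero
```

In this sketch the only differences between the two entry points are the number of passes over the data and whether the old coefficients are discarded first.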

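Question:

I feel that both ultimately provide the same functionality. So what is the basic difference between them? In which contexts should each of them be used?

Am I missing something obvious? Any help is appreciated!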

小智 9

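I don't know about the Passive Aggressor, but at least when using SGDRegressor, partial_fit runs for only 1 epoch, whereas fit runs for multiple epochs (until the loss converges or max_iter is reached). Therefore, when fitting new data to your model, partial_fit will only correct the model one step towards the new data, but with fit and warm_start it will act as if you had combined your old data and your new data and fitted the model once until convergence.

Example: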

from sklearn.linear_model import SGDRegressor
import numpy as np

np.random.seed(0)
X = np.linspace(-1, 1, num=50).reshape(-1, 1)
Y = (X * 1.5 + 2).reshape(50,)

modelFit = SGDRegressor(learning_rate="adaptive", eta0=0.01, random_state=0, verbose=1,
                     shuffle=True, max_iter=2000, tol=1e-3, warm_start=True)
modelPartialFit = SGDRegressor(learning_rate="adaptive", eta0=0.01, random_state=0, verbose=1,
                     shuffle=True, max_iter=2000, tol=1e-3, warm_start=False)
# first fit some data
modelFit.fit(X, Y)
modelPartialFit.fit(X, Y)
# for both: Convergence after 50 epochs, Norm: 1.46, NNZs: 1, Bias: 2.000027, T: 2500, Avg. loss: 0.000237
print(modelFit.coef_, modelPartialFit.coef_) # for both: [1.46303288]

# now fit new data (zeros)
newX = X
newY = 0 * Y

# fits only for 1 epoch, Norm: 1.23, NNZs: 1, Bias: 1.208630, T: 50, Avg. loss: 1.595492:
modelPartialFit.partial_fit(newX, newY)

# Convergence after 49 epochs, Norm: 0.04, NNZs: 1, Bias: 0.000077, T: 2450, Avg. loss: 0.000313:
modelFit.fit(newX, newY)

print(modelFit.coef_, modelPartialFit.coef_) # [0.04245779] vs. [1.22919864]
newX = np.reshape([2], (-1, 1))
print(modelFit.predict(newX), modelPartialFit.predict(newX)) # [0.08499296] vs. [3.66702685]


Kam*_*Kam 7

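If warm_start = False, each subsequent call to .fit() (after an initial call to .fit() or partial_fit()) will reset the model's trainable parameters to their initial values. If warm_start = True, each subsequent call to .fit() (after an initial call to .fit() or partial_fit()) will retain the values of the model's trainable parameters from the previous run and use those as the starting point. Regardless of the value of warm_start, each call to partial_fit() will retain the previous run's model parameters and use those as the starting point.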
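A minimal sketch of the same rule with SGDRegressor, assuming max_iter=1 and tol=None so that each call to fit() runs exactly one epoch, and a fixed random_state so that a cold restart is reproducible. The variable names are illustrative only.

```python
import warnings

import numpy as np
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import SGDRegressor

# One epoch never converges, so silence the expected warning.
warnings.simplefilter("ignore", ConvergenceWarning)

X = np.linspace(-1, 1, num=50).reshape(-1, 1)
y = (X * 1.5 + 2).ravel()

params = dict(max_iter=1, tol=None, random_state=0)
cold = SGDRegressor(warm_start=False, **params)
warm = SGDRegressor(warm_start=True, **params)

# Cold model: the second fit() starts again from the initial parameters.
cold_first = cold.fit(X, y).coef_.copy()
cold_second = cold.fit(X, y).coef_.copy()

# Warm model: the second fit() continues from the first result.
warm_first = warm.fit(X, y).coef_.copy()
warm_second = warm.fit(X, y).coef_.copy()

# partial_fit() continues from the current parameters even on the cold model.
cold_third = cold.partial_fit(X, y).coef_.copy()

print(cold_first, cold_second)   # same values: parameters were reset
print(warm_first, warm_second)   # different values: training continued
print(cold_second, cold_third)   # different values: no reset in partial_fit
```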

Example using MLPRegressor:

import sklearn.neural_network
import numpy as np
np.random.seed(0)
x = np.linspace(-1, 1, num=50).reshape(-1, 1)
y = (x * 1.5 + 2).reshape(50,)
cold_model = sklearn.neural_network.MLPRegressor(hidden_layer_sizes=(), warm_start=False, max_iter=1)
warm_model = sklearn.neural_network.MLPRegressor(hidden_layer_sizes=(), warm_start=True, max_iter=1)

cold_model.fit(x,y)
print(cold_model.coefs_, cold_model.intercepts_)
#[array([[0.17009494]])] [array([0.74643783])]
cold_model.fit(x,y)
print(cold_model.coefs_, cold_model.intercepts_)
#[array([[-0.60819342]])] [array([-1.21256186])]
#after second run of .fit(), values are completely different
#because they were re-initialised before doing the second run for the cold model

warm_model.fit(x,y)
print(warm_model.coefs_, warm_model.intercepts_)
#[array([[-1.39815616]])] [array([1.651504])]
warm_model.fit(x,y)
print(warm_model.coefs_, warm_model.intercepts_)
#[array([[-1.39715616]])] [array([1.652504])]
#this time with the warm model, params change relatively little, as params were
#not re-initialised during second call to .fit()

cold_model.partial_fit(x,y)
print(cold_model.coefs_, cold_model.intercepts_)
#[array([[-0.60719343]])] [array([-1.21156187])]
cold_model.partial_fit(x,y)
print(cold_model.coefs_, cold_model.intercepts_)
#[array([[-0.60619347]])] [array([-1.21056189])]
#with partial_fit(), params barely change even for cold model,
#as no re-initialisation occurs

warm_model.partial_fit(x,y)
print(warm_model.coefs_, warm_model.intercepts_)
#[array([[-1.39615617]])] [array([1.65350392])]
warm_model.partial_fit(x,y)
print(warm_model.coefs_, warm_model.intercepts_)
#[array([[-1.39515619]])] [array([1.65450372])]
#and of course the same goes for the warm model


Ven*_*lam 5

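First, let us look at the difference between .fit() and .partial_fit().

.fit() lets you train from scratch. Hence, you can think of it as an option that can be used only once for a model. If you call .fit() again with a new set of data, the model will be built on the new data and will not be influenced by the previous dataset.

.partial_fit() lets you update the model with incremental data. Hence, this option can be used more than once for a model. This can be useful when the whole dataset cannot be loaded into memory; see here.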
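A minimal sketch of that incremental pattern, assuming the data arrives in small batches. The generator batch_stream is a hypothetical stand-in for reading chunks from disk, and the synthetic line y = 1.5x + 2 and the constant learning rate are choices made only for the example.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor


def batch_stream(n_batches, batch_size, seed=0):
    """Hypothetical data source that yields one small batch at a time."""
    rng = np.random.default_rng(seed)
    for _ in range(n_batches):
        X = rng.uniform(-1, 1, size=(batch_size, 1))
        y = 1.5 * X.ravel() + 2
        yield X, y


model = SGDRegressor(learning_rate="constant", eta0=0.01, random_state=0)

# Only one batch is held in memory at any time; each call refines the same model.
for X_batch, y_batch in batch_stream(n_batches=200, batch_size=50):
    model.partial_fit(X_batch, y_batch)

print(model.coef_, model.intercept_)  # should approach slope 1.5 and intercept 2
```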

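If .fit() and .partial_fit() are each going to be used only once, then it makes no difference.

warm_start can only be used with .fit(); it lets you start the learning from the coefficients of the previous fit(). This may sound similar to the purpose of partial_fit(), but the recommended way is partial_fit(). You can call partial_fit() on the same incremental data a few times to improve the learning.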
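A minimal sketch of repeating partial_fit() on the same data, assuming SGDRegressor with a constant learning rate and the same synthetic line used elsewhere in this thread. Each call is one more pass over the data, so the estimate should move closer to the true slope and intercept.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

X = np.linspace(-1, 1, num=50).reshape(-1, 1)
y = (X * 1.5 + 2).ravel()

model = SGDRegressor(learning_rate="constant", eta0=0.01, random_state=0)

# A single pass leaves the model far from the true parameters.
model.partial_fit(X, y)
error_after_one = abs(model.coef_[0] - 1.5) + abs(model.intercept_[0] - 2.0)

# Further passes over the same data keep refining the same model.
for _ in range(199):
    model.partial_fit(X, y)
error_after_many = abs(model.coef_[0] - 1.5) + abs(model.intercept_[0] - 2.0)

print(error_after_one, error_after_many)
```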