Incremental Bayesian updating with multidimensional parameters

Sin*_*Cos 5 python bayesian scipy pymc pymc3

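I'm trying to use PyMC3 for a Bayesian model, and I want to train it repeatedly on new, unseen data. I think that each time new data arrives I need to replace the priors with the posterior of the previously trained model, similar to what is done here: https://docs.pymc.io/notebooks/updating_priors.html. That notebook uses the following function to build a KDE from the samples, and replaces each parameter's original definition in the model with a call to from_posterior.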

import numpy as np
from scipy import stats
from pymc3.distributions import Interpolated

def from_posterior(param, samples):
    # evaluate a Gaussian KDE of the samples on a 100-point grid over their range
    smin, smax = np.min(samples), np.max(samples)
    width = smax - smin
    x = np.linspace(smin, smax, 100)
    y = stats.gaussian_kde(samples)(x)

    # what was never sampled should have a small probability but not 0,
    # so we'll extend the domain and use linear approximation of density on it
    x = np.concatenate([[x[0] - 3 * width], x, [x[-1] + 3 * width]])
    y = np.concatenate([[0], y, [0]])
    return Interpolated(param, x, y)

This is my original model.

import pymc3 as pm

def create_model(batsmen, bowlers, id1, id2, X):
    testval = [[-5,0,1,2,3.5,5] for i in range(0, 9)]
    l = [i for i in range(9)]
    model = pm.Model()
    with model:
        delta_1 = pm.Uniform("delta_1", lower=0, upper=1)
        delta_2 = pm.Uniform("delta_2", lower=0, upper=1)
        inv_sigma_sqr = pm.Gamma("sigma^-2", alpha=1.0, beta=1.0)
        inv_tau_sqr = pm.Gamma("tau^-2", alpha=1.0, beta=1.0)
        mu_1 = pm.Normal("mu_1", mu=0, sigma=1/pm.math.sqrt(inv_tau_sqr), shape=len(batsmen))
        mu_2 = pm.Normal("mu_2", mu=0, sigma=1/pm.math.sqrt(inv_tau_sqr), shape=len(bowlers))
        delta = pm.math.ge(l, 3) * delta_1 + pm.math.ge(l, 6) * delta_2
        eta = [pm.Deterministic("eta_" + str(i), delta[i] + mu_1[id1[i]] - mu_2[id2[i]]) for i in range(9)]
        cutpoints = pm.Normal("cutpoints", mu=0, sigma=1/pm.math.sqrt(inv_sigma_sqr), transform=pm.distributions.transforms.ordered, shape=(9,6), testval=testval)
        X_ = [pm.OrderedLogistic("X_" + str(i), cutpoints=cutpoints[i], eta=eta[i], observed=X[i]-1) for i in range(9)]
    return model

The problem here is that some of my parameters (for example mu_1) are multidimensional. That is why I get the following error:

ValueError: points have dimension 1, dataset has dimension 1500

It is raised by the line y = stats.gaussian_kde(samples)(x).
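For context on that message: scipy's gaussian_kde reads a 2-D input as (number of dimensions, number of points). Assuming the trace for mu_1 has shape (draws, players) with 1500 draws, passing it unchanged makes 1500 the dimension, while the 1-D grid x has dimension 1. The sketch below illustrates the convention through the d and n attributes. The samples array is a made-up stand-in, not real trace data, and the joint KDE is built from the transposed array only to show the shape rule.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical stand-in for trace["mu_1"]: 1500 draws of a 4-component parameter.
samples = rng.normal(size=(1500, 4))

# One column is a plain 1-D sample: 1 dimension, 1500 points.
kde_1d = stats.gaussian_kde(samples[:, 0])
dims_1d, points_1d = kde_1d.d, kde_1d.n

# The transposed array follows scipy's (n_dims, n_points) layout:
# 4 dimensions, 1500 points.
kde_joint = stats.gaussian_kde(samples.T)
dims_joint, points_joint = kde_joint.d, kde_joint.n

# A 1-D grid can be evaluated by the 1-D KDE ...
x = np.linspace(samples.min(), samples.max(), 100)
density_1d = kde_1d(x)

# ... but not by the joint KDE, which expects 4-dimensional points.
try:
    kde_joint(x)
    mismatch = None
except ValueError as exc:
    mismatch = exc

print(dims_1d, points_1d, dims_joint, points_joint)
print(type(mismatch).__name__, mismatch)
```

So from_posterior as written only suits scalar parameters, where samples is a 1-D array of draws.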

Can someone help me make this work for multidimensional parameters? I don't properly understand what a KDE is or how the code computes it.
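As background, a kernel density estimate (KDE) is a smooth density estimate obtained by centering a small Gaussian bump on every sample and averaging the bumps. from_posterior evaluates that estimate on a grid and hands the grid to Interpolated. One possible workaround for a vector parameter, sketched below, is to apply the same steps to each component separately. The helper name marginal_kde_grids and the fake_trace array are hypothetical, and the approach assumes it is acceptable to treat the components as independent, which discards any posterior correlation between them.

```python
import numpy as np
from scipy import stats

def marginal_kde_grids(samples, n_grid=100):
    """Return one (x, y) density grid per component of a (draws, components) array.

    Hypothetical helper: it repeats the from_posterior steps column by column.
    """
    samples = np.asarray(samples, dtype=float)
    if samples.ndim == 1:
        samples = samples[:, None]  # treat a scalar parameter as one component

    grids = []
    for column in samples.T:  # each column holds all draws of one component
        smin, smax = column.min(), column.max()
        width = smax - smin
        x = np.linspace(smin, smax, n_grid)
        y = stats.gaussian_kde(column)(x)

        # Same tail extension as from_posterior: zero density far outside the samples.
        x = np.concatenate([[x[0] - 3 * width], x, [x[-1] + 3 * width]])
        y = np.concatenate([[0.0], y, [0.0]])
        grids.append((x, y))
    return grids

# Made-up draws standing in for a 3-component parameter (not real trace data).
rng = np.random.default_rng(1)
fake_trace = rng.normal(loc=[-1.0, 0.0, 2.0], scale=0.5, size=(1500, 3))
grids = marginal_kde_grids(fake_trace)

for i, (x, y) in enumerate(grids):
    print(i, x.shape, float(x[np.argmax(y)]))
```

Each (x, y) pair could then, in principle, be given to Interpolated under its own name (for instance "mu_1_0", "mu_1_1", ...) and the resulting scalar variables combined into one vector inside the model. That last step is an untested assumption here, and a model with many players would end up with many separate variables.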

Thanks in advance!