Implementation of WARP loss in Keras

Question

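I am trying to implement the WARP loss (a type of pairwise ranking function) using the Keras API. I am somewhat stuck on how to get this to work.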

The definition of WARP loss is taken from the lightFM documentation:

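For a given (user, positive item) pair, sample a negative item at random from all the remaining items. Compute predictions for both items; if the negative item's prediction exceeds that of the positive item plus a margin, perform a gradient update to rank the positive item higher and the negative item lower. If there is no rank violation, continue sampling negative items until a violation is found.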
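The sampling step in that definition can be sketched in plain Python (not Keras). This is only an illustration under stated assumptions: the function name `sample_violating_negative` is hypothetical, the scores are assumed to be precomputed, and the violation test uses the hinge form `neg_score > pos_score - margin`, which is one common reading of how the margin is applied.

```python
import random

def sample_violating_negative(pos_score, neg_scores, margin=1.0, rng=None):
    """Try negatives in uniform random order until one violates the margin.

    Returns (index, draws) for the first violating negative, or
    (None, draws) if every negative was tried without a violation.
    """
    rng = rng or random.Random()
    order = list(range(len(neg_scores)))
    rng.shuffle(order)  # uniform random sampling order, without replacement
    for draws, idx in enumerate(order, start=1):
        # Assumed hinge-style test: negative scores within `margin` of the positive.
        if neg_scores[idx] > pos_score - margin:
            return idx, draws
    return None, len(order)
```

The number of draws needed before a violation appears is what WARP uses to estimate the rank of the positive item: few draws suggest the positive item is ranked poorly.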

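The WARP function is used, for example, in semantic embeddings of #hashtags, a paper published by Facebook AI Research. In that paper they try to predict the most representative hashtags for short texts. There, the 'user' is the short text, the 'positive item' is the hashtag of the short text, and the negative items are random hashtags sampled uniformly from the 'hashtag lookup'.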

I am following the implementation of another triplet loss to create the WARP one: github

My understanding is that for each data point I will have 3 inputs. An example with embeddings ("semi" pseudocode):

sequence_input = Input(shape=(100, ), dtype='int32') # 100 features per data point
positive_example = Input(shape=(1, ), dtype='int32', name="positive") # the one positive example
negative_examples = Input(shape=(1000,), dtype='int32', name="random_negative_examples") # 1000 random negative examples.

#map data points to already created embeddings
embedded_seq_input = embedded_layer(sequence_input)
embedded_positive = embedded_layer(positive_example)
embedded_negatives = embedded_layer(negative_examples)

conv1 = Convolution1D(...)(embedded_seq_input)
               .
               .
               .
z = Dense(vector_size_of_embedding,activation="linear")(convN)

loss = merge([z, embedded_positive, embedded_negatives],mode=warp_loss)
                         .
                         .
                         .


Where warp_loss is (I am assuming 1000 random negatives instead of taking all of the negatives, and the scores come from the cosine similarity):

def warp_loss(X):
    # pseudocode
    z, positive, negatives = X
    positive_score = cosine_similarity(z, positive)
    counts = 1
    loss = 0
    for negative in negatives:
        score = cosine_similarity(z, negative)
        if score > positive_score:
            loss = ((number_of_labels - 1) / counts) * (score + 1 - positive_score)
        else:
            counts += 1
    return loss


A good description of how to compute WARP: post

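I am not sure whether this is the right approach, but I could not find a way to implement the warp_loss pseudo-function. I can compute the cosine using merge([x,u],mode='cos'), but this assumes the same dimensions. So I am not sure how to use the merge mode cos for multiple negative examples, which is why I am trying to create my own warp_loss mode.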
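To make the shape problem concrete, here is a minimal plain-Python sketch (not Keras code) of scoring one vector against many negatives. The helper names `cosine_similarity` and `scores_against_negatives` are hypothetical, and returning 0.0 for a zero vector is a convention chosen only for this example.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)

def scores_against_negatives(z, negatives):
    """Score a single vector z against each row of negative embeddings."""
    return [cosine_similarity(z, neg) for neg in negatives]
```

In a tensor library the same idea would usually be expressed as normalizing both sides and taking a batched dot product, so that z of shape (dim,) is compared against negatives of shape (n, dim) in one operation.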

Any insights, implementations of similar examples, or comments would be useful.

Answer 1

egg*_*ie5 0

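First of all, I would argue that it is not possible to implement WARP in a batch-training paradigm, and therefore you cannot implement WARP in Keras. WARP is intrinsically sequential, so it cannot handle data broken into batches the way Keras does. I suppose that if you used fully stochastic batches, you could pull it off.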

Typically for WARP you include a margin of 1, but as in the paper you can treat it as a hyperparameter:

if neg_score > pos_score-1: #margin of 1
  loss = log(num_items / counts) #loss weighted by sample count
  loss = max(1, loss) #this looks like same thing you were doing in diff way

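A runnable version of that weighting might look like the sketch below. The function names are hypothetical, and multiplying the weight by the hinge term `margin - pos_score + neg_score` is an assumption added here for illustration; the snippet above only shows the weight itself.

```python
import math

def warp_weight(num_items, draws):
    """Rank-based weight from the snippet above: log(num_items / draws), floored at 1."""
    return max(1.0, math.log(num_items / draws))

def warp_pair_loss(pos_score, neg_score, num_items, draws, margin=1.0):
    """Weighted hinge loss for one sampled negative; zero when there is no violation."""
    violation = margin - pos_score + neg_score
    if violation <= 0.0:
        return 0.0
    # Assumption: the rank-based weight scales the hinge term.
    return warp_weight(num_items, draws) * violation
```

Here `draws` is the number of negatives sampled before a violation was found, so a violation found on the first draw gets the largest weight.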

This is superior to its predecessor BPR in that it optimizes for top-k precision instead of average precision.
