Ale*_*exP · tags: python, backpropagation, cosine-similarity, scikit-learn, tensorflow
As the title says, I am trying to train a model based on the SimCLR framework (see this paper: https://arxiv.org/pdf/2002.05709.pdf - the NT-Xent loss is stated in equation (1) and Algorithm 1).
I have managed to create a numpy version of the loss function, but this is not suitable for training the model, as numpy arrays cannot store the information required for backpropagation. I am having difficulty converting my numpy code to TensorFlow. Here is my numpy version:
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity


# Define the contrastive loss function, NT-Xent
def NT_Xent(zi, zj, tau=1):
    """ Calculates the contrastive loss of the input data using NT-Xent. The
    equation can be found in the paper: https://arxiv.org/pdf/2002.05709.pdf

    Args:
        zi: One half of the input data, shape = (batch_size, feature_1, feature_2, ..., feature_N)
        zj: Other half of the input data, must have the same shape as zi
        tau: Temperature parameter (a constant), default = 1.

    Returns:
        loss: The complete NT-Xent contrastive loss
    """
    z = np.concatenate((zi, zj), 0)

    loss = 0
    for k in range(zi.shape[0]):
        # Numerator (compare i,j & j,i)
        i = k
        j = k + zi.shape[0]
        sim_ij = np.squeeze(cosine_similarity(z[i].reshape(1, -1), z[j].reshape(1, -1)))
        sim_ji = np.squeeze(cosine_similarity(z[j].reshape(1, -1), z[i].reshape(1, -1)))
        numerator_ij = np.exp(sim_ij / tau)
        numerator_ji = np.exp(sim_ji / tau)

        # Denominator (compare i & j to all samples apart from themselves)
        sim_ik = np.squeeze(cosine_similarity(z[i].reshape(1, -1), z[np.arange(z.shape[0]) != i]))
        sim_jk = np.squeeze(cosine_similarity(z[j].reshape(1, -1), z[np.arange(z.shape[0]) != j]))
        denominator_ik = np.sum(np.exp(sim_ik / tau))
        denominator_jk = np.sum(np.exp(sim_jk / tau))

        # Calculate individual and combined losses
        loss_ij = - np.log(numerator_ij / denominator_ik)
        loss_ji = - np.log(numerator_ji / denominator_jk)
        loss += loss_ij + loss_ji

    # Divide by the total number of samples
    loss /= z.shape[0]
    return loss
I am fairly confident that this function produces the correct results (although it is slow - the other implementations I have seen online are vectorized versions, e.g. this one for PyTorch: https://github.com/Spijkervet/SimCLR/blob/master/modules/nt_xent.py, and my code produces the same results for identical inputs). However, I could not see how their versions are mathematically equivalent to the formula in the paper, hence why I tried to build my own.
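One side note on the equivalence question (this is my own reading of the paper, not something those repositories state): the per-pair loss in equation (1) is

```latex
\ell(i, j) = -\log \frac{\exp\!\big(\mathrm{sim}(z_i, z_j)/\tau\big)}
                        {\sum_{k=1}^{2N} \mathbf{1}_{[k \neq i]} \exp\!\big(\mathrm{sim}(z_i, z_k)/\tau\big)}
```

which is exactly a softmax cross-entropy: row i of the 2N x 2N cosine-similarity matrix (with the diagonal masked out) acts as the logits, and the index of the positive pair j acts as the target class. The vectorized implementations compute one masked softmax cross-entropy over all 2N rows at once, which is why they match the explicit double loop.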
As a first attempt, I have converted the numpy functions to their TF equivalents (tf.concat, tf.reshape, tf.math.exp, tf.range, etc.), but I believe my only/main problem is that sklearn's cosine_similarity function returns a numpy array, and I do not know how to build that function myself in TensorFlow. Any ideas?
I managed to figure it out myself! I had not realised there was a TensorFlow implementation of the cosine similarity function, tf.keras.losses.CosineSimilarity.

Here is my code:
import tensorflow as tf


# Define the contrastive loss function, NT-Xent (TensorFlow version)
def NT_Xent_tf(zi, zj, tau=1):
    """ Calculates the contrastive loss of the input data using NT-Xent. The
    equation can be found in the paper: https://arxiv.org/pdf/2002.05709.pdf
    (This is the TensorFlow implementation of the standard numpy version found
    in the NT_Xent function).

    Args:
        zi: One half of the input data, shape = (batch_size, feature_1, feature_2, ..., feature_N)
        zj: Other half of the input data, must have the same shape as zi
        tau: Temperature parameter (a constant), default = 1.

    Returns:
        loss: The complete NT-Xent contrastive loss
    """
    z = tf.cast(tf.concat((zi, zj), 0), dtype=tf.float32)

    loss = 0
    for k in range(zi.shape[0]):
        # Numerator (compare i,j & j,i)
        i = k
        j = k + zi.shape[0]
        # Instantiate the cosine similarity loss function
        cosine_sim = tf.keras.losses.CosineSimilarity(axis=-1, reduction=tf.keras.losses.Reduction.NONE)
        sim = tf.squeeze(- cosine_sim(tf.reshape(z[i], (1, -1)), tf.reshape(z[j], (1, -1))))
        numerator = tf.math.exp(sim / tau)

        # Denominator (compare i & j to all samples apart from themselves)
        sim_ik = - cosine_sim(tf.reshape(z[i], (1, -1)), z[tf.range(z.shape[0]) != i])
        sim_jk = - cosine_sim(tf.reshape(z[j], (1, -1)), z[tf.range(z.shape[0]) != j])
        denominator_ik = tf.reduce_sum(tf.math.exp(sim_ik / tau))
        denominator_jk = tf.reduce_sum(tf.math.exp(sim_jk / tau))

        # Calculate individual and combined losses
        loss_ij = - tf.math.log(numerator / denominator_ik)
        loss_ji = - tf.math.log(numerator / denominator_jk)
        loss += loss_ij + loss_ji

    # Divide by the total number of samples
    loss /= z.shape[0]
    return loss
As you can see, I have essentially just swapped the numpy functions for their TF equivalents. One main point of note is that I had to use reduction=tf.keras.losses.Reduction.NONE in the cosine_sim function; this keeps the shapes consistent in sim_ik and sim_jk, because otherwise the resulting loss did not match my original numpy implementation.
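A related caveat on the API: tf.keras.losses.CosineSimilarity is defined as a loss, so it returns the negative of the plain cosine similarity, which is why every cosine_sim(...) call above carries a leading minus sign. A minimal numpy sketch of the plain similarity (not using TF itself) makes the sign convention concrete:

```python
import numpy as np

def plain_cosine_similarity(a, b):
    # Plain cosine similarity between two vectors, as returned by
    # sklearn's cosine_similarity in the numpy version above
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # parallel to a, so the plain similarity is 1

# tf.keras.losses.CosineSimilarity(axis=-1)(a, b) would return roughly -1.0
# here, because Keras negates the similarity to turn it into a loss; hence
# the negation in NT_Xent_tf.
print(plain_cosine_similarity(a, b))  # → 1.0
```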
I also noticed that calculating the numerator for i,j and j,i separately was redundant, as the answers are the same, so I have removed one instance of that calculation.
Of course, if anyone has a faster implementation, I am more than happy to hear it!
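In case it helps, here is one way the loop could be vectorized (a numpy sketch following the same maths as the loop version above; NT_Xent_vectorized is my own name, and the same pattern should carry over to TF via tf.linalg.matmul, tf.linalg.set_diag and tf.reduce_logsumexp). Row-normalizing first turns the whole 2N x 2N cosine-similarity matrix into a single matrix product, and masking the diagonal gives all denominators at once:

```python
import numpy as np

def NT_Xent_vectorized(zi, zj, tau=1.0):
    # Stack both halves: rows 0..N-1 are zi, rows N..2N-1 are zj
    z = np.concatenate((zi, zj), 0)
    n = z.shape[0]
    half = n // 2

    # Row-normalize so cosine similarity becomes a plain dot product
    z_norm = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z_norm @ z_norm.T                # (2N, 2N) cosine similarities

    exp_sim = np.exp(sim / tau)
    np.fill_diagonal(exp_sim, 0.0)         # exclude self-similarity (k != i)
    denominators = exp_sim.sum(axis=1)     # one denominator per row

    # Row i's positive pair sits at i + N (or i - N for the second half)
    pos = np.concatenate((np.arange(half, n), np.arange(half)))
    numerators = exp_sim[np.arange(n), pos]

    # Mean of -log(numerator / denominator) over all 2N rows equals the
    # loop's sum of loss_ij + loss_ji divided by 2N
    return -np.mean(np.log(numerators / denominators))
```

For inputs with more than one feature dimension, flatten to (batch_size, features) first; the result agrees with the loop version up to floating-point rounding.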