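Say I have a sparse tensor with duplicate indices, and wherever they are duplicated I want to merge the values (sum them up). What is the best way to do this?
Example: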
indices = [[1, 1], [1, 2], [1, 2], [1, 3]]
values = [1, 2, 3, 4]
sparse = tf.SparseTensor(indices, values, dense_shape=[10, 10])
result = tf.MAGIC(sparse)  # placeholder for the operation I am looking for
The result should be a sparse tensor (or a concrete one!) with the following values:
indices = [[1, 1], [1, 2], [1, 3]]
values = [1, 5, 4]
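
To pin down the expected behaviour independently of TensorFlow, here is a minimal plain-Python sketch of the merge. The helper name `merge_duplicates` is hypothetical and only serves as a reference to compare results against; it is not a TensorFlow API.

```python
def merge_duplicates(indices, values):
    """Sum the values that share an index, keeping first-seen order."""
    merged = {}
    for index, value in zip(indices, values):
        key = tuple(index)  # lists are not hashable, tuples are
        merged[key] = merged.get(key, 0) + value
    return [list(key) for key in merged], list(merged.values())


indices = [[1, 1], [1, 2], [1, 2], [1, 3]]
values = [1, 2, 3, 4]
merged_indices, merged_values = merge_duplicates(indices, values)
print(merged_indices)  # [[1, 1], [1, 2], [1, 3]]
print(merged_values)   # [1, 5, 4]
```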
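The only thing I have come up with is to concatenate the indices to create an index hash, add it as a third dimension, and then reduce-sum over that third dimension.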
indices = [[1, 1, 11], [1, 2, 12], [1, 2, 12], [1, 3, 13]]
sparse_result = tf.sparse_reduce_sum(sparseTensor, reduction_axes=2, keep_dims=True)
But this feels very ugly.
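Here is a solution using tf.segment_sum. The idea is to linearize the indices into a 1-D space, get the unique indices with tf.unique, run tf.segment_sum, and then convert the indices back to N-D space.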
import tensorflow as tf  # TensorFlow 1.x API (sessions and .eval())

indices = tf.constant([[1, 1], [1, 2], [1, 2], [1, 3]])
values = tf.constant([1, 2, 3, 4])
# Linearize the indices. If the dimensions of original array are
# [N_{k}, N_{k-1}, ... N_0], then simply matrix multiply the indices
# by [..., N_1 * N_0, N_0, 1]^T. For example, if the sparse tensor
# has dimensions [10, 6, 4, 5], then multiply by [120, 20, 5, 1]^T
# In your case, the dimensions are [10, 10], so multiply by [10, 1]^T
linearized = tf.matmul(indices, [[10], [1]])
# Get the unique indices, and their positions in the array
y, idx = tf.unique(tf.squeeze(linearized, axis=1))
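# Use the positions of the unique indices as segment ids to sum the
# values that share an index. tf.segment_sum expects sorted segment
# ids, which holds here because the input indices are already sorted.
values = tf.segment_sum(values, idx)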
# Go back to N-D indices
y = tf.expand_dims(y, 1)
indices = tf.concat([y//10, y%10], axis=1)
tf.InteractiveSession()
print(indices.eval())
print(values.eval())
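
The snippet above hard-codes the multipliers [10, 1] for a [10, 10] tensor. As a rough sketch of how the same steps generalize to any dense shape, here is a plain-Python version with no TensorFlow dependency. All helper names (`row_major_strides`, `linearize`, `unravel`, `coalesce`) are hypothetical and chosen for illustration only.

```python
def row_major_strides(dense_shape):
    """Multipliers [..., N_1 * N_0, N_0, 1] for the given dense shape."""
    strides = [1] * len(dense_shape)
    for axis in range(len(dense_shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * dense_shape[axis + 1]
    return strides


def linearize(index, strides):
    """Map an N-D index to a single integer."""
    return sum(i * s for i, s in zip(index, strides))


def unravel(flat, strides):
    """Map a linearized index back to its N-D form."""
    index = []
    for stride in strides:
        index.append(flat // stride)
        flat %= stride
    return index


def coalesce(indices, values, dense_shape):
    """Sum values at duplicate indices: linearize, unique, sum, unravel."""
    strides = row_major_strides(dense_shape)
    flat = [linearize(index, strides) for index in indices]
    # Mimic tf.unique: unique entries in first-occurrence order,
    # plus the position of each entry in that unique list.
    unique, position = [], {}
    for f in flat:
        if f not in position:
            position[f] = len(unique)
            unique.append(f)
    # Accumulate per position; this works even if the input is unsorted.
    sums = [0] * len(unique)
    for f, value in zip(flat, values):
        sums[position[f]] += value
    return [unravel(f, strides) for f in unique], sums


print(row_major_strides([10, 6, 4, 5]))  # [120, 20, 5, 1]
print(coalesce([[1, 1], [1, 2], [1, 2], [1, 3]], [1, 2, 3, 4], [10, 10]))
# ([[1, 1], [1, 2], [1, 3]], [1, 5, 4])
```

Because the sums are accumulated by position rather than by contiguous segment, this sketch does not need the indices to be sorted, unlike the tf.segment_sum call above.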