I have an autoencoder that takes an image as input and produces a new image as output.
The input image (1x1024x1024x3) is split into patches (1024x32x32x3) before being fed to the network.
Once I have the output, also a batch of patches of size 1024x32x32x3, I want to reconstruct the 1024x1024x3 image. I thought I could do this with a simple reshape, but here is what happens.
First, the image as read by Tensorflow:

I split the image into patches with the following code:
patch_size = [1, 32, 32, 1]
patches = tf.extract_image_patches([image],
                                   patch_size, patch_size, [1, 1, 1, 1], 'VALID')
patches = tf.reshape(patches, [1024, 32, 32, 3])
Here are a few of the patches from this image:

However, when I reshape this patch data back into an image, things go pear-shaped:
reconstructed = tf.reshape(patches, [1, 1024, 1024, 3])
converted = tf.image.convert_image_dtype(reconstructed, tf.uint8)
encoded = tf.image.encode_png(converted)

No processing happens between patching and reconstruction in this example. I have made a version of the code you can use to test this behavior. To use it, run the following:
echo "/path/to/test-image.png" > inputs.txt
mkdir images
python3 image_test.py inputs.txt images
The code creates an input image, a patch image, and an output image for each of the 1024 patches of each input image, so comment out the lines that create the input and output images if you only care about saving all the patches.
Can someone please explain what is happening? :(
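As a minimal sketch of what goes wrong (not from the original post; plain NumPy standing in for the TensorFlow ops, with a 4x4 image instead of 1024x1024): a patch tensor of shape (num_patches, p, p, c) stores pixels grouped by patch, so flattening it straight back to (h, h, c) interleaves rows from different patches. The inverse needs an extra transpose that undoes the patch-grid ordering:

```python
import numpy as np

h, p, c = 4, 2, 1  # small stand-in for 1024, 32, 3
image = np.arange(h * h * c).reshape(h, h, c)

# Non-overlapping patches laid out as (num_patches, p, p, c),
# row-major over the patch grid -- the same layout the question uses.
patches = (image.reshape(h // p, p, h // p, p, c)
                .transpose(0, 2, 1, 3, 4)
                .reshape((h // p) ** 2, p, p, c))

# Naive reshape back: pixels from different patches get interleaved.
naive = patches.reshape(h, h, c)
print(np.array_equal(naive, image))    # False

# Correct inverse: restore the patch grid, undo the transpose, then flatten.
rebuilt = (patches.reshape(h // p, h // p, p, p, c)
                  .transpose(0, 2, 1, 3, 4)
                  .reshape(h, h, c))
print(np.array_equal(rebuilt, image))  # True
```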
Using Update #2 - a small example for your task (TF 1.0):
Consider an image of size (4,4,1), converted to patches of size (4,2,2,1) and reconstructed back into the image.
import tensorflow as tf
image = tf.constant([[[1], [2], [3], [4]],
                     [[5], [6], [7], [8]],
                     [[9], [10], [11], [12]],
                     [[13], [14], [15], [16]]])
patch_size = [1, 2, 2, 1]
patches = tf.extract_image_patches([image],
                                   patch_size, patch_size, [1, 1, 1, 1], 'VALID')
patches = tf.reshape(patches, [4, 2, 2, 1])
reconstructed = tf.reshape(patches, [1, 4, 4, 1])
rec_new = tf.space_to_depth(reconstructed, 2)
rec_new = tf.reshape(rec_new, [4, 4, 1])
sess = tf.Session()
I, P, R_n = sess.run([image, patches, rec_new])
print(I)
print(I.shape)
print(P.shape)
print(R_n)
print(R_n.shape)
Output:
[[[ 1][ 2][ 3][ 4]]
[[ 5][ 6][ 7][ 8]]
[[ 9][10][11][12]]
[[13][14][15][16]]]
(4, 4, 1)
(4, 2, 2, 1)
[[[ 1][ 2][ 3][ 4]]
[[ 5][ 6][ 7][ 8]]
[[ 9][10][11][12]]
[[13][14][15][16]]]
(4, 4, 1)
This only works for p = sqrt(h):
import tensorflow as tf
import numpy as np
c = 3
h = 1024
p = 32
image = tf.random_normal([h, h, c])
patch_size = [1, p, p, 1]
patches = tf.extract_image_patches([image],
                                   patch_size, patch_size, [1, 1, 1, 1], 'VALID')
patches = tf.reshape(patches, [h, p, p, c])
reconstructed = tf.reshape(patches, [1, h, h, c])
rec_new = tf.space_to_depth(reconstructed, p)
rec_new = tf.reshape(rec_new, [h, h, c])
sess = tf.Session()
I, P, R_n = sess.run([image, patches, rec_new])
print(I.shape)
print(P.shape)
print(R_n.shape)
err = np.sum((R_n - I)**2)
print(err)
Output:
(1024, 1024, 3)
(1024, 32, 32, 3)
(1024, 1024, 3)
0.0
Reconstructing from the output of extract_image_patches seems difficult. It seems easier to use other functions to extract the patches and then invert the process to reconstruct:
import tensorflow as tf
import numpy as np
c = 3
h = 1024
p = 128
image = tf.random_normal([1, h, h, c])

# Image to Patches Conversion
pad = [[0, 0], [0, 0]]
patches = tf.space_to_batch_nd(image, [p, p], pad)
patches = tf.split(patches, p * p, 0)
patches = tf.stack(patches, 3)
patches = tf.reshape(patches, [(h // p)**2, p, p, c])
# Do processing on patches here

# Using the patches to reconstruct the image
patches_proc = tf.reshape(patches, [1, h // p, h // p, p * p, c])
patches_proc = tf.split(patches_proc, p * p, 3)
patches_proc = tf.stack(patches_proc, axis=0)
patches_proc = tf.reshape(patches_proc, [p * p, h // p, h // p, c])
reconstructed = tf.batch_to_space_nd(patches_proc, [p, p], pad)

sess = tf.Session()
I, P, R_n = sess.run([image, patches, reconstructed])
print(I.shape)
print(P.shape)
print(R_n.shape)
err = np.sum((R_n - I)**2)
print(err)
Output:
(1, 1024, 1024, 3)
(64, 128, 128, 3)
(1, 1024, 1024, 3)
0.0
You can find other cool tensor transformation functions here: https://www.tensorflow.org/api_guides/python/array_ops
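For intuition about what tf.space_to_depth does in the answers above, here is a NumPy mirror of its documented NHWC semantics (this helper and its names are my own, not part of TensorFlow): each non-overlapping b x b spatial block is moved into the depth dimension, which is exactly why it can regroup patch pixels during reconstruction.

```python
import numpy as np

def space_to_depth_np(x, b):
    # NHWC input; mirrors tf.space_to_depth's documented behavior:
    # each b x b spatial block becomes b*b*c channels at one location.
    n, h, w, c = x.shape
    return (x.reshape(n, h // b, b, w // b, b, c)
             .transpose(0, 1, 3, 2, 4, 5)
             .reshape(n, h // b, w // b, b * b * c))

x = np.arange(16).reshape(1, 4, 4, 1)
y = space_to_depth_np(x, 2)
print(y.shape)     # (1, 2, 2, 4)
print(y[0, 0, 0])  # [0 1 4 5] -- the top-left 2x2 block, flattened
```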
Since I also struggled with this, I am posting a solution that may be useful to others. The trick is to realize that the inverse of tf.extract_image_patches is its gradient, as suggested here. Since the gradient of this op is implemented in Tensorflow, it is easy to build the reconstruction function:
import tensorflow as tf
from keras import backend as K
import numpy as np

def extract_patches(x):
    return tf.extract_image_patches(
        x,
        (1, 3, 3, 1),
        (1, 1, 1, 1),
        (1, 1, 1, 1),
        padding="VALID"
    )

def extract_patches_inverse(x, y):
    _x = tf.zeros_like(x)
    _y = extract_patches(_x)
    grad = tf.gradients(_y, _x)[0]
    # Divide by grad, to "average" together the overlapping patches
    # otherwise they would simply sum up
    return tf.gradients(_y, _x, grad_ys=y)[0] / grad

# Generate 10 fake images, last dimension can be different than 3
images = np.random.random((10, 28, 28, 3)).astype(np.float32)

# Extract patches
patches = extract_patches(images)

# Reconstruct image
# Notice that original images are only passed to infer the right shape
images_reconstructed = extract_patches_inverse(images, patches)

# Compare with original (evaluating tf.Tensor into a numpy array)
# Here using Keras session
images_r = images_reconstructed.eval(session=K.get_session())

print(np.sum(np.square(images - images_r)))
# 2.3820458e-11
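The averaging that the gradient trick performs can be mirrored in plain NumPy, which makes the mechanism explicit: sum every overlapping patch back into an accumulator, count how many patches cover each pixel, and divide. This is an illustrative sketch with made-up helper names, not TensorFlow's implementation:

```python
import numpy as np

def extract_patches_np(x, k):
    # Stride-1, VALID-padding patch extraction from a 2-D array.
    H, W = x.shape
    out = np.empty((H - k + 1, W - k + 1, k, k), dtype=x.dtype)
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            out[i, j] = x[i:i + k, j:j + k]
    return out

def reconstruct_np(patches, H, W):
    # Sum overlapping patches, then divide by per-pixel coverage count --
    # the same "sum then normalize" the gradient-based inverse performs.
    k = patches.shape[2]
    acc = np.zeros((H, W))
    cnt = np.zeros((H, W))
    for i in range(patches.shape[0]):
        for j in range(patches.shape[1]):
            acc[i:i + k, j:j + k] += patches[i, j]
            cnt[i:i + k, j:j + k] += 1
    return acc / cnt

x = np.random.rand(8, 8)
r = reconstruct_np(extract_patches_np(x, 3), 8, 8)
print(np.max(np.abs(r - x)))  # ~0: averaging identical copies is exact
```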