Input to reshape is a tensor with 37632 values, but the requested shape has 150528

Question

I have the same problem: Input to reshape is a tensor with 37632 values, but the requested shape has 150528.

import os

import tensorflow as tf
from PIL import Image

# `cwd` (dataset root) and `classes` (class folder names) are defined elsewhere.
writer = tf.python_io.TFRecordWriter("/home/henson/Desktop/vgg/test.tfrecords")  # file to generate

for index, name in enumerate(classes):
    class_path = cwd + name + '/'
    for img_name in os.listdir(class_path):
        img_path = class_path + img_name  # path of each image
        img = Image.open(img_path)
        img = img.resize((224, 224))
        img_raw = img.tobytes()  # convert the image to raw bytes
        example = tf.train.Example(features=tf.train.Features(feature={
            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
            'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
        }))  # the Example wraps the label and the image data
        writer.write(example.SerializeToString())  # serialize to a string and write

writer.close()


def read_and_decode(filename):  # read a .tfrecords file
    filename_queue = tf.train.string_input_producer([filename])  # create a filename queue

    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)  # returns the key and the serialized record
    features = tf.parse_single_example(serialized_example,
                                       features={
                                           'label': tf.FixedLenFeature([], tf.int64),
                                           'img_raw': tf.FixedLenFeature([], tf.string),
                                       })  # extract the image data and the label

    img = tf.decode_raw(features['img_raw'], tf.uint8)
    img = tf.reshape(img, [224, 224, 3])  # reshape to a 224x224 image with 3 channels
    img = tf.cast(img, tf.float32) * (1. / 255) - 0.5  # scale pixel values to [-0.5, 0.5]
    label = tf.cast(features['label'], tf.int32)  # cast the label to int32
    print(img, label)
    return img, label

images, labels = read_and_decode("/home/henson/Desktop/vgg/TFrecord.tfrecords")
print(images, labels)
images, labels = tf.train.shuffle_batch([images, labels], batch_size=20, capacity=16*20, min_after_dequeue=8*20)

I believe I have resized img to 224*224 and reshaped it to [224, 224, 3], but it does not work. How can I make it work?
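The two numbers in the error message can be checked by hand. The sketch below is only arithmetic: it multiplies out the requested shape and looks for square image shapes that would hold exactly 37632 values. The helper names `num_values` and `square_candidates` are made up for illustration. Reading the result as "the stored images are 112 x 112 RGB" is an assumption, not something the post confirms; it is worth noting, though, that the writer above targets test.tfrecords while the reader opens TFrecord.tfrecords.

```python
import math


def num_values(shape):
    """Number of elements a tensor of the given shape holds."""
    return math.prod(shape)


def square_candidates(n_values, channels=(1, 3, 4)):
    """Return (side, side, c) shapes whose element count equals n_values."""
    found = []
    for c in channels:
        if n_values % c != 0:
            continue
        pixels = n_values // c
        side = math.isqrt(pixels)
        if side * side == pixels:
            found.append((side, side, c))
    return found


requested_shape = (224, 224, 3)
actual_values = 37632  # taken from the error message

print(num_values(requested_shape))        # 150528
print(square_candidates(actual_values))   # [(112, 112, 3)]
```

If that reading is right, the decoded bytes do not come from a 224 x 224 x 3 image at all, so the shape passed to `tf.reshape` and the size used when the records were written have to be brought into agreement.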

Answer 1

小智 6

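This problem basically comes down to the shapes in the CNN architecture. Suppose I define the architecture shown in the figure, and in the code we define the weights and biases in the following way. Looking at the weights, let's start:

wc1: in this layer I define 32 filters of size 3x3 to be applied.

wc2: in this layer I define 64 filters of size 3x3 to be applied.

wc3: in this layer I define 128 filters of size 3x3 to be applied.

wd1: 38*38*128 is the interesting one (where does it come from?).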
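As a rough sketch of what those weight definitions amount to, the snippet below writes out the kernel shapes in TensorFlow's `[height, width, in_channels, out_channels]` layout. It assumes a single-channel input (as in the 300 x 300 x 1 example used in this answer); `conv_weight_shape` is a hypothetical helper, and the number of output units of wd1 is left out because the answer does not state it.

```python
import math


def conv_weight_shape(kernel, in_channels, out_channels):
    """Kernel shape [k, k, in, out] for a 2-D convolution (TensorFlow layout)."""
    return (kernel, kernel, in_channels, out_channels)


# (name, input depth, number of filters); input depth 1 is an assumption.
layers = [("wc1", 1, 32), ("wc2", 32, 64), ("wc3", 64, 128)]

shapes = {name: conv_weight_shape(3, cin, cout) for name, cin, cout in layers}
for name, shape in shapes.items():
    print(name, shape, "weights:", math.prod(shape))

# Number of inputs the first dense layer (wd1) expects after flattening.
wd1_inputs = 38 * 38 * 128
print("wd1 inputs:", wd1_inputs)
```

Each layer's input depth is the previous layer's filter count, which is why only the last filter count (128) appears in the size of wd1.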

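The architecture also uses max pooling. Refer to the architecture figure at each step. Let me walk through it:

1. Suppose your input image is 300 x 300 x 1 (in the figure it is 28 x 28 x 1).

2. (With the stride set to 1 and 'same' padding, which the unchanged 300 x 300 size implies.) Each filter produces one 300 x 300 x 1 image, so after applying 32 filters of 3x3 we have 32 images of 300 x 300, and the stacked output is 300 x 300 x 32.

3. After max pooling (with strides = 2; it depends on your definition, but it is usually 2) the size goes from 300 x 300 x 32 to 150 x 150 x 32.

4. (With the stride set to 1.) Each filter now spans the whole 150 x 150 x 32 input and produces one 150 x 150 image, so after applying 64 filters of 3x3 we have 64 images of 150 x 150, and the stacked output is 150 x 150 x 64.

5. After max pooling (strides = 2, as before) the size goes from 150 x 150 x 64 to 75 x 75 x 64.

6. (With the stride set to 1.) Each filter now spans the 75 x 75 x 64 input, so after applying 128 filters of 3x3 we have 128 images of 75 x 75, and the stacked output is 75 x 75 x 128.

7. Before the last max pooling, the 75 x 75 size is odd, so it has to be padded first (if padding = 'same' is defined), which makes it 76 x 76 (even). With strides = 2 the size then goes from 76 x 76 x 128 to 38 x 38 x 128.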
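The walk-through above can be reproduced with a few lines of shape arithmetic. This is a minimal sketch, assuming stride-1 convolutions with 'same' padding (spatial size unchanged, depth equal to the number of filters) and stride-2 max pooling with 'same' padding (spatial size becomes the ceiling of half). The function names are hypothetical and no TensorFlow calls are involved.

```python
import math


def conv_same(shape, filters):
    """Stride-1 'same' convolution: height/width unchanged, depth = filters."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool_same(shape, stride=2):
    """'same' max pooling: height/width become ceil(size / stride)."""
    height, width, depth = shape
    return (math.ceil(height / stride), math.ceil(width / stride), depth)


def trace(input_shape, filter_counts):
    """Return the shape after every conv and pool stage, input first."""
    shape = input_shape
    history = [shape]
    for filters in filter_counts:
        shape = conv_same(shape, filters)
        history.append(shape)
        shape = max_pool_same(shape)
        history.append(shape)
    return history


history = trace((300, 300, 1), [32, 64, 128])
for shape in history:
    print(shape)

final = history[-1]
print("flattened size:", final[0] * final[1] * final[2])  # 38 * 38 * 128 = 184832
```

Under the same assumptions, `trace((224, 224, 3), [32, 64, 128])` ends at 28 x 28 x 128, so a dense layer sized for 38*38*128 would only fit a 300 x 300 input.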

Now look at 'wd1' in the code picture: there it is 38*38*128.
