How do I get the batch_size inside the call() function in TF2?

Question

zza*_*bok 1 keras tensorflow tensorflow-datasets batchsize tensorflow2.0

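I'm trying to get the batch_size inside the call() function of a TF2 model. However, I can't get it, because every method I know of returns either None or a Tensor instead of a tuple of dimensions.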

Here is a short example:

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Model

class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
    
    def call(self, x):
        print(len(x))
        print(x.shape)
        print(tf.size(x))
        print(np.shape(x))
        print(x.get_shape())
        print(x.get_shape().as_list())
        print(tf.rank(x))
        print(tf.shape(x))
        print(tf.shape(x)[0])
        print(tf.shape(x)[1])        
        return tf.random.uniform((2, 10))


m = MyModel()
m.compile(optimizer="Adam", loss="sparse_categorical_crossentropy", metrics=['accuracy'])
m.fit(np.array([[1,2,3,4], [5,6,7,8]]), np.array([0, 1]), epochs=1)

The output is:

Tensor("my_model_26/strided_slice:0", shape=(), dtype=int32)
(None, 4)
Tensor("my_model_26/Size:0", shape=(), dtype=int32)
(None, 4)
(None, 4)
[None, 4]
Tensor("my_model_26/Rank:0", shape=(), dtype=int32)
Tensor("my_model_26/Shape_2:0", shape=(2,), dtype=int32)
Tensor("my_model_26/strided_slice_1:0", shape=(), dtype=int32)
Tensor("my_model_26/strided_slice_2:0", shape=(), dtype=int32)

1/1 [==============================] - 0s 1ms/step - loss: 3.1796 - accuracy: 0.0000e+00

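In this example, I pass a (2, 4) NumPy array as the input and a (2,) array as the target to the model. But as you can see, I can't get the batch_size inside call().

I need it because, in my real model, I have to iterate over tensors whose first dimension is batch_size, and that size is dynamic.

For example, if the dataset size is 10 and the batch size is 3, the last batch will have size 1. So I have to know the batch size dynamically.

Can anyone help me?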

TensorFlow 2.3.3
CUDA 10.2
Python 3.6.9

Answer 1

nes*_*uno 5

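This happens because you are using TensorFlow (which is mandatory, since Keras now lives inside TensorFlow), and when using TensorFlow you need to be aware of the "compilation" of the dynamic graph into a static graph.

In short, when you train with fit(), your call method is (under the hood) executed as if it were decorated with the @tf.function decorator.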

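This decorator:

1. Traces the execution of the Python function.
2. Converts Python operations into TensorFlow operations (e.g. if a > b becomes tf.cond(tf.greater(a, b), something, something_else)).
3. Creates a tf.Graph (the static graph).
4. Executes the static graph it has just created.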

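All of your print calls are executed during the first step (tracing the Python execution), which is why you see the output only once even though you train the model.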
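
The tracing behaviour described above can be observed outside Keras as well. The following is a minimal sketch, assuming a plain function decorated with @tf.function; the names double and trace_log are made up for illustration and are not part of the question's model.

```python
import tensorflow as tf

trace_log = []  # filled by plain Python code, i.e. only while tracing


@tf.function
def double(x):
    trace_log.append(x.shape)  # Python side effect: happens during tracing only
    print("tracing, static shape:", x.shape)
    tf.print("running, dynamic shape:", tf.shape(x))  # graph op: runs on every call
    return x * 2


first = double(tf.zeros((2, 4)))
second = double(tf.ones((2, 4)))  # same shape and dtype, so the traced graph is reused
print("number of traces:", len(trace_log))
```

The Python print and the list append run once, while the function is traced; the tf.print line is part of the graph and runs on both calls.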

That said, to get the runtime (dynamic) shape of a tensor you must use tf.shape(x), and the batch size is simply batch_size = tf.shape(x)[0].

Note that if you want to see the shape (by printing it), you cannot use print; you must use tf.print.
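
To make the difference between the static and the dynamic shape concrete, here is a small sketch. It assumes an input_signature with a None batch dimension, chosen here only to mimic the unknown batch size that Keras sees while tracing; batch_info and seen_static are illustrative names.

```python
import tensorflow as tf

seen_static = []  # static batch dimension recorded at trace time


# The None batch dimension in the signature is an assumption of this sketch:
# it makes the batch size unknown while the function is being traced.
@tf.function(input_signature=[tf.TensorSpec(shape=(None, 4), dtype=tf.float32)])
def batch_info(x):
    seen_static.append(x.shape[0])  # static shape: None during tracing
    batch_size = tf.shape(x)[0]     # dynamic shape: scalar int32 tensor
    tf.print("batch size at run time:", batch_size)
    return batch_size


full = batch_info(tf.zeros((3, 4)))  # a regular batch
last = batch_info(tf.zeros((1, 4)))  # a smaller final batch
print(int(full), int(last))
```

The static dimension is only ever None here, while tf.shape(x)[0] yields the actual size of each batch when the graph runs.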

import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Model


class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()

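    def call(self, x):
        # tf.shape(x) is evaluated when the graph runs, so it holds the real shape.
        shape = tf.shape(x)
        batch_size = shape[0]

        # tf.print is a graph op; unlike print, it runs on every batch.
        tf.print(shape, batch_size)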

        return tf.random.uniform((2, 10))


m = MyModel()
m.compile(
    optimizer="Adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)
m.fit(np.array([[1, 2, 3, 4], [5, 6, 7, 8]]), np.array([0, 1]), epochs=1)
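
For the case raised in the question (10 examples, batch size 3, so a final batch of 1), a possible way to use the dynamic batch size inside call() is sketched below. This is only an illustration under stated assumptions: DynamicBatchModel is a hypothetical model, and tf.map_fn is used here as one way to do per-example work without a Python loop; it is not taken from the original code.

```python
import numpy as np
import tensorflow as tf


class DynamicBatchModel(tf.keras.Model):
    """Hypothetical model whose output size follows the runtime batch size."""

    def call(self, x):
        batch_size = tf.shape(x)[0]  # scalar int32 tensor, known at run time
        x = tf.cast(x, tf.float32)
        # Per-example work without a Python loop: tf.map_fn iterates over axis 0.
        row_sums = tf.map_fn(tf.reduce_sum, x)  # shape (batch_size,)
        # The dynamic size can be used directly to build a shape.
        noise = tf.random.uniform(tf.stack([batch_size, 10]))
        return noise + row_sums[:, tf.newaxis]  # shape (batch_size, 10)


model = DynamicBatchModel()
# 10 examples with batch_size=3 gives batches of 3, 3, 3 and 1.
preds = model.predict(np.ones((10, 4), dtype=np.float32), batch_size=3, verbose=0)
print(preds.shape)
```

Because every shape is derived from tf.shape(x)[0] rather than a hard-coded number, the same call() handles both the full batches and the smaller final one.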

More on static and dynamic shapes: https://pgaleone.eu/tensorflow/2018/07/28/understanding-tensorflow-tensors-shape-static-dynamic/

More on the behaviour of tf.function: https://pgaleone.eu/tensorflow/tf.function/2019/03/21/dissecting-tf-function-part-1/

Disclaimer: I wrote these articles.
