Keras iterator with augmented images and other features

Question

Lam*_*sti 10 python conv-neural-network keras data-augmentation

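Say you have a dataset with images plus a .csv holding some data for each image. Your goal is to create a NN with a convolutional branch and another branch (in my case an MLP).

Now, there are plenty of guides on how to build the network (here is one, and another); that is not the problem.

The issue here is how to create an iterator of the form [[convolution_input, other_features], target] when convolution_input comes from a Keras ImageDataGenerator flow that adds augmented images.

More specifically, when the n-th image (which may or may not be an augmented one) is fed to the NN, I want its matching features inside other_features.

I found a few attempts at doing exactly this (here and here; the second looks promising, but I could not figure out how to handle augmented images), but they do not seem to account for the dataset manipulation a Keras generator may perform.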
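To make the requirement concrete, here is a minimal NumPy sketch of the pairing that has to survive shuffling. It is not Keras code; the array names and toy values are made up purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: 5 "images" of shape 2x2 and 3 extra features per image.
images = np.arange(5 * 2 * 2, dtype="float32").reshape(5, 2, 2)
other_features = np.arange(5 * 3, dtype="float32").reshape(5, 3)

# A shuffling generator visits the rows in some random order each epoch.
order = rng.permutation(len(images))

# Indexing both arrays with the same order keeps the features of row n
# attached to image n, whatever the order turns out to be.
shuffled_images = images[order]
shuffled_features = other_features[order]
```

Any iterator that indexes the images and the extra columns with the same index array preserves this pairing; augmentation only changes the pixel values, not which row an image belongs to.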

Answer 1

ven*_*nan 5

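Assume you have a csv in which both the image names and the other features are listed.

Here id is the image name, followed by the features, followed by your target (a class for classification, a number for regression).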

|         id          | feat1 | feat2 | feat3 | class |
|---------------------|-------|-------|-------|-------|
| 1_face_IMG_NAME.jpg |   1   |   0   |   1   |   A   |
| 3_face_IMG_NAME.jpg |   1   |   0   |   1   |   B   |
| 2_face_IMG_NAME.jpg |   1   |   0   |   1   |   A   |
|         ...         |  ...  |  ...  |  ...  |  ...  |


First, let's define a data generator; we can wrap it afterwards.

Let's read the csv into a pandas DataFrame and use Keras's flow_from_dataframe to read from that DataFrame.

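import pandas
# with standalone Keras, import from keras.preprocessing.image instead
from tensorflow.keras.preprocessing.image import ImageDataGenerator

batch_size = 1

df = pandas.read_csv("dummycsv.csv")
datagen = ImageDataGenerator(rescale=1/255.)
generator = datagen.flow_from_dataframe(
                df,
                directory="out/",
                x_col="id",
                y_col=list(df.columns[1:]),  # every column except id
                class_mode="raw",
                batch_size=batch_size)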


You can always add your augmentation settings in ImageDataGenerator.
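As a rough sketch of what adding augmentation could look like: the parameter names below are real ImageDataGenerator arguments, but the values, the AUGMENTATION dict and the build_datagen helper are hypothetical and only meant as a starting point.

```python
# Hypothetical augmentation settings; the values are illustrative, not tuned.
AUGMENTATION = dict(
    rescale=1 / 255.0,
    rotation_range=15,       # random rotations, in degrees
    width_shift_range=0.1,   # horizontal shift, as a fraction of the width
    height_shift_range=0.1,  # vertical shift, as a fraction of the height
    zoom_range=0.1,
    horizontal_flip=True,
)


def build_datagen(**overrides):
    """Build an ImageDataGenerator from AUGMENTATION plus any overrides."""
    # Imported here so the settings can be inspected without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator
    return ImageDataGenerator(**{**AUGMENTATION, **overrides})
```

Calling build_datagen() could then stand in for the ImageDataGenerator(rescale=1/255.) call above. The transforms are applied to the images only; the y values of each row are passed through unchanged, so the extra features stay aligned with their image.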

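Things to note in the flow_from_dataframe call above:

x_col = the column holding the image names

y_col = normally the column with the class name, but let's override that later by first passing all the other columns of the csv, i.e. feat1, feat2, ... up to the class label.

class_mode = raw, which tells the generator to return all the values in y as they are.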
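To see what the y part of a batch looks like under these settings, here is a small pandas sketch. It assumes that raw mode hands back the selected columns as a plain array (effectively df[y_col].values); the inline csv is a made-up stand-in for dummycsv.csv.

```python
import io

import pandas as pd

# Tiny stand-in for dummycsv.csv, following the table layout above.
csv_text = """id,feat1,feat2,feat3,class
1_face_IMG_NAME.jpg,1,0,1,A
3_face_IMG_NAME.jpg,1,0,1,B
2_face_IMG_NAME.jpg,1,0,1,A
"""
df = pd.read_csv(io.StringIO(csv_text))

y_cols = list(df.columns[1:])  # every column except "id"
raw_y = df[y_cols].values      # ints mixed with strings, so dtype=object

# Split the raw array back into the two things the network needs.
features = raw_y[:, :-1].astype("float32")  # feat1..feat3
targets = raw_y[:, -1]                      # class labels
```

Because the class column is mixed in with the numeric features, the raw array has dtype object, which is why the features are cast back to float32 after slicing.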

Now let's wrap the generator above in a new one that returns [img, otherfeature], [target].

Here is the code, with comments as explanation:

import numpy as np

def my_custom_generator():
    # number of batches drawn in the current epoch
    count = 0
    while True:
        if count == len(generator):
            # len(generator) is the number of batches per epoch, so one
            # pass is complete: reset and keep going, because Keras
            # expects a training generator to loop indefinitely
            generator.reset()
            count = 0
        count += 1
        # get the next batch from the generator
        data = next(generator)

        # the data looks like (images, other_cols), one row per sample
        imgs = []
        cols = []
        targets = []

        # iterate over the batch (the last one of an epoch may be smaller)
        # and append the pieces to the corresponding lists
        for k in range(len(data[0])):
            # the first array contains all images
            imgs.append(data[0][k])

            # the second array contains all features with last column as class, so [:-1]
            cols.append(data[1][k][:-1])

            # the last column in the second array from data is the class
            targets.append(data[1][k][-1])

        # yield the result in the expected form (assumes numeric features)
        yield [np.array(imgs), np.array(cols, dtype="float32")], np.array(targets)

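The same idea can be written without the per-sample loop by slicing the whole batch at once. The sketch below is one possible variant, not the code from this answer: multi_input_generator and fake_batches are hypothetical names, and fake_batches only imitates the (images, other columns) batches so the wrapper can be tried without Keras or image files.

```python
import numpy as np


def multi_input_generator(base_iter):
    """Turn (images, rest) batches into ([images, features], targets)."""
    for images, rest in base_iter:
        rest = np.asarray(rest)
        features = rest[:, :-1].astype("float32")  # all but the last column
        targets = rest[:, -1]                      # last column is the target
        yield [np.asarray(images), features], targets


def fake_batches(n_batches=3, batch_size=2):
    """Hypothetical stand-in for the Keras iterator, for a quick check."""
    for _ in range(n_batches):
        images = np.zeros((batch_size, 8, 8, 3), dtype="float32")
        rest = np.array([[1, 0, 1, "A"]] * batch_size, dtype=object)
        yield images, rest
```

With the real iterator this would be used as multi_input_generator(generator), on the assumption that iterating the Keras iterator yields batches endlessly. Slicing with [:, :-1] also copes with a smaller final batch without needing to know batch_size.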

Create a similar function for your validation generator. If you need to split your DataFrame, use train_test_split, create two generators, and wrap each of them.
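For illustration, here is a split done with plain pandas instead of scikit-learn's train_test_split (a swapped-in technique, chosen only to keep the sketch dependency-light). The toy frame, the 80/20 ratio and the names train_df and val_df are assumptions.

```python
import pandas as pd

# Toy frame with the same columns as the csv above (values are made up).
df = pd.DataFrame({
    "id": [f"{i}_face_IMG_NAME.jpg" for i in range(10)],
    "feat1": [1] * 10,
    "feat2": [0] * 10,
    "feat3": [1] * 10,
    "class": ["A", "B"] * 5,
})

# 80/20 split: sample the training rows, keep the rest for validation.
train_df = df.sample(frac=0.8, random_state=0)
val_df = df.drop(train_df.index)
```

Each frame would then go to its own flow_from_dataframe call, typically with augmentation enabled only for the training one, and each resulting iterator gets its own wrapper function.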

Pass the function to model.fit_generator like this:

model.fit_generator(my_custom_generator(),.....other params)

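One of the "other params" is worth spelling out: a plain Python generator has no length, so Keras needs steps_per_epoch to know when an epoch ends. A minimal sketch of computing it follows; the helper name is hypothetical.

```python
import math


def steps_per_epoch(n_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(n_samples / batch_size)


# Hypothetical call, assuming df and batch_size from the code above:
# model.fit_generator(my_custom_generator(),
#                     steps_per_epoch=steps_per_epoch(len(df), batch_size),
#                     epochs=10)
```

Note that fit_generator is deprecated from TensorFlow 2.1 onward; there, model.fit accepts the generator directly with the same steps_per_epoch argument.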

Archived:	5 years, 10 months ago
Views:	744
Last activity:	5 years, 6 months ago