anw*_*war 8 python image keras tensorflow google-colaboratory
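I am training a classifier and have made sure that all the images are jpg, but I still get this error: InvalidArgumentError: Unknown image file format. One of JPEG, PNG, GIF, BMP required. [[{{node decode_image/DecodeImage}}]] [[IteratorGetNext]] [Op:__inference_train_function_1481]
I tried training on a smaller dataset, also all jpg, and there was no problem.
Here is the code: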
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load the training images; labels are inferred from the sub-directory names
dataset = keras.preprocessing.image_dataset_from_directory(
    '/content/drive/MyDrive/fi_dataset/train', batch_size=64, image_size=(200, 200))

dense = keras.layers.Dense(units=16)
inputs = keras.Input(shape=(None, None, 3))

# CenterCrop and Rescaling live in keras.layers from TF 2.6 on; older
# releases expose them under keras.layers.experimental.preprocessing
x = layers.CenterCrop(height=150, width=150)(inputs)
x = layers.Rescaling(scale=1.0 / 255)(x)
x = layers.Conv2D(filters=32, kernel_size=(3, 3), activation="relu")(x)
x = layers.MaxPooling2D(pool_size=(3, 3))(x)
x = layers.Conv2D(filters=32, kernel_size=(3, 3), activation="relu")(x)
x = layers.MaxPooling2D(pool_size=(3, 3))(x)
x = layers.Conv2D(filters=32, kernel_size=(3, 3), activation="relu")(x)
x = layers.GlobalAveragePooling2D()(x)

num_classes = 1
outputs = layers.Dense(num_classes, activation="sigmoid")(x)
model = keras.Model(inputs=inputs, outputs=outputs)

# Smoke test: push one random batch through the model
data = np.random.randint(0, 256, size=(64, 200, 200, 3)).astype("float32")
processed_data = model(data)

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=[keras.metrics.binary_accuracy])
history = model.fit(dataset, epochs=10)
小智 11
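In fact, a file may have a .jpg extension yet be in TIFF format. To take this a step further you can add some code......
If you want to check the image type rather than the extension, try this modified version of the extension-checking code from the other answer: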
import os
import cv2
import imghdr  # standard library; deprecated in Python 3.11, removed in 3.13

def check_images(s_dir, ext_list):
    """Return (bad_images, bad_ext) for the class sub-directories of s_dir."""
    bad_images = []
    bad_ext = []  # kept for compatibility with the caller; unused in this version
    for klass in os.listdir(s_dir):
        klass_path = os.path.join(s_dir, klass)
        print('processing class directory ', klass)
        if os.path.isdir(klass_path):
            for f in os.listdir(klass_path):
                f_path = os.path.join(klass_path, f)
                if os.path.isfile(f_path):
                    # imghdr looks at the file contents, not at the extension
                    tip = imghdr.what(f_path)
                    if tip not in ext_list:
                        bad_images.append(f_path)
                        continue
                    # cv2.imread returns None (it does not raise) if decoding fails
                    img = cv2.imread(f_path)
                    if img is None:
                        print('file ', f_path, ' is not a valid image file')
                        bad_images.append(f_path)
                else:
                    print('*** fatal error, you have a sub directory ', f, ' in class directory ', klass)
        else:
            print('*** WARNING*** you have files in ', s_dir, ' it should only contain sub directories')
    return bad_images, bad_ext

source_dir = r'c:\temp\people\storage'
good_exts = ['jpg', 'png', 'jpeg', 'gif', 'bmp']  # list of acceptable image types
bad_file_list, bad_ext_list = check_images(source_dir, good_exts)
if len(bad_file_list) != 0:
    print('improper image files are listed below')
    for path in bad_file_list:
        print(path)
else:
    print(' no improper image files were found')
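Python's standard library contains many modules, and one that can help here is imghdr. It lets you identify the type of image contained in a file, a byte stream, or a path-like object. imghdr recognizes the following image types: rgb, gif, pbm, pgm, ppm, tiff, rast, xbm, jpeg / jpg, bmp, png, webp and exr.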
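Since imghdr was deprecated in Python 3.11 and removed in 3.13, the same content-based check can be approximated by comparing the first bytes of each file against well-known format signatures. The following is only a minimal sketch: `sniff_image_type`, `_SIGNATURES` and `TF_DECODABLE` are hypothetical names introduced here, and the signature table covers only a handful of formats rather than everything imghdr knew about.

```python
from typing import Optional

# (leading bytes, type name) pairs for a few common formats.
# This table is an illustrative subset, not a complete registry.
_SIGNATURES = (
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"BM", "bmp"),
    (b"II*\x00", "tiff"),   # little-endian TIFF
    (b"MM\x00*", "tiff"),   # big-endian TIFF
)

# The formats named in the TensorFlow error message from the question.
TF_DECODABLE = {"jpeg", "png", "gif", "bmp"}


def sniff_image_type(path: str) -> Optional[str]:
    """Guess the image type from the file's leading bytes, ignoring its extension."""
    with open(path, "rb") as fh:
        head = fh.read(12)
    for magic, name in _SIGNATURES:
        if head.startswith(magic):
            return name
    # WebP files are RIFF containers tagged "WEBP" at byte offset 8.
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    return None
```

A file would then be flagged when `sniff_image_type(f_path) not in TF_DECODABLE`, which catches a TIFF or WebP file that merely carries a .jpg extension. A matching signature does not prove the rest of the file is intact, so a decode check is still worthwhile.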
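When you say you are sure they are jpg, how did you verify that? Just because the extension is .jpg does not mean the file is really a JPEG image. I suggest running the code below to see which images may be defective.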
import os
import cv2

def check_images(s_dir, ext_list):
    """Return (bad_images, bad_ext) for the class sub-directories of s_dir."""
    bad_images = []
    bad_ext = []
    for klass in os.listdir(s_dir):
        klass_path = os.path.join(s_dir, klass)
        print('processing class directory ', klass)
        if os.path.isdir(klass_path):
            for f in os.listdir(klass_path):
                f_path = os.path.join(klass_path, f)
                # Extension check: the text after the last dot, lower-cased
                index = f.rfind('.')
                ext = f[index + 1:].lower()
                if ext not in ext_list:
                    print('file ', f_path, ' has an invalid extension ', ext)
                    bad_ext.append(f_path)
                if os.path.isfile(f_path):
                    # cv2.imread returns None (it does not raise) if decoding fails
                    img = cv2.imread(f_path)
                    if img is None:
                        print('file ', f_path, ' is not a valid image file')
                        bad_images.append(f_path)
                else:
                    print('*** fatal error, you have a sub directory ', f, ' in class directory ', klass)
        else:
            print('*** WARNING*** you have files in ', s_dir, ' it should only contain sub directories')
    return bad_images, bad_ext

source_dir = r'c:\temp\people\storage'
good_exts = ['jpg', 'png', 'jpeg', 'gif', 'bmp']  # list of acceptable extensions
bad_file_list, bad_ext_list = check_images(source_dir, good_exts)
if len(bad_file_list) != 0:
    print('improper image files are listed below')
    for path in bad_file_list:
        print(path)
else:
    print(' no improper image files were found')
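Even this may not be enough, because it only checks the file extension. A file may in fact have a .jpg extension yet be in TIFF format. To go a step further, you could add code so that, when the extension is not in the list of good extensions, you read the image and, if it is valid, use cv2 to convert it to jpg and write it back to the file.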
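The conversion step described above might look roughly like the following. This is a sketch rather than the answerer's own code: `jpeg_path_for` and `convert_to_jpeg` are hypothetical helper names, and it assumes the opencv-python package is installed. It relies on `cv2.imread` decoding by file content and `cv2.imwrite` choosing the encoder from the target extension.

```python
import os
from typing import Optional


def jpeg_path_for(path: str) -> str:
    """Return the same path with its extension replaced by .jpg."""
    root, _ = os.path.splitext(path)
    return root + ".jpg"


def convert_to_jpeg(path: str) -> Optional[str]:
    """Re-encode a readable image as JPEG; return the new path, or None on failure."""
    import cv2  # third-party (opencv-python), imported lazily

    img = cv2.imread(path)  # returns None when the file cannot be decoded
    if img is None:
        return None
    target = jpeg_path_for(path)
    if not cv2.imwrite(target, img):  # returns False when writing fails
        return None
    if target != path:
        os.remove(path)  # drop the original once the JPEG copy exists
    return target
```

Calling `convert_to_jpeg(f_path)` on each flagged file rewrites a TIFF that is named .jpg in place as a real JPEG. With its default flags `cv2.imread` yields a 3-channel image, so any alpha channel is lost, and files it cannot decode are left untouched for manual inspection.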