My goal is to do big things with TensorFlow, but I'm trying to start small.
I have small greyscale squares (with a little noise) and I want to classify them according to their colour (e.g. 3 categories: black, grey, white). I wrote a Python class to generate the squares and 1-hot vectors, and modified their basic MNIST example to feed them in.
But it won't learn anything: for example, with 3 categories it always guesses ≈33% correct.
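The generate_data.generate_greyscale module used by the training script below isn't included in the post; for reference, a minimal sketch of what such a generator class might look like - the class name, constructor arguments and the (xs, ys) return value are taken from the call sites, while the noise model and one-hot encoding are assumptions:

import numpy as np

class GenerateGreyScale(object):
    def __init__(self, width, height, num_categories, noise):
        self.width = width
        self.height = height
        self.num_categories = num_categories
        self.noise = noise

    def generate_data(self, num_examples):
        # pick a category per example (0 = black ... num_categories-1 = white)
        labels = np.random.randint(0, self.num_categories, num_examples)
        # evenly spaced grey level in [0, 1] for each category
        levels = labels.astype(np.float32) / (self.num_categories - 1)
        # flat images filled with that grey level, plus a little uniform noise
        xs = np.tile(levels[:, None], (1, self.width * self.height))
        xs += np.random.uniform(-self.noise, self.noise, xs.shape).astype(np.float32)
        xs = np.clip(xs, 0.0, 1.0)
        # one-hot encode the labels
        ys = np.zeros((num_examples, self.num_categories), dtype=np.float32)
        ys[np.arange(num_examples), labels] = 1.0
        return xs, ys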
import tensorflow as tf
import generate_data.generate_greyscale
data_generator = generate_data.generate_greyscale.GenerateGreyScale(28, 28, 3, 0.05)
ds = data_generator.generate_data(10000)
ds_validation = data_generator.generate_data(500)
xs = ds[0]
ys = ds[1]
num_categories = data_generator.num_categories
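# single-layer softmax regression, as in the TensorFlow MNIST beginners tutorial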
x = tf.placeholder("float", [None, 28*28])
W = tf.Variable(tf.zeros([28*28, num_categories]))
b = tf.Variable(tf.zeros([num_categories]))
y = tf.nn.softmax(tf.matmul(x,W) + b)
y_ = tf.placeholder("float", [None,num_categories])
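# hand-rolled cross-entropy; note tf.log(y) is -inf wherever the softmax output is exactly 0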
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
# let batch_size = 100 --> therefore there are 100 batches of training data …
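(The rest of the script is cut off above. For reference only, a typical mini-batch loop for this kind of setup - an assumption, not the original code, reusing xs, ys, sess, train_step, x and y_ from the script above - would look something like:)

batch_size = 100
num_batches = len(xs) // batch_size
for i in range(num_batches):
    batch_xs = xs[i * batch_size:(i + 1) * batch_size]
    batch_ys = ys[i * batch_size:(i + 1) * batch_size]
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})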
I modified the MNIST (28x28) convnet tutorial code to accept larger images (150x150), but when I try to train I get this error (see the end of the post for the full stack trace):
W tensorflow/core/common_runtime/executor.cc:1076] 0x2e97d30 Compute status: Invalid argument: ReluGrad input is not finite. : Tensor had NaN values
Here is my code. Worryingly, I get the same error whether I use image data from disk or generate noisy red/blue/green squares and try to classify them by colour. The code that generates the RGB data is separate from the code that scans a directory of JPG image data, so either there is some systematic error in the way I load my own data, or there is something wrong with the architecture I came up with. (I could include those modules, but I'm worried it would make this post unwieldy.)
Edit: I tried the same code with moderately large images (30x30) and it works, so perhaps the error has something to do with the much higher dimensionality of the (150x150) problem?
import tensorflow as tf
import numpy as np
import data.image_loader
###############################
##### Set hyperparameters #####
###############################
num_epochs = 2
width = 150
height = 150
num_categories = 2
num_channels = 3
batch_size = 100 # for my sanity
num_training_examples = 2000
num_test_examples = 200
num_batches = num_training_examples/batch_size
####################################################################################
##### It's convenient to define some methods to perform frequent routine tasks …
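For what it's worth, with the old TF 0.x API this "ReluGrad input is not finite" NaN very commonly comes from a hand-rolled cross-entropy such as -tf.reduce_sum(y_ * tf.log(y)) blowing up once the softmax output hits exactly 0, which bigger inputs make more likely. A sketch of the two usual workarounds, with the tensor names (logits, y_) assumed rather than taken from the truncated code:

import tensorflow as tf

# assumed names: `logits` is the final layer's pre-softmax output, `y_` the one-hot labels
logits = tf.placeholder("float", [None, 2])
y_ = tf.placeholder("float", [None, 2])

# Option 1: clip the softmax output away from zero before taking the log
y = tf.nn.softmax(logits)
cross_entropy = -tf.reduce_sum(y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0)))

# Option 2: let TensorFlow compute softmax and cross-entropy together, which is
# numerically stable (TF 0.x argument order: logits first, then labels)
cross_entropy_stable = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(logits, y_))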
I'm trying to create an LMDB database for my Caffe machine-learning project, but LMDB throws an error on the very first attempt to insert a data point, saying the environment mapsize is full.
Here is the code that attempts to populate the database:
import numpy as np
from PIL import Image
import os
import lmdb
import random
# my data structure for holding image/label pairs
from serialization import DataPoint
class LoadImages(object):
def __init__(self, image_data_path):
self.image_data_path = image_data_path
self.dirlist = os.listdir(image_data_path)
# find the number of images that are to be read from disk
# in this case there are 370 images.
num = len(self.dirlist)
# shuffle the list of image files so that they are read in a random order
random.shuffle(self.dirlist) …
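For reference, this error usually just means the LMDB environment was opened with the default map_size (about 10 MB in the Python binding), which is easy to exceed with image data; passing an explicit map_size to lmdb.open() is the usual remedy. A minimal sketch, with the path and the size as assumptions - on Linux, map_size is only an upper bound on a sparse file, not memory allocated up front:

import lmdb

# open the environment with a generous map_size; the default is only ~10 MB
env = lmdb.open('my_lmdb', map_size=int(1e12))

with env.begin(write=True) as txn:
    # keys and values must be bytes
    txn.put(b'00000000', b'serialized datum goes here')

env.close()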