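I am working through the TensorFlow tutorial, which uses a "weird" format to load the data. I would like to use the data in NumPy or pandas format instead, so that I can compare the results with scikit-learn.
I got the digit recognition data from Kaggle: https://www.kaggle.com/c/digit-recognizer/data .
Here is the code from the TensorFlow tutorial (which works fine):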
# Stuff from tensorflow tutorial
import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, W) + b)
cross_entropy = -tf.reduce_sum(y_ * tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
Here I read the data, strip off the target variable, and split the data into testing and training datasets (this all works fine):
# Read dataframe from training data
csvfile='train.csv'
from pandas import DataFrame, read_csv
df = read_csv(csvfile)
# Strip off the target data and make it a separate …
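The listing above is cut off, so the following is not the original code. It is only a minimal sketch of what the "strip off the target and split" step could look like, under these assumptions: the Kaggle CSV has a `label` column followed by 784 pixel columns, and the targets need to be one-hot encoded to match the `[None, 10]` shape of the `y_` placeholder. The helper names `frame_to_arrays` and `train_test_split_arrays` are hypothetical, and a small synthetic DataFrame stands in for `train.csv`.

```python
import numpy as np
import pandas as pd


def frame_to_arrays(df, label_col="label", n_classes=10):
    """Return (features, one_hot_targets) as float32 NumPy arrays.

    Assumes `label_col` holds integer class ids in [0, n_classes).
    """
    labels = df[label_col].to_numpy()
    features = df.drop(columns=[label_col]).to_numpy(dtype=np.float32)
    # Row i of the identity matrix is the one-hot vector for class i.
    one_hot = np.eye(n_classes, dtype=np.float32)[labels]
    return features, one_hot


def train_test_split_arrays(features, targets, test_fraction=0.2, seed=0):
    """Shuffle the rows once and split them into train and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_test = int(len(features) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]
    return (features[train_idx], targets[train_idx],
            features[test_idx], targets[test_idx])


# Synthetic stand-in for read_csv('train.csv'): 20 rows, 784 pixel columns.
pixels = np.random.default_rng(0).integers(0, 256, size=(20, 784))
demo = pd.DataFrame(pixels, columns=[f"pixel{i}" for i in range(784)])
demo.insert(0, "label", np.arange(20) % 10)

X, Y = frame_to_arrays(demo)
X_train, Y_train, X_test, Y_test = train_test_split_arrays(X, Y)
```

Arrays of this shape and dtype can be passed to the placeholders through `feed_dict`, for example `train_step.run(feed_dict={x: X_train, y_: Y_train})`. One thing that may be worth checking when comparing against the tutorial: its own loader yields pixel values scaled to [0, 1], whereas the raw CSV values range from 0 to 255, so dividing the features by 255 might be needed for comparable behavior.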