Posts by Hen*_*hro

Tensorflow: Input pipeline with sparse data for the SVM estimator

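Introduction:

I am trying to train the TensorFlow SVM estimator tensorflow.contrib.learn.python.learn.estimators.svm with sparse data. An example of its use with sparse data is in the GitHub repo at tensorflow/contrib/learn/python/learn/estimators/svm_test.py#L167 (I am not allowed to post more links, so this is the relative path).

The SVM estimator expects the parameters example_id_column and feature_columns, where each feature column must derive from the class FeatureColumn, for example tf.contrib.layers.feature_column.sparse_column_with_hash_bucket. See the GitHub repo at tensorflow/contrib/learn/python/learn/estimators/svm.py#L85 and the documentation on tensorflow.org at python/contrib.layers#Feature_columns.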

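Question:

How do I set up my input pipeline to format sparse data so that I can use one of the tf.contrib.layers feature_columns as input for the SVM estimator?
What would a dense input function with many features look like?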
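To make the dense case concrete, here is a minimal sketch of one possible layout: all 123 features packed into a single [batch_size, 123] matrix under one feature key, rather than 123 separate scalar features. The helper name densify and the example rows are hypothetical, and the code is plain Python with no TensorFlow calls. Whether the SVM estimator accepts such a matrix (for instance through a column like tf.contrib.layers.real_valued_column with a dimension argument) is an assumption to verify, not something this post confirms.

```python
NUM_FEATURES = 123  # feature count of the a1a dataset


def densify(sparse_rows, num_features=NUM_FEATURES):
    """Expand {zero-based column: value} dicts into fixed-width dense rows.

    `densify` is a hypothetical helper used only for illustration.
    """
    dense = []
    for row in sparse_rows:
        dense_row = [0.0] * num_features  # absent features default to 0.0
        for col, value in row.items():
            dense_row[col] = value
        dense.append(dense_row)
    return dense


# Two made-up examples, each with two active binary features.
sparse_rows = [{2: 1.0, 10: 1.0}, {4: 1.0, 122: 1.0}]
dense = densify(sparse_rows)
```

Each row of dense has exactly 123 entries, so a batch has a fixed shape regardless of how many features are active per example.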

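Background

The data I use is the a1a dataset from the LIBSVM website. The dataset has 123 features (which would correspond to 123 feature_columns if the data were dense). I wrote a user op to read the data, similar to tf.decode_csv() but for the LIBSVM format. The op returns the labels as a dense tensor and the features as a sparse tensor. My input pipeline: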

import tensorflow as tf

NUM_FEATURES = 123
batch_size = 200

# my op to parse the libsvm data
decode_libsvm_module = tf.load_op_library('./libsvm.so')

def input_pipeline(filename_queue, batch_size):
    with tf.name_scope('input'):
        reader = tf.TextLineReader(name="TextLineReader_")
        _, libsvm_row = reader.read(filename_queue, name="libsvm_row_")
        min_after_dequeue = 1000
        capacity = min_after_dequeue + 3 * batch_size
        batch = tf.train.shuffle_batch([libsvm_row], batch_size=batch_size,
                                       capacity=capacity, …
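
The custom op in libsvm.so is not shown in the post, so the following is only a plain-Python stand-in sketching what such a decoder might produce: dense labels plus the (indices, values, dense_shape) triple that tf.SparseTensor takes. The function names parse_libsvm_line and batch_to_coo and the sample lines are hypothetical; the sketch assumes the usual LIBSVM text layout of a label followed by 1-based index:value pairs.

```python
NUM_FEATURES = 123  # feature count of the a1a dataset


def parse_libsvm_line(line):
    """Parse one '<label> <index>:<value> ...' line.

    Returns the integer label and a list of (zero-based column, value)
    pairs. LIBSVM indices are 1-based, so 1 is subtracted from each.
    """
    fields = line.split()
    label = int(float(fields[0]))  # handles '+1', '-1', '1.0'
    pairs = []
    for field in fields[1:]:
        index, value = field.split(":")
        pairs.append((int(index) - 1, float(value)))
    return label, pairs


def batch_to_coo(lines, num_features=NUM_FEATURES):
    """Turn a batch of LIBSVM lines into labels plus a COO-style triple."""
    lines = list(lines)
    labels, indices, values = [], [], []
    for row, line in enumerate(lines):
        label, pairs = parse_libsvm_line(line)
        labels.append(label)
        for col, value in pairs:
            if not 0 <= col < num_features:
                raise ValueError("feature index out of range: %d" % (col + 1))
            indices.append([row, col])  # [row in batch, feature column]
            values.append(value)
    dense_shape = [len(lines), num_features]
    return labels, indices, values, dense_shape


# Made-up lines in LIBSVM format, not taken from the real a1a file.
sample = ["-1 3:1 11:1 14:1", "+1 5:1 123:1"]
labels, indices, values, dense_shape = batch_to_coo(sample)
```

The three lists indices, values, and dense_shape line up with the constructor arguments of tf.SparseTensor, which is the form a sparse feature column consumes; how the values are then keyed in the features dict depends on the feature column chosen.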


classification svm sparse-matrix tensorflow
