Tag: tensorflow-gpu

Distributed TensorFlow: Check failed: size >= 0

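I am using Keras 2.0.6 with TensorFlow 1.3.0.

My code runs on the Theano backend but fails on the TensorFlow backend:

F tensorflow/core/framework/tensor_shape.cc:241] Check failed: size >= 0 (-14428307456 vs. 0)

Can anyone think of a possible cause?

Thanks!

----Update-----

I tested exactly the same code on my PC with TensorFlow, and it runs perfectly.

However, when I run it on a supercomputer, it throws this error.

Although the error looks like an overflow, it seems unlikely that the same code would overflow on the supercomputer but not on my PC.

I suspect it comes from a bug in TensorFlow's distributed computing code.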

tensorflow tensorflow-gpu

vol*_*fly

2017 08-02

5
votes

1
answer

834
views

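Mixed precision not enabled with TF1.4 on Tesla V100

I am interested in testing my neural network (an autoencoder acting as a generator plus a CNN as a discriminator), which uses 3D conv/deconv layers, on the new Volta architecture to benefit from mixed-precision training. I compiled the latest TensorFlow 1.4 source with CUDA 9 and cuDNN 7.0 and cast all the trainable variables used by my conv/deconv layers to tf.float16. In addition, the sizes of all my input and output tensors are multiples of 8.

Unfortunately, I see no substantial speed improvement with this configuration; training time is roughly the same as with tf.float32. My understanding is that with the Volta architecture and cuDNN 7.0, mixed precision should be detected automatically by TF, enabling the use of Tensor Core math. Am I wrong, or is there something I should do to enable it? I also tried the TF1.5 nightly build, and it seems even slower than my custom 1.4.

I would appreciate it if any developer involved with TensorFlow could answer this.

EDIT: After talking with NVIDIA technical support, it seems that, while TF supports float16, it integrates mixed-precision acceleration only for simple 2D conv ops, and not yet for 3D conv ops.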
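
The "multiples of 8" condition mentioned in the question can be checked mechanically before training. This is a minimal sketch in plain Python; the helper names and the example shapes are hypothetical and not taken from the asker's code, and passing the check does not by itself guarantee that Tensor Core math will be used.

```python
def is_multiple_of(shape, multiple=8):
    """Return True if every known dimension of `shape` is a multiple of `multiple`.

    Unknown dimensions (None, e.g. a dynamic batch size) are skipped.
    """
    return all(dim % multiple == 0 for dim in shape if dim is not None)


def offending_dims(shape, multiple=8):
    """List (axis, size) pairs for dimensions that break the alignment rule."""
    return [(axis, dim) for axis, dim in enumerate(shape)
            if dim is not None and dim % multiple != 0]


# Hypothetical 3D-conv activation shapes: (batch, depth, height, width, channels)
print(is_multiple_of((16, 32, 64, 64, 24)))   # True
print(offending_dims((16, 30, 64, 64, 24)))   # [(1, 30)]
```

Running such a check on every layer's input and output shape makes it easy to rule out misaligned dimensions as the reason for a missing speed-up.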

tesla tensorflow tensorflow-gpu

Jul*_*rda

2017 11-17

5
votes

1
answer

1809
views

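tensorflow-gpu not working: Blas GEMM launch failed

I installed tensorflow-gpu to run my TensorFlow code on my GPU, but I can't get it to run. It keeps giving the error above. Below is my sample code, followed by the error stack trace: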

import tensorflow as tf
import numpy as np

def check(W,X):
    return tf.matmul(W,X)


def main():
    W = tf.Variable(tf.truncated_normal([2,3], stddev=0.01))
    X = tf.placeholder(tf.float32, [3,2])
    check_handle = check(W,X)
    with tf.Session() as sess:
        tf.initialize_all_variables().run()
        num = sess.run(check_handle, feed_dict = 
            {X:np.reshape(np.arange(6), (3,2))})
        print(num)
if __name__ == '__main__':
    main()

My GPU is a fairly good GeForce GTX 1080 Ti with 11 GB of VRAM, and nothing else significant is running on it (just Chrome), as you can see in nvidia-smi:

Fri Aug  4 16:34:49 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 381.22                 Driver Version: 381.22                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage …

nvidia tensorflow cudnn tensorflow-gpu

HIM*_*RAI

2017 08-05

4
votes

2
answers

6549
views

Illegal ambiguous match on configurable attribute "deps" in //tensorflow/core/grappler/costs:utils: when trying to build TensorFlow with GPU support

I tried to build TensorFlow on Ubuntu with the command bazel build --config=opt --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --config=cuda //tensorflow/tools/pip_package:build_pip_package, but got the following error after running ./configure:

Illegal ambiguous match on configurable attribute "deps" in //tensorflow/core/grappler/costs:utils:
@local_config_cuda//cuda:using_clang
@local_config_cuda//cuda:using_nvcc
Multiple matches are not allowed unless one is unambiguously more specialized.
ERROR: Analysis of target '//tensorflow/tools/pip_package:build_pip_package' failed; build aborted
INFO: Elapsed time: 30.960s
FAILED: Build did NOT complete successfully (91 packages loaded)

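I'm not sure what is going on. There seems to be a conflict between using clang and nvcc, but I don't think I specified nvcc anywhere during configuration. ./configure itself went through without any problem.

Strangely, I could not find any report of this error on the internet.

Could it be that I installed CUDA incorrectly, or is it more likely a TensorFlow configuration error?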

Ubuntu 16.04 LTS
CUDA 8.0
cuDnn 7.0.1

ubuntu bazel tensorflow tensorflow-gpu

xji*_*xji

2017 08-12

4
votes

2
answers

1537
views

Keras model with CuDNNLSTM layers does not work on production server

I trained the following model on an AWS p3 instance with GPU acceleration:

x = CuDNNLSTM(128, return_sequences=True)(inputs)
x = Dropout(0.2)(x)
x = CuDNNLSTM(128, return_sequences=False)(x)
x = Dropout(0.2)(x)
predictions = Dense(1, activation='tanh')(x)
model = Model(inputs=inputs, outputs=predictions)

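After training, I saved the model with Keras's save_model function and moved it to a separate production server that has no GPU.

When I try to predict with the model on the production server, it fails with the following error:

No OpKernel was registered to support Op 'CudnnRNN' with these attrs. Registered devices: [CPU], Registered kernels:

I guess this is because the production server has no GPU support, but I was hoping that would not be a problem. Is there any way to use this model on a production server without a GPU?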

python keras tensorflow tensorflow-gpu

amb*_*a88

2018 01-04

4
votes

1
answer

10k
views

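Fine-tuning vs. retraining

I am learning how to use TensorFlow to fine-tune the Inception-v3 model on a custom dataset.

I found two tutorials related to this. One is "How to Retrain Inception's Final Layer for New Categories" and the other is "Train your own image classifier with Inception in TensorFlow with fine-tuning".

I completed the first (retraining) tutorial on a virtual machine, and it took only 2-3 hours. For the same flowers dataset, I am now running the second (fine-tuning) tutorial on a GPU, and training took about a whole day.

What is the difference between retraining and fine-tuning?

My impression was that both use a pre-trained Inception v3 model, remove the old top layer, and train a new top layer on the flower photos. But my understanding may be wrong.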

machine-learning neural-network tensorflow tensorflow-gpu

Nik*_*ik

3
votes

1
answer

1818
views

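How to correctly apply dropout with the train_and_evaluate function in tf.contrib.learn.Experiment

I am using the TensorFlow high-level API tf.contrib.learn.Experiment to run my model. I apply tf.nn.dropout in the model code and use the train_and_evaluate function to train the model. However, I cannot figure out how to set the keep_prob parameter of tf.nn.dropout to 1 only during the evaluation phase of train_and_evaluate (since dropout should normally be applied only during training).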
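
One common pattern is to derive keep_prob from the mode the model function is called with. The sketch below assumes the model is built in a model_fn that receives a mode argument; the mode strings mirror tf.contrib.learn.ModeKeys but are redefined here so the snippet runs without TensorFlow, and keep_prob_for_mode and model_fn_sketch are hypothetical names.

```python
# Mode names mirror tf.contrib.learn.ModeKeys ('train', 'eval', 'infer');
# they are redefined here so the sketch runs without TensorFlow installed.
TRAIN, EVAL, INFER = "train", "eval", "infer"


def keep_prob_for_mode(mode, train_keep_prob=0.5):
    """Return the dropout keep probability: active in training, disabled otherwise."""
    return train_keep_prob if mode == TRAIN else 1.0


def model_fn_sketch(features, mode):
    """Hypothetical stand-in for a model_fn; only the dropout decision is shown."""
    keep_prob = keep_prob_for_mode(mode, train_keep_prob=0.8)
    # In a real model_fn this value would be passed on, for example:
    #   hidden = tf.nn.dropout(hidden, keep_prob=keep_prob)
    return keep_prob


print(model_fn_sketch(None, TRAIN))  # 0.8
print(model_fn_sketch(None, EVAL))   # 1.0
```

Because train_and_evaluate calls the model function separately for training and for evaluation, a decision made from mode at graph-construction time applies to each phase independently.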

python tensorflow tensorflow-gpu

Yan*_*oon

3
votes

1
answer

987
views

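Tensorflow (GPU) vs. Numpy

I have two implementations of linear regression using gradient descent, one in TensorFlow and one in NumPy. I found the NumPy one to be about 3 times faster than the TensorFlow one. Here is my code -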

Tensorflow:

class network_cluster(object):
    def __init__(self, data_frame, feature_cols, label_cols):
        self.init_data(data_frame, feature_cols, label_cols)
        self.init_tensors()

    def init_data(self, data_frame, feature_cols, label_cols):
        self.data_frame = data_frame
        self.feature_cols = feature_cols
        self.label_cols = label_cols

    def init_tensors(self):
        self.features = tf.placeholder(tf.float32)
        self.labels = tf.placeholder(tf.float32)

        self.weights = tf.Variable(tf.random_normal((len(self.feature_cols), len(self.label_cols))))
        self.const = tf.Variable(tf.random_normal((len(self.label_cols),)))

    def linear_combiner(self):
        return tf.add(tf.matmul(self.features, self.weights), self.const)

    def predict(self):
        return self.linear_combiner()

    def error(self):
        return tf.reduce_mean(tf.pow(self.labels - self.predict(), 2), axis = 0)

    def learn_model(self, epocs = 100):
        optimizer = tf.train.AdadeltaOptimizer(1).minimize(self.error())

        error_rcd = []
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for …
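
The NumPy side of the comparison is cut off in this excerpt. For orientation only, here is a minimal sketch of linear regression by batch gradient descent in NumPy. It is not the asker's code: it uses plain gradient descent rather than the Adadelta optimizer in the TensorFlow version, and the function name, synthetic data, and hyperparameters are assumptions.

```python
import numpy as np


def fit_linear_regression(features, labels, learning_rate=0.3, epochs=2000):
    """Minimise the mean squared error of features @ weights + const by gradient descent."""
    n_samples, n_features = features.shape
    weights = np.zeros((n_features, labels.shape[1]))
    const = np.zeros(labels.shape[1])
    for _ in range(epochs):
        residual = features @ weights + const - labels      # shape (n_samples, n_labels)
        # Gradients of the mean squared error with respect to weights and const.
        weights -= learning_rate * 2.0 * (features.T @ residual) / n_samples
        const -= learning_rate * 2.0 * residual.mean(axis=0)
    return weights, const


# Synthetic data (assumption): y = X @ true_w + 0.3 with no noise.
rng = np.random.RandomState(0)
X = rng.rand(200, 3)
true_w = np.array([[1.0], [-2.0], [0.5]])
y = X @ true_w + 0.3

w, c = fit_linear_regression(X, y)
print(np.round(w.ravel(), 3), np.round(c, 3))
```

For a model this small, each TensorFlow step carries session and device-transfer overhead that a pure NumPy loop does not, which is one plausible reason for the gap described in the question.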

python numpy tensorflow tensorflow-gpu

Cal*_*xc1

2
votes

1
answer

676
views

Building TensorFlow from source using bazel on Ubuntu 16.04. Error is ---> Linking of rule '//tensorflow/contrib/lite/toco:toco' failed (Exit 1)

I am just trying to use my GPU with TensorFlow on an Ubuntu 16.04 64-bit system. While running some TensorFlow commands, I noticed that neither my CPU nor my GPU was using all of the available configurations. So I updated the driver for my Nvidia GTX-970 card to x86_64-384.98, the toolkit to 9.0, and cuDNN to 7.0.

I have verified the installation and can write some simple code using python to wrap Cuda language to access the GPU …

python bazel ubuntu-16.04 tensorflow-gpu

jsf*_*a11

2
votes

1
answer

1958
views

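Weights and biases not updating in TensorFlow

I built this neural network to decide whether a house is a good buy or a bad buy. For some reason, the code does not update the weights and biases, and my loss stays the same. Here is my code: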

import pandas as pd
import tensorflow as tf

data = pd.read_csv("E:/workspace_py/datasets/good_bad_buy.csv")

features = data.drop(['index', 'good buy'], axis = 1)
lbls = data.drop(['index', 'area', 'bathrooms', 'price', 'sq_price'], axis = 1)

features = features[0:20]
lbls = lbls[0:20]

print(features)
print(lbls)
n_examples = len(lbls)

# Model

# Hyper parameters

epochs = 100
learning_rate = 0.1
batch_size = 1

input_data = tf.placeholder('float', [None, 4])
labels = tf.placeholder('float', [None, 1])

weights = {
            'hl1': tf.Variable(tf.random_normal([4, 10])),
            'hl2': tf.Variable(tf.random_normal([10, 10])),
            'hl3': tf.Variable(tf.random_normal([10, 4])),
            'ol': tf.Variable(tf.random_normal([4, …

python machine-learning neural-network tensorflow tensorflow-gpu

Par*_*roi

2017 09-18

1
vote

1
answer

4797
views

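Static graphs are fast. Dynamic graphs are slow. Is there any specific benchmark demonstrating this?

I have seen some benchmarks of tensorflow and pytorch. TensorFlow may be faster, but it does not seem to be much faster, and it is sometimes even slower.

Is there any benchmark that specifically tests static graphs against dynamic graphs and shows that static graphs are much faster?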

torch tensorflow mxnet pytorch tensorflow-gpu

sta*_*low

1
vote

1
answer

545
views

Choosing a specific GPU for a session of the TensorFlow C++ API

How can I make TensorFlow use a specific GPU for inference?

Partial source code

std::unique_ptr<tensorflow::Session> session;  
Status const load_graph_status = LoadGraph(graph_path, &session);
if (!load_graph_status.ok()) {
   LOG(ERROR) << "LoadGraph ERROR!!!!"<< load_graph_status;
   return -1;
}

std::vector<Tensor> resized_tensors;
Status const read_tensor_status = ReadTensorFromImageFile(image_path, &resized_tensors);
if (!read_tensor_status.ok()) { 
    LOG(ERROR) << read_tensor_status;
    return -1;
}

std::vector<Tensor> outputs;
Status run_status = session->Run({{input_layer, resized_tensor}},
                                   output_layer, {}, &outputs);

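Everything is fine so far, but when I call Run, TensorFlow always picks the same GPU. Is there a way to specify which GPU to execute on?

If you need the complete source code, I have put it on pastebin.

Edit: It looks like options.config.mutable_gpu_options()->set_visible_device_list("0") works, but I am not sure.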
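
A process-level alternative, independent of the session options, is the CUDA_VISIBLE_DEVICES environment variable, which limits the GPUs that the CUDA runtime exposes to a process. The sketch below only builds such an environment in Python; the helper name and the binary path in the commented usage are hypothetical.

```python
import os


def env_with_visible_gpus(gpu_ids, base_env=None):
    """Return a copy of the environment that exposes only `gpu_ids` to CUDA."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


env = env_with_visible_gpus([1])
print(env["CUDA_VISIBLE_DEVICES"])  # 1

# Hypothetical usage: launch the inference binary so that it only sees GPU 1.
# import subprocess
# subprocess.run(["./my_inference_binary"], env=env, check=True)
```

Inside the launched process the remaining GPU is renumbered from 0, so a device string such as /gpu:0 then refers to the physical GPU that was selected.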

c++ tensorflow tensorflow-gpu

Ste*_*ing

2017 11-06

1
vote

1
answer

3156
views