为什么张量流中的`tf.nn.nce_loss`无法在GPU上运行?

Hui*_* OU 6 python gpu tensorflow

代码tensorflow/examples/tutorials/word2vec/word2vec_basic.py有一个评论# Ops and variables pinned to the CPU because of missing GPU implementation.我还发现tf.nn.nce_lossGPU无法实现该操作.那么为什么tf.nn.nce_loss无法在GPU上运行?

我曾经log_device_placement看过MUST BE CPUOps.结果如下:

nce_loss/LogUniformCandidateSampler: (LogUniformCandidateSampler): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/LogUniformCandidateSampler: (LogUniformCandidateSampler)/job:localhost/replica:0/task:0/cpu:0
nce_loss/ComputeAccidentalHits: (ComputeAccidentalHits): /job:localhost/replica:0/task:0cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/ComputeAccidentalHits: (ComputeAccidentalHits)/job:localhost/replica:0/task:0/cpu:0
nce_loss/SparseToDense: (SparseToDense): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/SparseToDense: (SparseToDense)/job:localhost/replica:0/task:0/cpu:0
nce_loss/concat: (ConcatV2): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/concat: (ConcatV2)/job:localhost/replica:0/task:0/cpu:0
nce_loss/LogUniformCandidateSampler: (LogUniformCandidateSampler): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/LogUniformCandidateSampler: (LogUniformCandidateSampler)/job:localhost/replica:0/task:0/cpu:0
nce_loss/ComputeAccidentalHits: (ComputeAccidentalHits): /job:localhost/replica:0/task:0cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/ComputeAccidentalHits: (ComputeAccidentalHits)/job:localhost/replica:0/task:0/cpu:0
nce_loss/SparseToDense: (SparseToDense): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/SparseToDense: (SparseToDense)/job:localhost/replica:0/task:0/cpu:0
nce_loss/concat: (ConcatV2): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/concat: (ConcatV2)/job:localhost/replica:0/task:0/cpu:0
nce_loss/LogUniformCandidateSampler: (LogUniformCandidateSampler): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/LogUniformCandidateSampler: (LogUniformCandidateSampler)/job:localhost/replica:0/task:0/cpu:0
nce_loss/ComputeAccidentalHits: (ComputeAccidentalHits): /job:localhost/replica:0/task:0cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/ComputeAccidentalHits: (ComputeAccidentalHits)/job:localhost/replica:0/task:0/cpu:0
nce_loss/SparseToDense: (SparseToDense): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/SparseToDense: (SparseToDense)/job:localhost/replica:0/task:0/cpu:0
nce_loss/concat: (ConcatV2): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/concat: (ConcatV2)/job:localhost/replica:0/task:0/cpu:0
nce_loss/LogUniformCandidateSampler: (LogUniformCandidateSampler): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/LogUniformCandidateSampler: (LogUniformCandidateSampler)/job:localhost/replica:0/task:0/cpu:0
nce_loss/ComputeAccidentalHits: (ComputeAccidentalHits): /job:localhost/replica:0/task:0cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/ComputeAccidentalHits: (ComputeAccidentalHits)/job:localhost/replica:0/task:0/cpu:0
nce_loss/SparseToDense: (SparseToDense): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/SparseToDense: (SparseToDense)/job:localhost/replica:0/task:0/cpu:0
nce_loss/concat: (ConcatV2): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/concat: (ConcatV2)/job:localhost/replica:0/task:0/cpu:0
nce_loss/LogUniformCandidateSampler: (LogUniformCandidateSampler): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/LogUniformCandidateSampler: (LogUniformCandidateSampler)/job:localhost/replica:0/task:0/cpu:0
nce_loss/ComputeAccidentalHits: (ComputeAccidentalHits): /job:localhost/replica:0/task:0cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/ComputeAccidentalHits: (ComputeAccidentalHits)/job:localhost/replica:0/task:0/cpu:0
nce_loss/SparseToDense: (SparseToDense): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/SparseToDense: (SparseToDense)/job:localhost/replica:0/task:0/cpu:0
nce_loss/concat: (ConcatV2): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:841] nce_loss/concat: (ConcatV2)/job:localhost/replica:0/task:0/cpu:0
Run Code Online (Sandbox Code Playgroud)

但是,我仍然不知道哪个特殊的Ops会导致这个问题.是否有任何特殊的Ops无法通过GPU实现?或者这是由于tensorflow本身缺乏实现?

小智 0

您也可以使用 GPU 加速器执行此示例。
请参阅tensorflow支持的设备https://www.tensorflow.org/guide/using_gpu#supported_devices
您必须添加以下代码行

with tf.device('/gpu:0'):
Run Code Online (Sandbox Code Playgroud)

并删除#with tf.device('/cpu:0'):

基本上在这一行用gpu替换cpu: https://github.com/tensorflow/tensorflow/blob/bd965956535b1026a0769d04c3f6419d020f5f9c/tensorflow/examples/tutorials/word2vec/word2vec_basic.py#L189

希望这可以帮助。