I installed the GPU version of TensorFlow on Ubuntu 14.04.
I am on a GPU server where TensorFlow can access the available GPUs.
I want to run TensorFlow on the CPUs instead.
Normally I can use env CUDA_VISIBLE_DEVICES=0 to run on GPU no. 0.
How can I pick between the CPUs instead?
I am not interested in rewriting my code with with tf.device("/cpu:0"):
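One common approach (my suggestion, not from the original question) is to use the same environment variable to hide every GPU: when no device is visible to the CUDA runtime, TensorFlow falls back to the CPU without any code changes. A minimal sketch:

```python
import os

# Setting CUDA_VISIBLE_DEVICES to an empty string before TensorFlow
# is imported hides all GPUs from the CUDA runtime, so every op is
# placed on the CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# import tensorflow as tf  # TensorFlow would now see no GPUs

print(repr(os.environ["CUDA_VISIBLE_DEVICES"]))  # -> ''
```

Equivalently, from the shell: env CUDA_VISIBLE_DEVICES="" python3 tensorflow_test.py.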
How do I interpret the TensorFlow output for building and executing computation graphs on GPGPUs?
Given the following command, which executes an arbitrary TensorFlow script using the Python API:
python3 tensorflow_test.py > out
The first part, stream_executor, seems to be loading its dependencies:
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
What is a NUMA node?
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
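A NUMA (Non-Uniform Memory Access) node is a group of CPUs and memory banks that are "local" to each other on a multi-socket machine. Linux reports a device's node in sysfs and returns -1 when the platform exposes no NUMA information; the log line above shows TensorFlow mapping that -1 to node 0. A toy sketch of that interpretation (hypothetical helper name, not TensorFlow code):

```python
def numa_node_from_sysfs(raw_value: int) -> int:
    """Mimic the behaviour described in the log line: a negative
    sysfs value means 'no NUMA information', which TensorFlow
    treats as NUMA node zero."""
    return 0 if raw_value < 0 else raw_value

print(numa_node_from_sysfs(-1))  # -> 0, as in the log message
```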
I assume this is when it finds the available GPUs:
I …

Ubuntu 14.04, CUDA version 7.5.18, nightly build of TensorFlow.
When running a tf.nn.max_pool() operation in TensorFlow, I get the following error:
E tensorflow/stream_executor/cuda/cuda_dnn.cc:286] Loaded cudnn library: 5005 but source was compiled against 4007. If using a binary install, upgrade your cudnn library to match. If building from sources, make sure the library loaded matches the version you specified during compile configuration.
W tensorflow/stream_executor/stream.cc:577] Attempting to perform DNN operation using StreamExecutor without DNN support
Traceback (most recent call last):
...
How do I specify my cudnn version in TensorFlow's compile configuration?
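For reference, cuDNN encodes its version as CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL, so the two numbers in the error decode as follows (a small illustrative helper, not part of TensorFlow):

```python
def decode_cudnn_version(v: int) -> str:
    # CUDNN_VERSION = CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL
    major, rest = divmod(v, 1000)
    minor, patch = divmod(rest, 100)
    return f"{major}.{minor}.{patch}"

print(decode_cudnn_version(5005))  # -> 5.0.5 (the library that was loaded)
print(decode_cudnn_version(4007))  # -> 4.0.7 (what the binary was built against)
```

So the installed library is cuDNN 5 while the binary was built against cuDNN 4; either install the matching cuDNN 4, or rebuild from source and point ./configure at the installed cuDNN path.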
How do I reuse variables in TensorFlow? I want to reuse tf.contrib.layers.linear:
import tensorflow as tf

with tf.variable_scope("root") as varscope:
    inputs_1 = tf.constant(0.5, shape=[2, 3, 4])
    inputs_2 = tf.constant(0.5, shape=[2, 3, 4])
    outputs_1 = tf.contrib.layers.linear(inputs_1, 5)
    varscope.reuse_variables()
    outputs_2 = tf.contrib.layers.linear(inputs_2, 5)
But it gives me the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-51-a40b9ec68e25> in <module>()
5 outputs_1 = tf.contrib.layers.linear(inputs_1, 5)
6 varscope.reuse_variables()
----> 7 outputs_2 = tf.contrib.layers.linear(inputs_2, 5)
...
ValueError: Variable root/fully_connected_1/weights does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?
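The error occurs because tf.contrib.layers.linear auto-generates a fresh scope name for each call (fully_connected, fully_connected_1, ...), so the second call, made in reuse mode, looks up a variable that was never created. A toy model of that bookkeeping (hypothetical class, not the TensorFlow API) makes the failure mode visible:

```python
class ToyScope:
    """Minimal sketch of variable-scope reuse semantics."""
    def __init__(self):
        self._vars = {}
        self._reuse = False

    def reuse_variables(self):
        self._reuse = True

    def get_variable(self, name):
        if self._reuse:
            # In reuse mode, only existing variables may be fetched.
            if name not in self._vars:
                raise ValueError(f"Variable {name} does not exist")
            return self._vars[name]
        self._vars[name] = object()
        return self._vars[name]

scope = ToyScope()
scope.get_variable("fully_connected/weights")   # first linear() call
scope.reuse_variables()
try:
    # The second linear() call auto-numbers its scope to fully_connected_1,
    # a name that was never created, hence the ValueError.
    scope.get_variable("fully_connected_1/weights")
except ValueError as e:
    print(e)  # -> Variable fully_connected_1/weights does not exist
```

Passing the same explicit scope argument to both calls (e.g. tf.contrib.layers.linear(inputs, 5, scope="shared")) makes the second call resolve, and therefore reuse, the same variable names.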
Following the scrapy tutorial, I made a simple image crawler (to scrape images of Bugattis). This is illustrated in the example below.
However, following the guide left me with a crawler that does not work! It finds all the URLs, but does not download the images.
I found a duct-tape solution: replacing the ITEM_PIPELINES entry and IMAGES_STORE;
ITEM_PIPELINES['scrapy.pipeline.images.FilesPipeline'] = 1 and
IMAGES_STORE -> FILES_STORE
But I do not know why this works. I want to use the ImagesPipeline as documented by scrapy.
Example
settings.py
BOT_NAME = 'imagespider'
SPIDER_MODULES = ['imagespider.spiders']
NEWSPIDER_MODULE = 'imagespider.spiders'
ITEM_PIPELINES = {
    'scrapy.pipelines.images.ImagesPipeline': 1,
}
IMAGES_STORE = "/home/user/Desktop/imagespider/output"
items.py
import scrapy
class ImageItem(scrapy.Item):
    file_urls = scrapy.Field()
    files = scrapy.Field()
imagespider.py
from imagespider.items import ImageItem
import scrapy
class ImageSpider(scrapy.Spider):
    name = "imagespider"
    start_urls = (
        "https://www.find.com/search=bugatti+veyron",
    )

    def parse(self, response):
        for elem in response.xpath("//img"):
            img_url = elem.xpath("@src").extract_first()
            yield ImageItem(file_urls=[img_url])
I am new to C++/CUDA. I tried to implement the parallel algorithm "reduce" by recursing on the kernel output (in the kernel wrapper), so that it can handle any input size and any thread count without increasing the asymptotic parallel runtime.
For example, Implementing Max Reduce in Cuda is the top answer to this question, but his/her implementation is essentially sequential when the thread size is small enough.
However, when I compile and run it, I keep getting a "segmentation fault"?
>> nvcc -o mycode mycode.cu
>> ./mycode
Segmentation fault.
Compiled on a K40 with CUDA 6.5.
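The shape of such a wrapper, recursing on partial sums until one value remains, can be sketched as launch-size bookkeeping (an illustration of the approach, not the poster's actual code):

```python
def reduction_launches(size: int, threads_per_block: int):
    """Block count for each kernel launch when every launch reduces
    threads_per_block elements down to one partial sum."""
    launches = []
    while size > 1:
        # ceiling division: enough blocks to cover all elements
        blocks = (size + threads_per_block - 1) // threads_per_block
        launches.append(blocks)
        size = blocks  # the next launch reduces the partial sums
    return launches

print(reduction_launches(1000, 256))  # -> [4, 1]
```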
Here is the kernel, basically the same as in the SO post I linked, except for the "out of bounds" checker:
#include <stdio.h>

/* -------- KERNEL -------- */
__global__ void reduce_kernel(float * d_out, float * d_in, const int size)
{
    // position and threadId
    int pos = blockIdx.x * blockDim.x + threadIdx.x;
    int tid = threadIdx.x;
    // do reduction in global memory
    for (unsigned …

I am new to CUDA and I have been working on a "reduce algorithm".
The algorithm works for any array size smaller than 1 << 24.
When I use an array of size 1 << 25, the program returns 0 for the "sum", which is wrong. The sum should be 2^25.
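One plausible culprit for a failure that appears exactly between 1 << 24 and 1 << 25 (my guess, not confirmed by the post): the x-dimension of a 1-D grid is capped at 65535 blocks on devices of compute capability below 3.0, and also when the code is compiled for such devices (the default -arch of older nvcc). With 512 threads per block, 1 << 25 elements needs 65536 blocks. A quick check:

```python
def blocks_needed(n: int, threads_per_block: int) -> int:
    # ceiling division: enough blocks to cover all n elements
    return (n + threads_per_block - 1) // threads_per_block

MAX_GRID_X = 65535  # 1-D grid limit for compute capability < 3.0

print(blocks_needed(1 << 24, 512) <= MAX_GRID_X)  # -> True  (32768 blocks, works)
print(blocks_needed(1 << 25, 512) <= MAX_GRID_X)  # -> False (65536 blocks)
```

An out-of-range launch configuration would be consistent with the "invalid argument" on cudaLaunch in the cuda-memcheck output below; using a 2-D grid or compiling with the correct -arch for the actual device removes the 65535 cap.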
EDIT: cuda-memcheck compiled_code
========= CUDA-MEMCHECK
@@STARTING@@
========= Program hit cudaErrorInvalidValue (error 11) due to "invalid argument" on CUDA API call to cudaLaunch.
========= Saved host backtrace up to driver entry point at error
========= Host Frame:/usr/lib64/libcuda.so.1 [0x2f2d83]
========= Host Frame:test [0x3b37e]
========= Host Frame:test [0x2b71]
========= Host Frame:test [0x2a18]
========= Host Frame:test [0x2a4c]
========= Host Frame:test [0x2600]
========= Host Frame:test [0x2904]
========= Host Frame:/lib64/libc.so.6 (__libc_start_main + 0xfd) [0x1ed5d]
========= Host Frame:test …

I need to pass a function (without calling it) to another function, but I need to specify a different value for one of its default arguments.
For example:
def func_a(input, default_arg=True):
    pass

def func_b(function):
    pass

func_b(func_a(default_arg=False))
However, this calls func_a() and passes the result to func_b().
How can I set default_arg=False without executing func_a?
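functools.partial from the standard library does exactly this: it wraps func_a with default_arg pre-bound, without calling it. A sketch (illustrative bodies added to func_a/func_b so the result is visible):

```python
from functools import partial

def func_a(input, default_arg=True):
    return (input, default_arg)

def func_b(function):
    # func_b receives a callable and decides when to invoke it
    return function("data")

result = func_b(partial(func_a, default_arg=False))
print(result)  # -> ('data', False)
```

A lambda works the same way: func_b(lambda x: func_a(x, default_arg=False)).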