Posts by McA*_*gus

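Adding a GPU op in TensorFlow

I am trying to add a new op to TensorFlow, loosely following this document. The difference is that I am trying to implement a GPU-based op. The op I want to add is the CUDA op from here (cuda_op.py, cuda_op_kernel.cc, cuda_op_kernel.cu.cc). I am trying to compile these files outside of TensorFlow and load them with tf.load_op_library. I have made some changes, so here are my files: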

cuda_op_kernel.cc

#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/shape_inference.h"
#include "tensorflow/core/framework/op_kernel.h"

using namespace tensorflow;  // NOLINT(build/namespaces)

REGISTER_OP("AddOne")
    .Input("input: int32")
    .Output("output: int32")
    .SetShapeFn([](::tensorflow::shape_inference::InferenceContext* c) {
      c->set_output(0, c->input(0));
      return Status::OK();
    });

void AddOneKernelLauncher(const int* in, const int N, int* out);

class AddOneOp : public OpKernel {
 public:
  explicit AddOneOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();

    // Create an output tensor
    Tensor* output_tensor …


c++ python cuda tensorflow tensorflow-gpu

McA*_*gus

2017 06-08

5
votes

1
answer

2268
views

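How do I store my own class objects in HDF5?

I created a class to hold the experimental results of my research (I am a PhD student in electrical engineering), for example: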

class Trial:
    def __init__(self, subID, triID):
        self.filePath = '' # file path of the folder
        self.subID = -1    # int
        self.triID = -1    # int
        self.data_A = -1   # numpy array
        self.data_B = -1   # numpy array
        ......


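It is a mix of many bools, ints, and numpy arrays. You get the idea. I have read that data loads faster when it is stored in HDF5 format. Can I do that with my data (a Python list of my Trial objects)?

Note that there is a similar question on Stack Overflow, but it has only one answer, and that answer does not address the question. Instead, it breaks the OP's custom class down into basic data types and stores them in separate datasets. I am not against doing that, but I would like to know whether it is the only way, because it goes against the object-oriented philosophy.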

python numpy hdf5 h5py

Chr*_*ris

2020 03-28

5
votes

2
answers

9125
views

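Race condition in C++ when writing the same value?

Is there any problem with having a race condition in your code when the operation writes a single constant value? For example, suppose a parallel loop fills an array seen for every value in another array arr (assuming there are no out-of-bounds indexing issues). The critical section might be the following code: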

//parallel body with index i
int val = arr[i];
seen[val] = true;


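Since the only value ever written is true, is a mutex unnecessary, and possibly harmful to performance? Even if the threads stomp on each other, they would fill the address with the same value, right?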
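
For reference, under the C++ memory model, unsynchronized concurrent writes to the same non-atomic object are a data race and therefore undefined behavior, even when every thread writes the same value. The sketch below shows one way to avoid both the race and a mutex. It is an illustration rather than the asker's code: the name `mark_seen`, the chunking scheme, and the choice of `std::vector<std::atomic<bool>>` for `seen` are all assumptions made for this example.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper: sets seen[v] = true for every v in arr,
// splitting arr into contiguous chunks, one per thread.
void mark_seen(const std::vector<int>& arr,
               std::vector<std::atomic<bool>>& seen,
               unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;
    const std::size_t chunk = (arr.size() + num_threads - 1) / num_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(arr.size(), t * chunk);
        const std::size_t end = std::min(arr.size(), begin + chunk);
        workers.emplace_back([&arr, &seen, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                // Mirrors the question's assumption of in-bounds indices.
                assert(static_cast<std::size_t>(arr[i]) < seen.size());
                // Atomic store: several threads may hit the same slot
                // without a data race. Relaxed ordering suffices here
                // because join() below synchronizes with the reader.
                seen[arr[i]].store(true, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
}
```

A caller would size `seen` up front, set every element to false, call `mark_seen(arr, seen, n)`, and read the flags after it returns. On common hardware a relaxed atomic store to a `bool` usually compiles to a plain store, so the cost compared with the racy version is likely small, though that is worth measuring rather than assuming.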

c++ parallel-processing thread-safety race-condition

McA*_*gus

2018 09-18

2
votes

1
answer

274
views

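Does the = operator call the constructor / new in C++?

Suppose I have an (immutable) matrix class that dynamically creates an array in its constructor and deletes it in its destructor.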

template <typename T>
class matrix {
private:
    T* data;
public:
    size_t const rows, cols;
    matrix(size_t rows, size_t cols) : rows(rows), cols(cols) {
        data = new T[rows*cols];
    }
    ~matrix() {
        delete [] data;
    }
    //access data
    T& operator()(size_t row, size_t col) {
        return data[row*cols + col];
    }
    matrix<T>& operator=(const matrix<T>& other) {
        //what will this->data contain? do I need to delete anything here?
        //should I call the constructor?
        rows = other.rows;
        cols = other.cols;
        data = new T[rows*cols];
        std::copy(&data[0],&data[0] + …
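
As a point of comparison for the questions in the comments above: copy assignment does not run a constructor on `*this`, so `this->data` still points at the old array when `operator=` is entered, and that array has to be released somewhere. The sketch below uses the copy-and-swap idiom to handle this. It is not the asker's class: the name `Matrix`, the trailing-underscore members, and the accessor functions are hypothetical, and the dimensions are made non-const here because `const` data members cannot be reassigned inside `operator=`.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

template <typename T>
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(new T[rows * cols]()) {}

    // Copy constructor: allocates a fresh buffer and copies the elements.
    Matrix(const Matrix& other)
        : rows_(other.rows_), cols_(other.cols_),
          data_(new T[other.rows_ * other.cols_]) {
        std::copy(other.data_, other.data_ + rows_ * cols_, data_);
    }

    // Copy assignment via copy-and-swap: the by-value parameter is the
    // copy (made by the copy constructor above); swapping hands our old
    // buffer to `other`, whose destructor frees it on return.
    Matrix& operator=(Matrix other) noexcept {
        swap(other);
        return *this;
    }

    ~Matrix() { delete[] data_; }

    void swap(Matrix& other) noexcept {
        std::swap(rows_, other.rows_);
        std::swap(cols_, other.cols_);
        std::swap(data_, other.data_);
    }

    T& operator()(std::size_t row, std::size_t col) {
        return data_[row * cols_ + col];
    }
    const T& operator()(std::size_t row, std::size_t col) const {
        return data_[row * cols_ + col];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_, cols_;
    T* data_;
};
```

With this shape, `b = a;` leaves `b` holding its own deep copy of `a`, and the buffer `b` owned before the assignment is freed exactly once. Holding the elements in a `std::vector<T>` instead of a raw pointer would likely remove the need to write any of these special members by hand.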


c++ arrays oop assignment-operator assign

McA*_*gus

2017 03-09

0
votes

1
answer

90
views