Posts by rxu*_*rxu

Splitting a long dependency list in a Makefile across several lines

target: TargetA ../DirB/FileB.cpp ../DirC/FileC.o ../DirD/FileD.o ...

This is one long line in a makefile. Is it possible to split it across several lines?

makefile

rxu*_*rxu

lucky-day

5
votes

2
answers

6303
views

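Finding the next aligned memory address

Based on my understanding of Wikipedia, I can find the index of the next closest element with the correct alignment using the following bitwise operation.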

Assuming the address of the 1st element has the correct alignment.
Assuming the index_alignment is a power of 2.
new_index = (current_index + index_alignment - 1) & ~(index_alignment - 1).
new_address = address_of_1st_element + new_index
index_alignment is 16 bytes/sizeof(type of element) for SSE.

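Can this be applied directly to an address to find the next closest aligned address from any given address? (Would that be faster?)

To do this quickly, I am considering the following.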

new_address = (current_address + alignment - 1) & ~(alignment -1)
alignment here is 16 for SSE.
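
As a minimal sketch of the formula above applied directly to an address, assuming the alignment is a nonzero power of two: the helper names `align_up` and `align_up_ptr` are hypothetical, and the pointer is converted through `std::uintptr_t` because the bitwise operators are not defined for pointer types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round `value` up to the next multiple of `alignment`.
// Assumption: `alignment` is a nonzero power of two.
inline std::uintptr_t align_up(std::uintptr_t value, std::uintptr_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Apply the same rounding to a pointer by going through an integer type.
inline void* align_up_ptr(void* p, std::size_t alignment) {
    return reinterpret_cast<void*>(
        align_up(reinterpret_cast<std::uintptr_t>(p), alignment));
}
```

With an alignment of 16, `align_up(17, 16)` gives 32, while an already aligned value such as 16 is returned unchanged.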

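When I implemented this, I found that the following code did not compile...

The code has since been fixed following the suggestions from Salva and Rotem.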

#include <iostream>
#include <stdint.h>
#define ALIGNMENT 16
using namespace std; …

c++ bit-manipulation memory-alignment

rxu*_*rxu

2016 07-15

5
votes

1
answers

3520
views

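What is the advantage of && in this code?

In the following code, what is the advantage of using &&? The code comes from an answer to "Specialize same operator for different traits".

From this question, I gather that an && parameter means it is a reference that can be modified by the function.

The decay_t probably prevents the compiler from interpreting a reference to a variable as an array, as in "What is std::decay and when it should be used?"

std::forward is perfect forwarding, as described here. Why do we need this forwarding?

Thanks.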

#include <iostream>
#include <type_traits>
#include <utility>

class A;

template <typename T>
struct is_A : std::false_type {};
template <> struct is_A<A> : std::true_type {};

template <typename T>
struct is_int : std::false_type {};
template <> struct is_int<int> : std::true_type {};
template <> struct is_int<long> : std::true_type {};

class A{
public:
    int val;

    void print(void){
        std::cout << val << std::endl;
    }

    template <typename T1>
    std::enable_if_t<is_int<std::decay_t<T1>>::value, …
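
To illustrate what `T&&` plus `std::forward` buys, here is a small sketch that is separate from the code above. The names `category`, `relay`, `relay_without_forward`, and `demo_forwarding` are hypothetical. It shows that a forwarding reference accepts both lvalues and rvalues, and that `std::forward` keeps the value category the argument had at the call site.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Overloads that report whether they received an lvalue or an rvalue.
inline std::string category(int&)  { return "lvalue"; }
inline std::string category(int&&) { return "rvalue"; }

// T&& on a deduced template parameter is a forwarding reference: it binds to
// both lvalues and rvalues. std::forward passes the argument on with the
// value category it had at the call site.
template <typename T>
std::string relay(T&& x) {
    return category(std::forward<T>(x));
}

// Without std::forward, the named parameter x is always an lvalue.
template <typename T>
std::string relay_without_forward(T&& x) {
    return category(x);
}

// Usage: each assert documents the expected result.
inline void demo_forwarding() {
    int i = 1;
    assert(relay(i) == "lvalue");
    assert(relay(2) == "rvalue");
    assert(relay_without_forward(2) == "lvalue");
}
```

The last assertion is the case forwarding is meant to avoid: the temporary arrives as an rvalue but is handed on as an lvalue, so an overload that could move from it is never selected.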

c++ universal-reference c++14

rxu*_*rxu

2017 05-23

5
votes

1
answers

313
views

How does numpy implement multidimensional broadcasting?

Memory (row-major order):

[[A(0,0), A(0,1)]
 [A(1,0), A(1,1)]]

has this memory layout: 
[A(0,0), A(0,1), A(1,0), A(1,1)]

I think the algorithm works as follows in the cases below.

The broadcast dimension is the last dimension:

[[0, 1, 2, 3]         [[1]
                  x
 [4, 5, 6, 7]]         [10]]

   A (2 by 4)            B (2 by 1)

Iterate 0th dimensions of A and B simultaneously {
    Iterate last dimension of A{
        multiply;
    } 
}

The broadcast dimension is the 0th dimension:

[[0, 1, 2, 3]   
                  x    [[1,10,100,1000]]
 [4, 5, 6, 7]]

   A (2 by 4)              B (1 by 4)

Iterate 0th dimension of A{
    Iterate 1st dimensions of A …
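
Both cases above can be handled by one loop nest if the broadcast dimension is given a stride of 0, so the same element is reused for every index along it. The following is a sketch under that assumption, not NumPy's actual source. `View2D`, `broadcast_multiply`, and `demo_broadcast` are hypothetical names, and strides are counted in elements here for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A 2-D view over row-major storage. Strides are counted in elements here
// (an assumption for brevity). A stride of 0 along a dimension means the
// same element is reused for every index in that dimension, i.e. the
// dimension is broadcast.
struct View2D {
    const int* data;
    std::size_t stride0;  // step between rows
    std::size_t stride1;  // step between columns
    int at(std::size_t i, std::size_t j) const {
        return data[i * stride0 + j * stride1];
    }
};

// Element-wise product of two views with an output shape of rows x cols.
inline std::vector<int> broadcast_multiply(const View2D& a, const View2D& b,
                                           std::size_t rows, std::size_t cols) {
    std::vector<int> out;
    out.reserve(rows * cols);
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            out.push_back(a.at(i, j) * b.at(i, j));
    return out;
}

inline void demo_broadcast() {
    const int a[] = {0, 1, 2, 3, 4, 5, 6, 7};  // A, 2 by 4
    const int col[] = {1, 10};                 // B, 2 by 1
    const int row[] = {1, 10, 100, 1000};      // B, 1 by 4
    const View2D A{a, 4, 1};
    // Last dimension broadcast: stride1 = 0.
    assert((broadcast_multiply(A, View2D{col, 1, 0}, 2, 4) ==
            std::vector<int>{0, 1, 2, 3, 40, 50, 60, 70}));
    // 0th dimension broadcast: stride0 = 0.
    assert((broadcast_multiply(A, View2D{row, 0, 1}, 2, 4) ==
            std::vector<int>{0, 10, 200, 3000, 4, 50, 600, 7000}));
}
```

The design choice is that the loop body never needs to know which dimension is broadcast; only the view's strides change between the two cases.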

c python numpy

rxu*_*rxu

2016 09-23

5
votes

1
answers

719
views

Does numpy store the size of an array internally?

From the specification of the numpy array here:

typedef struct PyArrayObject {
    PyObject_HEAD
    char *data;
    int nd;
    npy_intp *dimensions;
    npy_intp *strides;
    PyObject *base;
    PyArray_Descr *descr;
    int flags;
    PyObject *weakreflist;
} PyArrayObject;

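Looking at the specification of the numpy array, I do not see a field that stores the number of elements in the array. Is that true?

What is the advantage of not storing it?

Thanks.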
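
For context, the element count can be derived from the `nd` and `dimensions` fields shown above, since it is the product of all dimensions. A minimal sketch, where `intp` is a stand-in for `npy_intp` and `element_count` and `demo_element_count` are hypothetical helper names:

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for npy_intp (an assumption; the real type comes from the NumPy headers).
using intp = std::ptrdiff_t;

// Number of elements implied by a shape: the product of all dimensions.
// A 0-dimensional array (nd == 0) yields 1, the empty product.
inline intp element_count(int nd, const intp* dimensions) {
    intp count = 1;
    for (int i = 0; i < nd; ++i) count *= dimensions[i];
    return count;
}

inline void demo_element_count() {
    const intp shape[] = {2, 4};
    assert(element_count(2, shape) == 8);
    assert(element_count(0, nullptr) == 1);
}
```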

c python numpy

rxu*_*rxu

lucky-day

3
votes

1
answers

122
views

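Converting a bit array to a set faster

The input is a bit array stored in contiguous memory, with one bit of the bit array per bit of memory.

The output is an array of the indices of the set bits of the bit array.

Example: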

bitarray: 0000 1111 0101 1010
setA: {4,5,6,7,9,11,12,14}
setB: {2,4,5,7,9,10,11,12}

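Getting either set A or set B is fine. The set is stored as a uint32_t array, so each element of the set is an unsigned 32-bit integer in the array.

How can this be made about 5 times faster on a single CPU core?

Current code: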

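#include <cstdint>
#include <iostream>
#include <vector>
#include <time.h>

using namespace std;

// Declared before use so the call inside the template below resolves.
inline void find_set_bit(uint32_t n, uint32_t*& ptr_set, uint32_t base);

// Writes the indices of the set bits of v to ptr_set and returns how many were written.
template <typename T>
uint32_t bitarray2set(T& v, uint32_t * ptr_set){
    uint32_t i;
    uint32_t base = 0;
    uint32_t * ptr_set_new = ptr_set;
    uint32_t size = v.size();  // size(), not capacity(): scan only the words that hold data
    for(i = 0; i < size; i++){
        find_set_bit(v[i], ptr_set_new, base);
        base += 8*sizeof(uint32_t);
    }
    return (ptr_set_new - ptr_set);
}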

inline void find_set_bit(uint32_t n, uint32_t*& ptr_set, uint32_t base){
    // Find the set bits …
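
As a plain reference for the input and output described above (a correctness baseline, not the faster version being asked for), the following sketch reproduces setA from the example when the 16 example bits occupy the top half of a 32-bit word. The names `set_bit_indices` and `demo_set_bit_indices` are hypothetical, and numbering bits from the most significant bit is an assumption chosen to match setA.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Collect the indices of set bits, numbering bits from the most significant
// bit of each 32-bit word (the convention that reproduces setA; this choice
// of bit order is an assumption).
inline std::vector<std::uint32_t> set_bit_indices(
        const std::vector<std::uint32_t>& words) {
    std::vector<std::uint32_t> indices;
    std::uint32_t base = 0;
    for (std::uint32_t w : words) {
        for (std::uint32_t bit = 0; bit < 32; ++bit) {
            if (w & (UINT32_C(0x80000000) >> bit)) indices.push_back(base + bit);
        }
        base += 32;
    }
    return indices;
}

inline void demo_set_bit_indices() {
    // 0000 1111 0101 1010 followed by sixteen zero bits.
    assert((set_bit_indices({UINT32_C(0x0F5A0000)}) ==
            std::vector<std::uint32_t>{4, 5, 6, 7, 9, 11, 12, 14}));
}
```

A baseline like this is mainly useful for checking that an optimized version still returns the same indices.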

c++ sse bit-manipulation set bitarray

rxu*_*rxu

2017 05-23

2
votes

1
answers

196
views

Efficiently getting numpy.argmin and numpy.amin at the same time

Is it possible to get the results of both numpy.argmin and numpy.amin with a single call to numpy? Thanks.

python performance numpy

rxu*_*rxu

2016 05-20

1
votes

1
answers

974
views

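Can a range-based for loop be used for assignment?

Is it possible to use or implement a range-based for loop that assigns numbers to an array?

What I want is: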

for (auto i : X){
    i = 1;
} //I want this to fill the array with 1.
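
The loop above assigns to a copy of each element, because `auto i` declares `i` by value. A minimal sketch of the by-reference form, where `fill_with_one` and `demo_fill` are hypothetical names:

```cpp
#include <array>
#include <cassert>

// `auto i` copies each element, so assigning to i leaves the container
// unchanged. `auto& i` binds a reference to the element itself.
template <typename Container>
void fill_with_one(Container& x) {
    for (auto& i : x) {
        i = 1;
    }
}

inline void demo_fill() {
    std::array<int, 4> x{};  // starts as {0, 0, 0, 0}
    fill_with_one(x);        // modifies the elements in place
    assert(x[0] == 1 && x[1] == 1 && x[2] == 1 && x[3] == 1);
}
```

The same function also works on a built-in array such as `int y[3]`, since range-based for loops accept both.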

c++ for-loop range

rxu*_*rxu

2016 09-20

-4
votes

1
answers

461
views

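C++: avoiding writing two similar functions

I want to produce two versions of a function by including or leaving out a few lines of code in the middle and near the end of the function, using templates or some other method. How can I do that?

These functions are performance critical. They run billions of times.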
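
One common approach is a `bool` template parameter that switches the optional lines on or off at compile time. The sketch below assumes that approach; `process`, `WithExtra`, `demo_process`, and the arithmetic inside are hypothetical stand-ins for the real function.

```cpp
#include <cassert>

// One function body, two compiled versions. `WithExtra` is a compile-time
// constant, so the compiler can drop the untaken branch in each
// instantiation. The arithmetic is a hypothetical stand-in for the real work.
template <bool WithExtra>
int process(int x) {
    int result = x * 2;   // shared code before the optional part
    if (WithExtra) {
        result += 100;    // lines present in only one version
    }
    result -= 1;          // shared code after the optional part
    return result;
}

inline void demo_process() {
    assert(process<false>(5) == 9);
    assert(process<true>(5) == 109);
}
```

In C++17 and later, writing `if constexpr (WithExtra)` instead makes the discard of the untaken branch part of the language rules rather than an optimization.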

c++ templates

rxu*_*rxu

2016 07-09

-7
votes

1
answers

104
views

Tag statistics

c++ ×5

numpy ×3

python ×3

bit-manipulation ×2

c ×2

bitarray ×1

c++14 ×1

for-loop ×1

makefile ×1

memory-alignment ×1

performance ×1

range ×1

set ×1

sse ×1

templates ×1

universal-reference ×1
