Efficiency of CUDA vector types (float2, float3, float4)

I'm trying to understand the integrate_functor in particles_kernel.cu from the CUDA examples:

struct integrate_functor
{
    float deltaTime;    
    //constructor for functor
    //...

    template <typename Tuple>
    __device__
    void operator()(Tuple t)
    {
        volatile float4 posData = thrust::get<2>(t);
        volatile float4 velData = thrust::get<3>(t);

        float3 pos = make_float3(posData.x, posData.y, posData.z);
        float3 vel = make_float3(velData.x, velData.y, velData.z);

        // update position and velocity
        // ...

        // store new position and velocity
        thrust::get<0>(t) = make_float4(pos, posData.w);
        thrust::get<1>(t) = make_float4(vel, velData.w);
    }
};

We call make_float4(pos, age), but make_float4 is defined in vector_functions.h as:

static __inline__ __host__ __device__ float4 …
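For illustration only: a call such as make_float4(pos, posData.w) can compile only if an overload taking a float3 plus a float is visible, in addition to the four-float version. As far as I know, the CUDA samples ship such overloads in a helper header (helper_math.h in recent releases), but treat the exact file as an assumption. The sketch below uses stand-in structs so that it builds as plain C++ without the CUDA toolkit; under nvcc the real float3, float4, and four-float make_float4 would come from the CUDA headers, and the functions would carry __host__ __device__ qualifiers.

```cpp
// Stand-in types (assumption: they mimic the layout of the CUDA built-ins).
// The real float4 is 16-byte aligned; float3 is three packed floats.
struct float3 { float x, y, z; };
struct alignas(16) float4 { float x, y, z, w; };

// Same shape as the four-component constructor function quoted in the question.
inline float4 make_float4(float x, float y, float z, float w)
{
    float4 v;
    v.x = x; v.y = y; v.z = z; v.w = w;
    return v;
}

// Additional overload: build a float4 from a float3 and a separate w component.
// Ordinary C++ overload resolution picks this one for make_float4(pos, w).
inline float4 make_float4(float3 a, float w)
{
    return make_float4(a.x, a.y, a.z, w);
}

// The alignment gives the stand-in float4 the same 16-byte size as four packed floats.
static_assert(sizeof(float4) == 16, "float4 stand-in should occupy 16 bytes");
```

With both overloads in scope, make_float4(pos, posData.w) resolves to the second function, which simply forwards to the first. Nothing special is required from the compiler beyond standard overloading.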

c cuda thrust

14 votes · 1 answer · 20k views
