
Need thoughts on profiling multithreading in C on Linux

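My application scenario is this: I want to evaluate the performance gain achievable on a quad-core machine when processing the same amount of data. I have the following two configurations: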

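i) 1-Process: a program without any threads that processes data ranging from 1M .. 1G, with the system assumed to use only one of its 4 cores.

ii) 4-threads-Process: a program with 4 threads (all threads perform the same operation), each processing 25% of the input data.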

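In the program that creates the 4 threads, I used pthread's default options (i.e. without any specific pthread_attr_t). I believe the performance gain of the 4-thread configuration over the 1-Process configuration should be close to 400% (or somewhere between 350% and 400%).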

I profiled the time spent creating the threads as follows:

timer_start(&threadCreationTimer); 
pthread_create( &thread0, NULL, fun0, NULL );
pthread_create( &thread1, NULL, fun1, NULL );
pthread_create( &thread2, NULL, fun2, NULL );
pthread_create( &thread3, NULL, fun3, NULL );
threadCreationTime = timer_stop(&threadCreationTimer);

/* pthread_join() takes the pthread_t by value, not a pointer to it */
pthread_join(thread0, NULL);
pthread_join(thread1, NULL);
pthread_join(thread2, NULL);
pthread_join(thread3, NULL);

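The post does not show how timer_start() and timer_stop() are implemented. A minimal sketch of one possible implementation follows, assuming a wall-clock stopwatch built on clock_gettime(CLOCK_MONOTONIC). The type name stopwatch_t and the seconds-as-double return value are assumptions made for illustration, not the author's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical timer type: the original post does not define it.
 * (The name timer_t is avoided because POSIX already uses it.) */
typedef struct {
    struct timespec start;
} stopwatch_t;

/* Record the current monotonic time as the start of an interval. */
static void timer_start(stopwatch_t *t)
{
    int rc = clock_gettime(CLOCK_MONOTONIC, &t->start);
    assert(rc == 0);
    (void)rc; /* silence "unused" warning when NDEBUG is defined */
}

/* Return the wall-clock seconds elapsed since the matching timer_start(). */
static double timer_stop(const stopwatch_t *t)
{
    struct timespec now;
    int rc = clock_gettime(CLOCK_MONOTONIC, &now);
    assert(rc == 0);
    (void)rc;
    return (double)(now.tv_sec - t->start.tv_sec)
         + (double)(now.tv_nsec - t->start.tv_nsec) / 1e9;
}
```

A wall-clock source is chosen here because clock() reports CPU time consumed by the whole process, which on Linux adds up the time of all running threads and would therefore not reflect elapsed time in the 4-thread case.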

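Since a larger input may also increase the memory requirement of each thread, loading all the data in advance is definitely not a workable option. To keep the per-thread memory requirement from growing, each thread reads the data in small chunks: it reads a chunk, processes it, reads the next chunk, processes it, and so on. The code of the functions run by the threads is therefore structured like this: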

timer_start(&threadTimer[i]);
while(!dataFinished[i])
{
    threadTime[i] += timer_stop(&threadTimer[i]);
    data_source();
    timer_start(&threadTimer[i]);
    process();
}
threadTime[i] += timer_stop(&threadTimer[i]);

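The same idea, timing only the processing step and leaving the chunk read untimed, can be sketched as a self-contained thread function. Everything below is assumed for illustration: the worker_t struct, the 4096-byte CHUNK_SIZE, the in-memory buffer standing in for the real data source, and the byte-sum standing in for the real process().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define CHUNK_SIZE 4096 /* assumed chunk size, not taken from the post */

/* Per-thread state; every field is hypothetical. */
typedef struct {
    const unsigned char *input; /* this thread's share of the data */
    size_t remaining;           /* bytes not yet handed out */
    unsigned long checksum;     /* stand-in for the real result */
    double processing_seconds;  /* time spent in process_chunk() only */
    int data_finished;          /* plays the role of dataFinished[i] */
} worker_t;

/* Current monotonic time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Stand-in for data_source(): hand out the next chunk of the buffer. */
static size_t next_chunk(worker_t *w, const unsigned char **chunk)
{
    size_t n = w->remaining < CHUNK_SIZE ? w->remaining : CHUNK_SIZE;
    *chunk = w->input;
    w->input += n;
    w->remaining -= n;
    return n;
}

/* Stand-in for process(): sum the bytes, and raise the finished
 * flag once no input is left. */
static void process_chunk(worker_t *w, const unsigned char *chunk, size_t n)
{
    for (size_t i = 0; i < n; i++)
        w->checksum += chunk[i];
    if (w->remaining == 0)
        w->data_finished = 1;
}

/* Thread entry point; the signature matches what pthread_create() expects. */
static void *worker_main(void *arg)
{
    worker_t *w = arg;
    assert(w != NULL);
    while (!w->data_finished) {
        const unsigned char *chunk;
        size_t n = next_chunk(w, &chunk); /* reading: not timed */
        double t0 = now_seconds();
        process_chunk(w, chunk, n);       /* processing: timed */
        w->processing_seconds += now_seconds() - t0;
    }
    return NULL;
}
```

Keeping the accumulated time inside each thread's own struct, rather than in a shared global array, is a design choice of this sketch. It would be launched with something like pthread_create(&tid, NULL, worker_main, &workers[i]).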

The variable dataFinished[i] is set to true by process() once all the needed data has been received and processed. process() knows when to do that :-)

In the main function, I calculate the time taken by the 4-thread configuration as follows:

execTime4Thread = max(threadTime[0], threadTime[1], threadTime[2], threadTime[3]) + threadCreationTime.
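Standard C has no max() that accepts four arguments, so the formula above needs a small helper. A minimal sketch, assuming the per-thread times are stored as double values in seconds; the function names max_time and exec_time_threads are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Largest value in an array of n per-thread times (n must be > 0). */
static double max_time(const double *times, size_t n)
{
    assert(times != NULL && n > 0);
    double longest = times[0];
    for (size_t i = 1; i < n; i++) {
        if (times[i] > longest)
            longest = times[i];
    }
    return longest;
}

/* Time attributed to the threaded run, following the post's formula:
 * the slowest thread's accumulated time plus the thread creation time. */
static double exec_time_threads(const double *thread_time, size_t n,
                                double creation_time)
{
    return max_time(thread_time, n) + creation_time;
}
```

For example, with thread times of 1.5, 2.25, 2.0 and 0.5 seconds and a creation time of 0.25 seconds, exec_time_threads() returns 2.5.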

The performance gain is then computed simply as

gain = execTime1process …

linux performance multithreading multicore pthreads

Jun*_*aid

2011 12-14

7
votes

1
answer

568
views