使用c ++ 11 atomics编写(旋转)线程障碍

Question

使用c ++ 11 atomics编写(旋转)线程障碍

Gri*_*zly 8 c++ multithreading gcc c++11

我正在尝试熟悉c ++ 11原子,所以我尝试为线程编写一个屏障类(在有人抱怨不使用现有类之前:这更多是为了学习/自我改进,而不是由于任何实际需要).我的班级基本上看起来如下:

class barrier
{
private:
    std::atomic<int> counter[2];
    std::atomic<int> lock[2];
    std::atomic<int> cur_idx;
    int thread_count;
public:
    //constructors...
    bool wait();
};

Run Code Online (Sandbox Code Playgroud)

所有成员都初始化为零,除了thread_count,它保存适当的计数.我已经将wait函数实现为

int idx  = cur_idx.load();
if(lock[idx].load() == 0)
{
    lock[idx].store(1);
}
int val = counter[idx].fetch_add(1);
if(val >= thread_count - 1)
{
    counter[idx].store(0);
    cur_idx.fetch_xor(1);
    lock[idx].store(0);
    return true;
}
while(lock[idx].load() == 1);
return false;

Run Code Online (Sandbox Code Playgroud)

然而,当尝试使用它与两个线程(thread_count是2)时,第一个线程进入等待循环就好了,但第二个线程没有解锁屏障(似乎它甚至没有达到int val = counter[idx].fetch_add(1);,但我不是太确定了.但是当我使用gcc atomic-intrinsics volatile int而不是std::atomic<int>和写作wait如下:

int idx = cur_idx;
if(lock[idx] == 0)
{
    __sync_val_compare_and_swap(&lock[idx], 0, 1);
}
int val = __sync_fetch_and_add(&counter[idx], 1);
if(val >= thread_count - 1)
{
    __sync_synchronize();
    counter[idx] = 0;
    cur_idx ^= 1;
    __sync_synchronize();
    lock[idx] = 0;
    __sync_synchronize();
    return true;
}
while(lock[idx] == 1);
return false;

Run Code Online (Sandbox Code Playgroud)

它工作得很好.根据我的理解,两个版本之间不应该有任何根本的区别(如果第二个应该不太可能工作的话,那就更多了).那么以下哪种情况适用？

第二次实现我很幸运,我的算法很糟糕
我没有完全理解std::atomic并且第一个变体存在问题(但不是第二个)
它应该工作,但c ++ 11库的实验性实现并不像我希望的那样成熟

为了记录,我使用32位mingw和gcc 4.6.1

调用代码如下所示:

spin_barrier b(2);
std::thread t([&b]()->void
{
    std::this_thread::sleep_for(std::chrono::duration<double>(0.1));
    b.wait();
});
b.wait();
t.join();

Run Code Online (Sandbox Code Playgroud)

由于mingw没有<thread>头部喷射我使用自编写的版本,它基本上包装了适当的pthread函数(在有人问之前:是的它没有障碍,所以它不应该是包装的问题)任何见解会不胜感激.

编辑:算法说明使其更清晰:

thread_count是等待屏障的线程数(因此如果thread_count线程在屏障中,则所有线程都可以离开屏障).
lock 当第一个(或任何)线程进入屏障时,设置为1.
counter 计算屏障内有多少个线程,并为每个线程以原子方式递增一次
if counter>=thread_count 所有线程都在屏障内,因此计数器和锁定重置为零
否则线程等待lock变为零
在屏障的下一次使用中,使用了不同的变量(counter,lock),确保如果线程仍在等待第一次使用屏障时没有问题(例如,当屏障被抬起时它们已被抢占)

edit2:我现在已经在linux下使用gcc 4.5.1进行了测试,其中两个版本看起来工作得很好,这似乎指出了mingw的问题std::atomic,但我仍然没有完全相信,因为查看<atomic>标题重申了大多数函数只是调用适当的gcc-atomic意味着两个版本之间确实不应该有区别

Answer 1

Ker*_* SB 23

我不知道这是否会有所帮助,但Herb Sutter执行并发队列的以下片段使用了基于原子的自旋锁:

std::atomic<bool> consumerLock;

{   // the critical section
    while (consumerLock.exchange(true)) { }  // this is the spinlock

    // do something useful

    consumerLock = false;  // unlock
}

Run Code Online (Sandbox Code Playgroud)

事实上,该标准为这种结构提供了一种特殊的类型,需要进行无锁操作std::atomic_flag.有了这个,关键部分看起来像这样:

std::atomic_flag consumerLock;

{
    // critical section

    while (consumerLock.test_and_set()) { /* spin */ }

    // do stuff

    consumerLock.clear();
}

Run Code Online (Sandbox Code Playgroud)

(如果您愿意,可以使用获取和释放内存排序.)

哦 - 眼睛更加平和.每平方英寸少于6` - >`和`__`是一种祝福.而且,你也可以拥有空白!我想我会去清理OP (3认同)

Answer 2

chi*_*ill 5

它看起来不必要地复杂化.试试这个更简单的版本(嗯,我还没有测试过,我只是在它上面思考:))):

#include <atomic>

class spinning_barrier
{
public:
    spinning_barrier (unsigned int n) : n_ (n), nwait_ (0), step_(0) {}

    bool wait ()
    {
        unsigned int step = step_.load ();

        if (nwait_.fetch_add (1) == n_ - 1)
        {
            /* OK, last thread to come.  */
            nwait_.store (0); // XXX: maybe can use relaxed ordering here ??
            step_.fetch_add (1);
            return true;
        }
        else
        {
            /* Run in circles and scream like a little girl.  */
            while (step_.load () == step)
                ;
            return false;
        }
    }

protected:
    /* Number of synchronized threads. */
    const unsigned int n_;

    /* Number of threads currently spinning.  */
    std::atomic<unsigned int> nwait_;

    /* Number of barrier syncronizations completed so far, 
     * it's OK to wrap.  */
    std::atomic<unsigned int> step_;
};

Run Code Online (Sandbox Code Playgroud)

编辑: @Grizzy,我在你的第一个(C++ 11)版本中找不到任何错误,我也运行它,因为有两个线程的一亿次同步,它就完成了.我在双插槽/四核GNU/Linux机器上运行它,所以我更倾向于怀疑你的选项3. - 库(或者更确切地说,它的端口到win32)还不够成熟.

Answer 3

小智 5

以下是C++ Concurrency in Action:Practical Multithreading中的优雅解决方案.

struct bar_t {
    unsigned const count;
    std::atomic<unsigned> spaces;
    std::atomic<unsigned> generation;
    bar_t(unsigned count_) :
        count(count_), spaces(count_), generation(0)
    {}
    void wait() {
        unsigned const my_generation = generation;
        if (!--spaces) {
            spaces = count;
            ++generation;
        } else {
            while(generation == my_generation);
        }
    }
};

Run Code Online (Sandbox Code Playgroud)

是好.对于想要更多信息的人来说,这是隐藏在第267页左右,因为屏障没有出现在我的Manning 2012副本的索引中. (2认同)

归档时间：	13 年，11 月前
查看次数：	16010 次
最近记录：	9 年，5 月前