Posts by Ale*_*iev

Why does the compiler miss vectorization here?

Consider the following valarray-like class:

#include <stdlib.h>

struct va
{
    void add1(const va& other);
    void add2(const va& other);

    size_t* data;
    size_t  size;
};

void va::add1(const va& other) {
    for (size_t i = 0; i < size; ++i) {
        data[i] += other.data[i];
    }
}

void va::add2(const va& other){
    for (size_t i = 0, s = size; i < s; ++i) {
        data[i] += other.data[i];
    }
}

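The add2 function is vectorized by the different compilers (MSVC, Clang, GCC, ICC), whereas add1 is not. See https://godbolt.org/z/c61qvrrbv

This is explained by potential aliasing: the compiler cannot prove that one of the elements pointed to by data is not size itself.

However, the elements pointed to by data and other.data may also overlap. For MSVC, these elements and the pointers themselves may alias as well, because it does not take advantage of the strict aliasing rule. This applies to add1 …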
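The aliasing argument above can be illustrated with a small sketch. It assumes a hypothetical third member function, add3 (not in the original post), that copies every loop-invariant value into a local before the loop, taking the add2 idea one step further. Stores through the local pointer cannot change the locals themselves, so the trip count and both base pointers are fixed. Whether a given compiler then vectorizes the loop is not claimed here and is best checked on Compiler Explorer.

```cpp
#include <cstddef>

struct va
{
    void add3(const va& other);  // hypothetical variant, for illustration only

    std::size_t* data;
    std::size_t  size;
};

// All loop-invariant values are read once into locals.
// A store to dst[i] can no longer be assumed to modify n, dst, or src.
// dst and src may still overlap, so a runtime overlap check may remain.
void va::add3(const va& other) {
    const std::size_t n = size;
    std::size_t* dst = data;
    const std::size_t* src = other.data;
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] += src[i];
    }
}
```

The result is the same as for add1 and add2 when no element of data aliases size or the pointers; only the information available to the optimizer differs.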

c++ vectorization strict-aliasing compiler-optimization auto-vectorization

Ale*_*iev

2023 08-18

14
votes

1
answer

456
views

Why is std::mutex a standard-layout class?

[thread.mutex.class]/3:

[...] It is a standard-layout class ([class.prop]).

What is the reason for this requirement?

c++ language-lawyer stdmutex

Ale*_*iev

2021 12-04

13
votes

1
answer

414
views

Why is std::mutex so much worse than std::shared_mutex in Visual C++?

Run the following in release mode in Visual Studio 2022:

#include <chrono>
#include <mutex>
#include <shared_mutex>
#include <iostream>

std::mutex mx;
std::shared_mutex smx;

constexpr int N = 100'000'000;

int main()
{
    auto t1 = std::chrono::steady_clock::now();
    for (int i = 0; i != N; i++)
    {
        std::unique_lock<std::mutex> l{ mx };
    }
    auto t2 = std::chrono::steady_clock::now();
    for (int i = 0; i != N; i++)
    {
        std::unique_lock<std::shared_mutex> l{ smx };
    }
    auto t3 = std::chrono::steady_clock::now();

    auto d1 = std::chrono::duration_cast<std::chrono::duration<double>>(t2 - t1);
    auto d2 = std::chrono::duration_cast<std::chrono::duration<double>>(t3 - t2);

    std::cout …

c++ visual-c++ stdmutex

Ale*_*iev

2021 11-16

13
votes

1
answer

2354
views

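Difference between std::atomic and std::condition_variable wait, notify_* methods

I was going through the "Atomic operations library" and came across the new C++20 feature of atomic "wait" and "notify_*" methods. I am curious how they differ from std::condition_variable's "wait" and "notify_*" methods.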

c++ condition-variable stdatomic c++20

PVR*_*VRT

2020 07-13

12
votes

2
answers

952
views

How does libcxx std::counting_semaphore implement "strongly happens before" for release / acquire?

libc++ std::counting_semaphore uses an atomic increment with memory_order_release in the release method:

    void release(ptrdiff_t __update = 1)
    {
        if(0 < __a.fetch_add(__update, memory_order_release))
            ;
        else if(__update > 1)
            __a.notify_all();
        else
            __a.notify_one();
    }

and a compare-exchange with memory_order_acquire as the success memory order in the acquire method:

    void acquire()
    {
        auto const __test_fn = [=]() -> bool {
            auto __old = __a.load(memory_order_relaxed);
            return (__old != 0) && __a.compare_exchange_strong(__old, __old - 1, memory_order_acquire, memory_order_relaxed);
        };
        __cxx_atomic_wait(&__a.__a_, __test_fn);
    }

This is the obvious choice to make acquire synchronize with release.

However, the C++20 draft says:

void release(ptrdiff_t update = 1);
...

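Synchronization: Strongly happens before invocations of try_acquire that observe the result of the effects.

Strongly happens before is somewhat more than synchronizes with …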

c++ memory-barriers language-lawyer stdatomic c++20

Ale*_*iev

2021 02-17

8
votes

0
answers

441
views

Why is C++23 stacktrace_entry different from source_location?

class stacktrace_entry {
public:
  string description() const;
  string source_file() const;
  uint_least32_t source_line() const;
  /* ... */
};

struct source_location {
  // source location field access
  constexpr uint_least32_t line() const noexcept;
  constexpr uint_least32_t column() const noexcept;
  constexpr const char* file_name() const noexcept;
  constexpr const char* function_name() const noexcept;
  /* ... */
};

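They serve essentially the same purpose, so why do they differ? In particular, why is there no column in stacktrace_entry, and why do they not even share the same class?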

c++ stack-trace c++23

Ale*_*iev

lucky-day

8
votes

1
answer

534
views

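How to use Intel TSX with the C++ memory model?

I think C++ does not yet cover any kind of transactional memory, but TSX can still somehow be fitted, using the "as-if rule", into something that is governed by the C++ memory model.

So, what happens on a successful HLE operation, or a successful RTM transaction?

Saying "there is a data race, but it is OK" is not very helpful, as it does not clarify what "OK" means.

With HLE it can probably be seen as "the previous operation happens before the subsequent operation, as if the section were still guarded by the lock that was elided".

What about RTM? There is not even an elided lock, only (potentially non-atomic) memory operations, which could be loads, stores, both, or a no-op. What is synchronized with what? What happens before what?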

c++ memory-model language-lawyer intel-tsx

Ale*_*iev

2020 04-21

7
votes

1
answer

599
views

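Is nodiscard necessary on operators?

Is the [[nodiscard]] attribute necessary on operators? Or is it safe to assume the compiler will emit a warning, as it does for most suspiciously discarded things?

For example, should the attribute be applied to an overloaded operator+? What about special operators such as function-cast operators or new operators? When does it become pedantic?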
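For concreteness, the case being asked about can be sketched as follows. Vec2 is a hypothetical type introduced only for illustration. The attribute is written explicitly on the overloaded operator+, so the sketch does not rely on any warning a compiler might or might not emit on its own for a discarded expression such as a + b;.

```cpp
// Hypothetical value type, used only to illustrate the question.
struct Vec2 {
    int x;
    int y;
};

// [[nodiscard]] (C++17) asks the compiler to warn when the result of the
// call is discarded, e.g. in the expression statement `a + b;`.
[[nodiscard]] constexpr Vec2 operator+(Vec2 a, Vec2 b) {
    return Vec2{a.x + b.x, a.y + b.y};
}
```

Using the result, as in Vec2 c = a + b;, is unaffected by the attribute.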

c++ c++17 nodiscard

j5w*_*j5w

2021 09-19

7
votes

1
answer

3691
views

Does the preprocessor do define replacement in `operator""_name`

Consider the following example, provided by Aykhan Hagverdili:

#include <string>

using std::operator""s;

#define s foobar

auto s = "hello world"s;

Some compilers replace s and the compilation fails. Some compilers do not replace s.

See the results here: https://godbolt.org/z/jx4nhYczd (GCC fails, Clang compiles).

Which one is right?

c++ language-lawyer user-defined-literals c-preprocessor

Ale*_*iev

2023 08-03

7
votes

1
answer

389
views

[[nodiscard]] to a function pointer

I want to use a third-party library that provides its API through a struct full of function pointers. For example:

struct S {
    using p_func1 = int(*)(int, int);
    p_func1 func1;
    using p_func2 = int(*)(char*);
    p_func2 func2;
};

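The third-party library initializes this struct. The return values of these functions (func1, func2) need to be checked, and I was hoping to express that somehow with the [[nodiscard]] attribute, to ensure that the return values are checked.

Is there any way to do this while maintaining the ABI of the struct?

Edit: So far, the best I could come up with is having another struct, like so: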

struct S_wrap {
    S orig;
    [[nodiscard]] int func1(int a, int b){ return orig.func1(a, b); }
    [[nodiscard]] int func2(char* a){ return orig.func2(a); }
};

I was hoping for something better.
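One possible variation on the wrapper idea, offered only as a sketch and not as an answer from the original thread: a small generic class that wraps a single function pointer and marks its call operator [[nodiscard]]. The names checked_fn, add_impl, and len_impl are hypothetical. The struct S below repeats the layout from the question as a stand-in for the third-party type, which stays unmodified, so its ABI is not affected.

```cpp
#include <cstring>
#include <utility>

// Same layout as the struct in the question; stands in for the third-party type.
struct S {
    using p_func1 = int(*)(int, int);
    p_func1 func1;
    using p_func2 = int(*)(char*);
    p_func2 func2;
};

// Hypothetical helper: holds a plain function pointer and forwards calls to it.
// Discarding the result of operator() should draw a [[nodiscard]] warning.
template <class R, class... Args>
class checked_fn {
public:
    explicit checked_fn(R (*fn)(Args...)) : fn_(fn) {}

    [[nodiscard]] R operator()(Args... args) const {
        return fn_(std::forward<Args>(args)...);
    }

private:
    R (*fn_)(Args...);
};

// Hypothetical stand-ins for the functions the library would install.
inline int add_impl(int a, int b) { return a + b; }
inline int len_impl(char* s) { return static_cast<int>(std::strlen(s)); }
```

Under C++17 class template argument deduction, checked_fn f1{ s.func1 }; deduces the signature from the pointer, so one template covers every member of S instead of one hand-written forwarding function per member.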

c++ c++17 nodiscard

Cur*_*519

2021 09-19

6
votes

1
answer

547
views