Trailing padding of nested structs in C/C++ - is it necessary?

Question

Dom*_*324 22 c c++ padding compiler-optimization structlayout

This is more of a theoretical question. I am familiar with how padding and trailing padding work.

#include <stdint.h>

struct myStruct{
    uint32_t x;
    char*    p;
    char     c;
};

// myStruct layout will compile to
// x:       4 Bytes
// padding: 4 Bytes
// *p:      8 Bytes
// c:       1 Byte
// padding: 7 Bytes
// Total:   24 Bytes

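The layout described in the comments above can be checked with offsetof and sizeof. The following is a minimal sketch, assuming a typical 64-bit ABI with 8-byte pointers; the helper my_struct_stride is a name made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct myStruct {
    uint32_t x;
    char    *p;
    char     c;
};

/* Assumption: a typical 64-bit ABI where pointers are 8 bytes and 8-byte aligned. */
_Static_assert(offsetof(struct myStruct, x) == 0,  "x starts the struct");
_Static_assert(offsetof(struct myStruct, p) == 8,  "4 bytes of padding after x");
_Static_assert(offsetof(struct myStruct, c) == 16, "c follows p directly");
_Static_assert(sizeof(struct myStruct) == 24,      "7 bytes of trailing padding");

/* Hypothetical helper: distance in bytes between consecutive array elements. */
size_t my_struct_stride(void)
{
    struct myStruct arr[2];
    return (size_t)((char *)&arr[1] - (char *)&arr[0]);
}
```

The stride of an array always equals sizeof of its element type, which is why the trailing padding has to be counted in sizeof(struct myStruct).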

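There needs to be padding after x so that p is aligned, and there needs to be trailing padding after c so that the size of the whole struct is divisible by 8 (to get the correct stride length). But consider this example: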

struct A{
    uint64_t x;
    uint8_t  y;
};

struct B{
    struct A myStruct;
    uint32_t c;
};

// Based on all information I read on internet, and based on my tinkering
// with both GCC and Clang, the layout of struct B will look like:
// myStruct.x:       8 Bytes
// myStruct.y:       1 Byte
// myStruct.padding: 7 Bytes
// c:                4 Bytes
// padding:          4 Bytes
// total size:       24 Bytes
// total padding:    11 Bytes
// padding overhead: 45%

// my question is, why struct A does not get "inlined" into struct B,
// and therefore why the final layout of struct B does not look like this:
// myStruct.x:       8 Bytes
// myStruct.y:       1 Byte
// padding           3 Bytes
// c:                4 Bytes
// total size:       16 Bytes
// total padding:    3 Bytes
// padding overhead: 19%

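The first layout (the one GCC and Clang actually produce) can be confirmed at compile time. This is a sketch assuming a typical 64-bit ABI where uint64_t is 8-byte aligned; padding_in_b is a hypothetical helper added only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct A {
    uint64_t x;
    uint8_t  y;
};

struct B {
    struct A myStruct;
    uint32_t c;
};

/* Assumption: a typical 64-bit ABI where uint64_t is 8-byte aligned. */
_Static_assert(sizeof(struct A) == 16, "A keeps its 7 bytes of trailing padding");
_Static_assert(offsetof(struct B, c) == sizeof(struct A),
               "c starts only after all of A, padding included");
_Static_assert(sizeof(struct B) == 24, "4 more bytes of trailing padding in B");

/* Hypothetical helper: padding bytes in struct B, computed as the total size
   minus the bytes that actually hold members. */
size_t padding_in_b(void)
{
    return sizeof(struct B) - (sizeof(uint64_t) + sizeof(uint8_t) + sizeof(uint32_t));
}
```

The second static assertion is the key observation: the nested member occupies all sizeof(struct A) bytes inside struct B, so c cannot begin before offset 16.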

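Both layouts satisfy the alignment of all variables. Both layouts have the same order of variables. In both layouts struct B has the correct stride length (divisible by 8 bytes). The only difference (besides the 33% smaller size) is that struct A does not have the correct stride length in layout 2, but that should not matter, since there is clearly no array of struct As.

I checked this layout in GCC with -O3 and -g; struct B has 24 bytes.

My question is: is there some reason why this optimization is not applied? Is there some layout requirement in C/C++ that forbids it? Or is there a compilation flag I am missing? Or is this an ABI thing?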

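EDIT: Answered.

See the answer from @dbush for why the compiler cannot emit this layout on its own.
The following code example uses the GCC attributes packed and aligned (as suggested by @jaskij) to manually enforce the more compact layout. Struct B_packed has only 16 bytes instead of 24 bytes (note that this code might cause problems or run slowly when there is an array of B_packed structs; be wary and do not blindly copy this code):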

struct __attribute__ ((__packed__)) A_packed{
    uint64_t x;
    uint8_t  y;
};

struct __attribute__ ((__packed__)) B_packed{
    struct A_packed myStruct;
    uint32_t c __attribute__ ((aligned(4)));
};

// Layout of B_packed will be
// myStruct.x:       8 Bytes
// myStruct.y:       1 Byte
// padding for c:    3 Bytes
// c:                4 Bytes
// total size:       16 Bytes
// total padding:    3 Bytes
// padding overhead: 19%

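The packed layout, and the reason arrays of B_packed deserve caution, can be checked as follows. This is a sketch that assumes GCC or Clang on a typical 64-bit target; x_may_be_misaligned is a hypothetical helper, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct __attribute__((__packed__)) A_packed {
    uint64_t x;
    uint8_t  y;
};

struct __attribute__((__packed__)) B_packed {
    struct A_packed myStruct;
    uint32_t c __attribute__((aligned(4)));
};

/* Assumption: GCC or Clang attribute semantics. */
_Static_assert(sizeof(struct A_packed) == 9,        "no trailing padding in A_packed");
_Static_assert(offsetof(struct B_packed, c) == 12,  "3 bytes of padding before c");
_Static_assert(sizeof(struct B_packed) == 16,       "16 bytes in total");
_Static_assert(_Alignof(struct B_packed) == 4,      "alignment comes from c only");

/* Hypothetical helper: returns 1 when the struct's own alignment is weaker
   than what uint64_t normally requires, so x may sit at a misaligned address. */
int x_may_be_misaligned(void)
{
    return _Alignof(struct B_packed) < _Alignof(uint64_t);
}
```

Because the struct is only 4-byte aligned, nothing guarantees that myStruct.x lands on an 8-byte boundary, which is where the possible slowdown (or fault, on stricter architectures) comes from.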

Answer 1

dbu*_*ush 26

Is there some reason why this optimization is not applied?

If this were allowed, the value of sizeof(struct B) would be ambiguous.

Suppose you did this:

struct B b;
struct A a = { 1, 2 };
b.c = 0x12345678;
memcpy(&b.myStruct, &a, sizeof(struct A));


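You would be overwriting the value of b.c: sizeof(struct A) is 16, so memcpy writes 16 bytes starting at b.myStruct, and in the proposed compact layout c would sit at offset 12 inside that range.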
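
The overwrite can be simulated without relying on any real compiler producing the compact layout. The sketch below models the hypothetical 16-byte struct B as raw bytes, with c at offset 12; the names c_after_copy, HYPOTHETICAL_C_OFFSET and HYPOTHETICAL_B_SIZE are invented for this illustration, and the 0xAA fill stands in for whatever the padding bytes of a struct A happen to contain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct A {
    uint64_t x;
    uint8_t  y;
};

/* Assumption: typical 64-bit ABI, so the copy below moves 16 bytes. */
_Static_assert(sizeof(struct A) == 16, "struct A includes 7 trailing padding bytes");

/* Hypothetical compact layout of struct B: c at offset 12, 16 bytes in total. */
enum { HYPOTHETICAL_C_OFFSET = 12, HYPOTHETICAL_B_SIZE = 16 };

/* Returns the value of "c" after a struct A image is copied over myStruct. */
uint32_t c_after_copy(void)
{
    /* Raw bytes standing in for a struct B with the compact layout. */
    unsigned char b[HYPOTHETICAL_B_SIZE] = {0};
    uint32_t c = 0x12345678u;
    memcpy(b + HYPOTHETICAL_C_OFFSET, &c, sizeof c);   /* b.c = 0x12345678 */

    /* Byte image of a struct A { 1, 2 }. Padding bytes may hold anything;
       0xAA is chosen here so the result is deterministic. */
    unsigned char a[sizeof(struct A)];
    uint64_t x = 1;
    uint8_t  y = 2;
    memset(a, 0xAA, sizeof a);
    memcpy(a + offsetof(struct A, x), &x, sizeof x);
    a[offsetof(struct A, y)] = y;

    /* Equivalent of memcpy(&b.myStruct, &a, sizeof(struct A)). */
    memcpy(b, a, sizeof(struct A));

    memcpy(&c, b + HYPOTHETICAL_C_OFFSET, sizeof c);
    return c;
}
```

With these assumptions the function returns 0xAAAAAAAA rather than 0x12345678: the padding bytes of the source replaced c.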

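The compiler cannot prove that this will not happen. See the [halting problem](https://en.wikipedia.org/wiki/Halting_problem). (7 upvotes)
Actually, I would suggest reading about [Rice's theorem](https://en.wikipedia.org/wiki/Rice%27s_theorem) directly rather than the halting problem. The compiler _may_ sometimes decide to lay out a struct differently, or not put it in memory at all, e.g. [if there is no manual memory management, its lifetime is very short, and there is no "struct" at all](https://godbolt.org/z/Gxs5crjbG). However, because of the "as-if" rule, this cannot be detected in any way from inside the program. (6 upvotes)
@yeputons: The optimization you are describing is a case of Scalar Replacement of Aggregates, SROA or SRA for short. It does not really change the struct layout; it optimizes the struct away entirely and works only with its members. GCC can even do this across functions, as described in the 2010 paper [The new intraprocedural Scalar Replacement of Aggregates](https://gcc.gnu.org/wiki/summit2010?action=AttachFile&do=get&target=jambor.pdf), e.g. with the `-fipa-sra` optimization option. (This is just from a quick search for "scalar replacement of aggregates".) Java and JavaScript JITs can do this too. (6 upvotes)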
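
As an illustration of the kind of code the comments above refer to, here is a small sketch; sum_via_struct is a hypothetical function, and whether a particular compiler actually removes the struct depends on the optimization level and cannot be observed from within the program.

```c
#include <assert.h>
#include <stdint.h>

struct A {
    uint64_t x;
    uint8_t  y;
};

/* The struct never escapes this function and its address is never taken,
   so an optimizing compiler is free to keep x and y in registers and never
   materialize a struct A (or its padding) in memory. */
uint64_t sum_via_struct(uint64_t x, uint8_t y)
{
    struct A a = { x, y };
    return a.x + a.y;
}
```

The result is the same whether or not the struct exists in memory, which is exactly what the "as-if" rule permits.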

Archived: 2 years, 10 months ago
Views: 1648
Last activity: 2 years, 10 months ago