While trying to answer "Embedded broadcasts with intrinsics and assembly", I was trying to do something like this:
__m512 mul_broad(__m512 a, float b) {
int scratch = 0;
asm(
"vbroadcastss %k[scalar], %q[scalar]\n\t" // want vbr.. %xmm0, %zmm0
"vmulps %q[scalar], %[vec], %[vec]\n\t"
// how it's done for integer registers
"movw symbol(%q[inttmp]), %w[inttmp]\n\t" // movw symbol(%rax), %ax
"movsbl %h[inttmp], %k[inttmp]\n\t" // movsx %ah, %eax
: [vec] "+x" (a), [scalar] "+x" (b), [inttmp] "=r" (scratch)
:
:
);
return a;
}
The GNU C x86 Operand Modifiers documentation only specifies modifiers up to q (DI (DoubleInt) size, 64 bits). Using q on a vector register always brings it down to xmm (from ymm or zmm). …
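For the vector-register case described above, here is a minimal sketch rather than a definitive answer. It assumes a GCC or Clang version that accepts the operand modifiers x, t, and g (printing the xmm, ymm, and zmm name of a vector register, respectively); newer GCC manuals list them, but check your compiler. The names mul_broadcast16 and mul_broadcast_demo are hypothetical, and the arrays and the runtime CPU check are only there so the sketch can be built without -mavx512f and run safely on machines without AVX-512.

```c
#if defined(__x86_64__) && defined(__GNUC__)
#include <immintrin.h>

/* Hypothetical helper: multiply 16 floats by a broadcast scalar.
 * The target attribute lets this one function use AVX-512F even if the
 * rest of the file is compiled without -mavx512f. */
__attribute__((target("avx512f")))
void mul_broadcast16(float *out, const float *in, float b)
{
    __m512 v = _mm512_loadu_ps(in);
    __asm__ (
        /* %x prints the xmm name, %g the zmm name of the same register
         * (assumption: the compiler supports these modifiers). */
        "vbroadcastss %x[scalar], %g[scalar]\n\t"
        "vmulps %g[scalar], %[vec], %[vec]"
        : [vec] "+x" (v), [scalar] "+x" (b));
    _mm512_storeu_ps(out, v);
}
#endif

/* Hypothetical driver: returns 1 if the result is correct, 0 if it is
 * wrong, and -1 if the sketch cannot run on this machine. */
int mul_broadcast_demo(void)
{
#if defined(__x86_64__) && defined(__GNUC__)
    if (!__builtin_cpu_supports("avx512f"))
        return -1;                     /* no AVX-512F: skip */
    float in[16], out[16];
    for (int i = 0; i < 16; i++)
        in[i] = (float)i;
    mul_broadcast16(out, in, 2.0f);
    for (int i = 0; i < 16; i++)
        if (out[i] != 2.0f * (float)i)
            return 0;
    return 1;
#else
    return -1;                         /* not x86-64 GNU C: skip */
#endif
}
```

The scalar stays a plain float operand; only the printed register name changes between the two uses, which is the part the q modifier could not express.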
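While using GCC's inline assembler feature, I tried to create a function that exits the process immediately, similar to _Exit in the C standard library.
Here is the relevant source code: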
void immediate_exit(int code)
{
#if defined(__x86_64__)
asm (
//Load exit code into %rdi
"mov %0, %%rdi\n\t"
//Load system call number (exit_group)
"mov $231, %%rax\n\t"
//Linux syscall, 64-bit version.
"syscall\n\t"
//No output operands, single unrestricted input register, no clobbered registers because we're about to exit.
:: "" (code) :
);
//Skip other architectures here, I'll fix these later.
#else
# error "Architecture not supported."
#endif
}
This works fine in debug builds (with -O0), but as soon as I turn on optimization at any level, I get the following error:
immediate_exit.c: Assembler messages:
immediate_exit.c:4: Error: unsupported for `mov' …
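One plausible reading of this error (an assumption, since the excerpt is cut off before any diagnosis): with an empty constraint string the optimizer is free to substitute a 32-bit register for %0, and a mov from a 32-bit register into %rdi does not assemble, whereas at -O0 the operand happens to live in memory. The sketch below sidesteps the issue by letting register constraints ("a" for rax, "D" for rdi) place the values, so the template needs no mov at all. The helper child_exit_status is hypothetical and exists only to observe the exit code from a parent process; the _exit fallback for other platforms is likewise an assumption for portability, not part of the original.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Terminate the process immediately with the given exit code. */
void immediate_exit(int code)
{
#if defined(__x86_64__) && defined(__linux__)
    __asm__ volatile (
        "syscall"
        :                               /* no outputs */
        : "a" (231L),                   /* rax = exit_group */
          "D" ((long)code)              /* rdi = exit code, widened to 64 bits */
        : "rcx", "r11", "memory");      /* syscall overwrites rcx and r11 */
    __builtin_unreachable();
#else
    _exit(code);                        /* assumed fallback elsewhere */
#endif
}

/* Hypothetical helper: run immediate_exit in a child process and return
 * the exit status the parent observes, or -1 on failure. */
int child_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        immediate_exit(code);           /* child never returns */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Listing rcx and r11 as clobbers is harmless here because the call never returns, but it keeps the statement correct if the same pattern is reused for system calls that do return.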