이영보*_*이영보 7 c iphone assembly arm inline-assembly
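I am learning inline assembly. I want to write a simple routine for the iPhone in Xcode 4 with the LLVM 3.0 compiler. I have managed to write basic inline assembly code.
For example: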
int sub(int a, int b)
{
    int c;
    asm ("sub %0, %1, %2" : "=r" (c) : "r" (a), "r" (b));
    return c;
}
I found it on stackoverflow.com and it works fine. However, I don't know how to write a loop in inline assembly.
I need assembly code equivalent to this:
void brighten(unsigned char* src, unsigned char* dst, int numPixels, int intensity)
{
    for(int i=0; i<numPixels; i++)
    {
        dst[i] = src[i] + intensity;
    }
}
Take a look at the section on loops here: http://en.wikipedia.org/wiki/ARM_architecture
Basically you'll want something like this:
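void brighten(unsigned char* src, unsigned char* dst, int numPixels, int intensity) {
    asm volatile (
        "\t mov r3, #0\n"             // r3 = loop index i
        "1:\n"                        // numeric local labels stay unique if the asm is inlined twice
        "\t cmp r3, %2\n"             // i >= numPixels?
        "\t bge 2f\n"
        "\t ldrb r4, [%0, r3]\n"      // r4 = src[i]
        "\t add r4, r4, %3\n"         // r4 += intensity
        "\t strb r4, [%1, r3]\n"      // dst[i] = low 8 bits of r4
        "\t add r3, r3, #1\n"
        "\t b 1b\n"
        "2:\n"
        :                             // no outputs: nothing is written back to C variables
        : "r"(src), "r"(dst), "r"(numPixels), "r"(intensity)
        : "cc", "memory", "r3", "r4"); // "memory": the asm reads src[] and writes dst[]
}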
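To check the assembly loop against known-good output, a plain C reference can help. The sketch below is not part of the original answer: `brighten_ref` and `buffers_equal` are hypothetical names, and the code assumes the intended behaviour is the wrap-around of the original C loop (`strb` stores only the low 8 bits, so 250 + 10 becomes 4).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference routine: same wrap-around result as the
   ldrb/add/strb loop, since only the low 8 bits of the sum are kept. */
static void brighten_ref(const unsigned char *src, unsigned char *dst,
                         int numPixels, int intensity)
{
    assert(src != NULL && dst != NULL);
    for (int i = 0; i < numPixels; i++) {
        dst[i] = (unsigned char)(src[i] + intensity);
    }
}

/* Hypothetical helper: returns 1 if the first n bytes of a and b match. */
static int buffers_equal(const unsigned char *a, const unsigned char *b, int n)
{
    for (int i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            return 0;
        }
    }
    return 1;
}
```

On the device you could run `brighten` and `brighten_ref` on the same input and compare the two output buffers with `buffers_equal`.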
Update:
Here is a NEON version:
void brighten_neon(unsigned char* src, unsigned char* dst, int numPixels, int intensity) {
    asm volatile (
        "\t mov r4, #0\n"             // r4 = number of pixels processed so far
        "\t vdup.8 d1, %3\n"          // d1 = intensity copied into all 8 byte lanes
        "1:\n"
        "\t cmp r4, %2\n"
        "\t bge 2f\n"
        "\t vld1.8 {d0}, [%0]!\n"     // load 8 pixels, advance src by 8
        "\t vqadd.u8 d0, d0, d1\n"    // unsigned saturating add (pixels are unsigned char)
        "\t vst1.8 {d0}, [%1]!\n"     // store 8 pixels, advance dst by 8
        "\t add r4, r4, #8\n"
        "\t b 1b\n"
        "2:\n"
        : "+r"(src), "+r"(dst)        // both pointers are modified by the post-increment
        : "r"(numPixels), "r"(intensity)
        : "cc", "memory", "r4", "d0", "d1");
}
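So this NEON version processes 8 pixels at a time. However, it does not check that numPixels is divisible by 8, so you will definitely want to do that yourself, otherwise the loop will read and write past the end of the buffers! Anyway, this is just a start to show you what can be done. Note that it uses the same number of instructions but operates on 8 pixels of data at once. Oh, and it saturates as well, which I think is what you would want.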
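One way to handle a numPixels that is not a multiple of 8 is to split the work into a block part and a scalar tail. The following is only a portable C sketch of that idea, not the answer's code: `brighten_sat` and `sat_add_u8` are hypothetical names, and it assumes that clamping at 255 with an intensity in the range 0..255 is the saturation you want for unsigned pixel data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: add two bytes and clamp the result at 255. */
static unsigned char sat_add_u8(unsigned char p, unsigned char k)
{
    unsigned int sum = (unsigned int)p + (unsigned int)k;
    return (unsigned char)(sum > 255u ? 255u : sum);
}

/* Hypothetical wrapper: whole blocks of 8 pixels first, then the tail. */
static void brighten_sat(const unsigned char *src, unsigned char *dst,
                         int numPixels, int intensity)
{
    assert(src != NULL && dst != NULL);
    assert(numPixels >= 0);
    assert(intensity >= 0 && intensity <= 255); /* assumption: brighten only */

    int blockPixels = numPixels & ~7; /* largest multiple of 8 <= numPixels */
    int i = 0;

    /* Block part: on the device, this range is what could be handed to
       the 8-at-a-time routine instead of this inner loop. */
    for (; i < blockPixels; i += 8) {
        for (int j = 0; j < 8; j++) {
            dst[i + j] = sat_add_u8(src[i + j], (unsigned char)intensity);
        }
    }

    /* Tail: the remaining 0..7 pixels, one at a time. */
    for (; i < numPixels; i++) {
        dst[i] = sat_add_u8(src[i], (unsigned char)intensity);
    }
}
```

With this split, the 8-pixel routine only ever sees a count that is a multiple of 8, so it never touches memory beyond the buffers, and the few leftover pixels cost at most seven scalar iterations.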