Unable to understand pointers in C and type casting

Question

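I can't understand why the 3rd and 4th printf give 54 and -61. As I see it, the program should print 0, because a char pointer is expected to show only (sizeof(char) * 8) bits of the value, and 54 in binary is 00000000 00110110.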

#include<stdio.h>
void main()
{
      int i=54;
      float a=3.14;
      char *ii,*aa;

      ii=(char *)&i;
      aa=(char *)&a;

      printf("%u\n",ii);
      printf("%u\n",aa);
      printf("%d\n",*ii);
      printf("%d\n",*aa);

}


Edit: the fourth printf (with %f there; I typed %d by mistake) gives 0.00000. Why?

Answer 1

Tha*_*tos 5

Why does the third one output 54?

Your third output shows 54 because, on your machine,

int i=54;


is stored in memory like this (bytes in hexadecimal, lowest address first):

36 00 00 00


and your pointer points here:

36 00 00 00
^^


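So when you print 0x36 as a char (a one-byte integer type), you see 54.

This storage format is called "little endian": the least significant byte is stored at the lowest address. It is used by x86 and amd64 processors, which are very common.

Note that the language does not guarantee that integers are stored this way; you may get different results with a different machine or compiler. Don't rely on it.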

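What about the float?

The float works the same way, but is more complicated to show. Again, it is machine dependent. On amd64, if you encode 3.14 as an IEEE single (which is platform dependent) and then store the four bytes in reverse order (at least, I believe amd64 stores them "little endian", though I'm not sure why, since it's a float.¹), the byte value in the first slot, when viewed as a signed 8-bit two's complement integer (which is also platform dependent), should come out to the value you see.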

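Finally, you said:

I did not know about little endian. But what about the float? If I use %f instead of %d in the fourth printf (I typed %d here by mistake), it gives 0.000000000

I will assume you mean: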

printf("%f\n",*aa);


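Here aa is still a char *. This is ill-formed: for %f, you need to pass a double (a float argument is promoted to double automatically). But let's carry on and try to explain this (undefined!) behavior.

Because it is a char *, when you dereference it, on your machine, it will probably read a single byte value. 3.14, as a little-endian float, is: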

c3 f5 48 40
^^


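0xc3, as a two's complement signed one-byte integer, is -61, which explains your question. So for your program, *aa is -61. When you pass it to printf, it is promoted to an int, because printf is a "varargs" (variable number of arguments) function. You can see this when compiling with some compilers: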

prog1.c:14:7: warning: format '%f' expects argument of type 'double', but argument 2 has type 'int' [-Wformat]

So the int is passed to printf in whatever way your platform passes an int. Let's investigate. To be explicit, I'm compiling the following:

#include<stdio.h>
int main()
{
    int i=54;
    float a=3.14;
    char *ii,*aa;

    ii=(char *)&i;
    aa=(char *)&a;

    printf("%u\n",ii);
    printf("%u\n",aa);
    printf("%d\n",*ii);
    printf("%f\n",*aa);

    return 0;
}


I run:

% gcc -g -o prog1 prog1.c
prog1.c: In function ‘main’:
prog1.c:11:2: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 2 has type ‘char *’ [-Wformat]
prog1.c:12:2: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 2 has type ‘char *’ [-Wformat]
prog1.c:14:2: warning: format ‘%f’ expects argument of type ‘double’, but argument 2 has type ‘int’ [-Wformat]


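(In case it isn't clear: gcc is emitting very good warnings here. It is pointing out undefined behavior, that is, bugs, in your program. You should always fix these. We'll ignore them for the sake of the investigation, but note that the compiler can really do whatever it wants at this point, so nothing below is guaranteed.)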

Then, let's start this in a debugger and stop at the last printf. For me, that is line 14. So:

% gdb prog1
GNU gdb (Gentoo 7.6.2 p1) 7.6.2
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.gentoo.org/>...
Reading symbols from /home/me/code/random/prog1...done.
(gdb) break prog1.c:14
Breakpoint 1 at 0x4005db: file prog1.c, line 14.


Let's run it to that breakpoint.

(gdb) r
Starting program: /home/me/code/random/prog1 
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
4294959628
4294959624
54

Breakpoint 1, main () at prog1.c:14
14      printf("%f\n",*aa);


Now we're stopped "at the printf", but what does that mean? Let's look at some assembler!

(gdb) disassemble
Dump of assembler code for function main:
   0x000000000040056c <+0>: push   %rbp
   0x000000000040056d <+1>: mov    %rsp,%rbp
   0x0000000000400570 <+4>: sub    $0x20,%rsp
   0x0000000000400574 <+8>: movl   $0x36,-0x14(%rbp)
   0x000000000040057b <+15>:    mov    0x12f(%rip),%eax        # 0x4006b0
   0x0000000000400581 <+21>:    mov    %eax,-0x18(%rbp)
   0x0000000000400584 <+24>:    lea    -0x14(%rbp),%rax
   0x0000000000400588 <+28>:    mov    %rax,-0x8(%rbp)
   0x000000000040058c <+32>:    lea    -0x18(%rbp),%rax
   0x0000000000400590 <+36>:    mov    %rax,-0x10(%rbp)
   0x0000000000400594 <+40>:    mov    -0x8(%rbp),%rax
   0x0000000000400598 <+44>:    mov    %rax,%rsi
   0x000000000040059b <+47>:    mov    $0x4006a4,%edi
   0x00000000004005a0 <+52>:    mov    $0x0,%eax
   0x00000000004005a5 <+57>:    callq  0x400450 <printf@plt>
   0x00000000004005aa <+62>:    mov    -0x10(%rbp),%rax
   0x00000000004005ae <+66>:    mov    %rax,%rsi
   0x00000000004005b1 <+69>:    mov    $0x4006a4,%edi
   0x00000000004005b6 <+74>:    mov    $0x0,%eax
   0x00000000004005bb <+79>:    callq  0x400450 <printf@plt>
   0x00000000004005c0 <+84>:    mov    -0x8(%rbp),%rax
   0x00000000004005c4 <+88>:    movzbl (%rax),%eax
   0x00000000004005c7 <+91>:    movsbl %al,%eax
   0x00000000004005ca <+94>:    mov    %eax,%esi
   0x00000000004005cc <+96>:    mov    $0x4006a8,%edi
   0x00000000004005d1 <+101>:   mov    $0x0,%eax
   0x00000000004005d6 <+106>:   callq  0x400450 <printf@plt>
=> 0x00000000004005db <+111>:   mov    -0x10(%rbp),%rax
   0x00000000004005df <+115>:   movzbl (%rax),%eax
   0x00000000004005e2 <+118>:   movsbl %al,%eax
   0x00000000004005e5 <+121>:   mov    %eax,%esi
   0x00000000004005e7 <+123>:   mov    $0x4006ac,%edi
   0x00000000004005ec <+128>:   mov    $0x0,%eax
   0x00000000004005f1 <+133>:   callq  0x400450 <printf@plt>
   0x00000000004005f6 <+138>:   mov    $0x0,%eax
   0x00000000004005fb <+143>:   leaveq 
   0x00000000004005fc <+144>:   retq


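That's main, and the arrow (=>) is where we are. The call instruction at 0x00000000004005f1 is the call to the fourth printf, and as you can see, calling it requires some setup: all those mov instructions before it. Since they set up the call, and we're interested in what gets passed to printf, we need to let them run, so we need to advance the program right up to that call instruction. We can do that with another breakpoint: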

(gdb) break *0x00000000004005f1
Breakpoint 2 at 0x4005f1: file prog1.c, line 14.
(gdb) continue
Continuing.

Breakpoint 2, 0x00000000004005f1 in main () at prog1.c:14
14      printf("%f\n",*aa);


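Now we're at the call instruction. Because I'm on an amd64 chip (an Intel Core i7; the architecture is sometimes also called x86-64) and I'm not running Windows, for me, functions are called by placing the arguments in certain registers, assigned from left to right. The format string is the first argument; the second argument is *aa, which, remember, we established is -61. We can dump the registers: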

(gdb) info all-registers
rax            0x0  0
rbx            0x0  0
rcx            0x2  2
rdx            0x7ffff7dd7820   140737351874592
rsi            0xffffffc3   4294967235
rdi            0x4006ac 4196012
rbp            0x7fffffffe220   0x7fffffffe220
rsp            0x7fffffffe1f8   0x7fffffffe1f8
r8             0x2  2
r9             0x7ffff7dd4640   140737351861824
r10            0x7fffffffe0d8   140737488347352
r11            0x246    582
r12            0x400480 4195456
r13            0x7fffffffe300   140737488347904

[ snip … ]

ymm0           {v8_float = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, v4_double = {0x0, 0x0, 0x0, 0x0}, v32_int8 = {0x0, 0x0, 0x0, 0x0, 0xff, 0x0, 0x0, 0x0, 
    0xff, 0x0, 0x0, 0x0, 0xff, 0x0 <repeats 19 times>}, v16_int16 = {0x0, 0x0, 0xff, 0x0, 0xff, 0x0, 0xff, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}, 
  v8_int32 = {0x0, 0xff, 0xff, 0xff, 0x0, 0x0, 0x0, 0x0}, v4_int64 = {0xff00000000, 0xff000000ff, 0x0, 0x0}, v2_int128 = {0x000000ff000000ff000000ff00000000, 
    0x00000000000000000000000000000000}}


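Since -61 is an integer, it ends up in an integer register, and here we can see it in rsi.² (It has been sign extended, which is why it is 0xffffffc3: -61 in 4 bytes rather than one.) %f, however, being for floating-point numbers, will most likely read a floating-point register, e.g. ymm0 on my machine. That happens to be zero. It need not be, since this is undefined behavior, but it is, and so we get zero.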

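¹ Other than for morbid curiosity, this isn't one of the things you should often care about.
² The only part I couldn't explain was why our integer ends up in rsi. I felt it should have gone in rdi. Like I said, morbid curiosity. (Edit: argh, curse my curiosity. It ends up in rsi because rsi is used for the second argument, and it is the second argument; rdi holds the first argument, the format string. Wikipedia labels the convention "right to left", but that applies only to things on the stack: registers are assigned left to right.)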
