GHC: segmentation fault under strange conditions

lew*_*urm 7 linux haskell ffi ghc segmentation-fault

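I upgraded my development machine from Ubuntu 10.04 LTS to Ubuntu 12.04 LTS (and thus from ghc 6.12.1 to ghc 7.4.1), and I ran into some very strange behaviour in my current project.

After a few hours, I reduced it to the following code: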

{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Data.Word
import Text.Printf
import Foreign

foreign import ccall "dynamic"
   code_void :: FunPtr (IO ()) -> (IO ())

main :: IO ()
main = do
  entryPtr <- (mallocBytes 2)
  poke entryPtr (0xc390 :: Word16) -- nop (0x90); ret(0xc3) (little endian order)

  _ <- printf "entry point: 0x%08x\n" ((fromIntegral $ ptrToIntPtr entryPtr) :: Int)
  _ <- getLine -- for debugging
  code_void $ castPtrToFunPtr entryPtr
  putStrLn "welcome back"
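To make the comment on the `poke` line concrete, here is a minimal sketch of how a 16-bit value is laid out in memory on a little-endian machine, which is why `0xc390` ends up as `nop` followed by `ret`. The helper name `littleEndianBytes` is made up for illustration and is not part of the code above.

```haskell
import Data.Bits (shiftR, (.&.))
import Data.Word (Word16, Word8)

-- | Bytes of a 16-bit value in the order a little-endian CPU stores them:
-- least significant byte at the lowest address. Hypothetical helper.
littleEndianBytes :: Word16 -> [Word8]
littleEndianBytes w =
  [ fromIntegral (w .&. 0xff)    -- low byte, stored first
  , fromIntegral (w `shiftR` 8)  -- high byte, stored second
  ]
```

Evaluating `littleEndianBytes 0xc390` yields `[0x90, 0xc3]`, so the CPU sees `nop` (0x90) at the entry point and `ret` (0xc3) right after it.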

I am trying to generate some code at runtime, jump to it, and then return. When run from the Makefile, everything works fine:

$ make 
ghc --make -Wall -O2 Main.hs -o stackoverflow_segv
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking stackoverflow_segv ...
./stackoverflow_segv
entry point: 0x098d77e0

welcome back

However, if I invoke the binary directly from the shell:

$ ./stackoverflow_segv 
entry point: 0x092547e0

Segmentation fault (core dumped)

This behaviour is reproducible (fortunately?).

Using gdb, objdump and /proc, I figured out the following:

$ gdb -q stackoverflow_segv
Reading symbols from /home/lewurm/stackoverflow/stackoverflow_segv...(no debugging symbols found)...done.
(gdb) run
Starting program: /home/lewurm/stackoverflow/stackoverflow_segv
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
entry point: 0x080fc810

Before pressing Enter, I switch to a second terminal:

$ cat /proc/`pgrep stackoverflow`/maps
[...]
08048000-080ea000 r-xp 00000000 08:01 2492678    /home/lewurm/stackoverflow/stackoverflow_segv
080ea000-080eb000 r--p 000a2000 08:01 2492678    /home/lewurm/stackoverflow/stackoverflow_segv
080eb000-080f1000 rw-p 000a3000 08:01 2492678    /home/lewurm/stackoverflow/stackoverflow_segv
080f1000-08115000 rw-p 00000000 00:00 0          [heap]
[...]

And back again:

<enter>
Program received signal SIGSEGV, Segmentation fault.
0x0804ce3c in s2aV_info ()

Ugh. Let's see what this code does:

$ objdump -D stackoverflow_segv | grep -C 3 804ce3c
 804ce31:       89 44 24 4c             mov    %eax,0x4c(%esp)
 804ce35:       83 ec 0c                sub    $0xc,%esp
 804ce38:       8b 44 24 4c             mov    0x4c(%esp),%eax
 804ce3c:       ff d0                   call   *%eax
 804ce3e:       83 c4 0c                add    $0xc,%esp
 804ce41:       83 ec 08                sub    $0x8,%esp
 804ce44:       8b 44 24 54             mov    0x54(%esp),%eax

Hmm, jumping to *%eax. And what is in %eax?

 (gdb) info reg eax
 eax            0x80fc810        135251984

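Well, it is actually just the code buffer. Looking at /proc/*/maps tells us that this page is not executable (rw-p, right?). But the same is true when the binary is run from within make.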
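The manual check against /proc/*/maps can be sketched in Haskell as follows. This is only an illustration under assumptions: the names `Mapping`, `parseMapping`, `covers` and `isExecutable` are hypothetical, and only the address range and the permission field of each line are looked at.

```haskell
import Numeric (readHex)

-- | One line of /proc/<pid>/maps, reduced to the fields used here.
data Mapping = Mapping
  { mapStart :: Integer  -- inclusive start address
  , mapEnd   :: Integer  -- exclusive end address
  , mapPerms :: String   -- e.g. "rw-p" or "r-xp"
  } deriving (Show, Eq)

-- | Parse a line such as
-- "080f1000-08115000 rw-p 00000000 00:00 0          [heap]".
-- Lines that do not have this shape (e.g. "[...]") give Nothing.
parseMapping :: String -> Maybe Mapping
parseMapping line =
  case words line of
    (range : perms : _) ->
      case break (== '-') range of
        (lo, '-' : hi) ->
          case (readHex lo, readHex hi) of
            ([(s, "")], [(e, "")]) -> Just (Mapping s e perms)
            _                      -> Nothing
        _ -> Nothing
    _ -> Nothing

-- | Does the mapping contain the given address?
covers :: Integer -> Mapping -> Bool
covers addr m = mapStart m <= addr && addr < mapEnd m

-- | Is the execute bit set in the permission field?
isExecutable :: Mapping -> Bool
isExecutable m = 'x' `elem` mapPerms m
```

Applied to the `[heap]` line above, the mapping covers the address in %eax (0x080fc810) and `isExecutable` is `False`, which matches the observation. A running process could feed `lines` of `readFile "/proc/self/maps"` through `parseMapping` to perform the same check on itself.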

What is going wrong here?

By the way, the code is also available as a gist.

Edit: ghc bug report

lew*_*urm 1

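A temporary workaround is to call mprotect(2) on the memory region and mark it explicitly as executable. mprotect(2) requires a page-aligned memory block, so memalign(3) is needed to allocate it.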

{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Data.Word
import Text.Printf
import Foreign
import Foreign.C.Types

foreign import ccall "dynamic"
   code_void :: FunPtr (IO ()) -> (IO ())

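-- int mprotect(void *addr, size_t len, int prot);
-- Pointer-sized types keep this correct on 64-bit targets as well.
foreign import ccall "static sys/mman.h"
  mprotect :: Ptr a -> CSize -> CInt -> IO CInt

-- void *memalign(size_t alignment, size_t size); declared in malloc.h on glibc
foreign import ccall "static malloc.h"
  memalign :: CSize -> CSize -> IO (Ptr a)


main :: IO ()
main = do
  -- allocate 2 bytes at a page-aligned (0x1000) address
  entryPtr <- memalign 0x1000 0x2
  poke entryPtr (0xc390 :: Word16) -- nop (0x90); ret(0xc3) (little endian order)
  let i_entry = (fromIntegral $ ptrToIntPtr entryPtr) :: Int
  -- 0x7 = PROT_{READ,WRITE,EXEC}
  rc <- mprotect entryPtr 2 0x7
  -- mprotect returns 0 on success and -1 on failure
  if rc /= 0 then error "mprotect failed" else return ()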

  _ <- printf "entry point: 0x%08x\n" i_entry
  _ <- getLine -- for debugging
  code_void $ castPtrToFunPtr entryPtr
  putStrLn "welcome back"
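The alignment requirement is the reason for `memalign` here. As a rough sketch of the underlying arithmetic, the following computes the page-aligned range that covers an arbitrary buffer, which is what would have to be passed to mprotect if the buffer came from a plain allocator instead. The names `pageSize`, `pageAlignDown` and `pageRange` are hypothetical, and the page size of 0x1000 bytes is an assumption taken from the `memalign` call above; a real program should query it rather than hard-code it.

```haskell
import Data.Bits (complement, (.&.))

-- | Assumed page size (0x1000 bytes, as in the memalign call above).
pageSize :: Integer
pageSize = 0x1000

-- | Round an address down to the start of the page that contains it.
pageAlignDown :: Integer -> Integer
pageAlignDown addr = addr .&. complement (pageSize - 1)

-- | Page-aligned start address and length of the smallest run of whole
-- pages covering the byte range [addr, addr + len).
pageRange :: Integer -> Integer -> (Integer, Integer)
pageRange addr len = (start, end - start)
  where
    start = pageAlignDown addr
    end   = pageAlignDown (addr + len + pageSize - 1)  -- round the end up
```

For the unaligned buffer address from the gdb session, `pageRange 0x080fc810 2` gives `(0x080fc000, 0x1000)`. Note that changing the protection of that range would affect everything else on the same page, which is one reason a dedicated aligned allocation is the simpler choice.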