How can manually allocated data be garbage-collected in Haskell?

imz*_*hev 6 garbage-collection haskell memory-management ffi

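I am thinking about an FFI that calls some C functions from Haskell.

If a memory buffer holds some data, is allocated "manually", and is then used in Haskell computations, can I somehow rely on the garbage collector to free it once it is no longer needed?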

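As for manual allocation, there are basically two ways (though the difference does not seem essential to my question):

  • allocating a buffer in Haskell and then passing it to a C function, as in fdRead
  • allocating a buffer in C (with malloc, as GNU's asprintf does) and then returning a pointer to Haskell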

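In both examples (fdRead and asprintf) there is an additional problem: the data type stored in the buffer is not suitable for a Haskell program, so the data is copied and converted for use in Haskell (with peekCString). (I'll put the code below.) After the copy and conversion, the buffer is freed (in both cases).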

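However, I am thinking about a more efficient interface in which Haskell would use the data directly, exactly as the C function stored it (without conversion). (I have not yet explored alternative implementations of String and related functions to see whether any of them can work directly with some kind of C string.)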

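If I follow this route, there is one global problem: how to control the disposal of the allocated buffers. (For functions that have no side effects apart from the allocation, I could even wrap the calls in unsafePerformIO or declare them so that they are not in IO.)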

Examples of conversion and immediate freeing

Allocating in Haskell:

fdRead (here allocaBytes takes care of the freeing):

-- -----------------------------------------------------------------------------
-- fd{Read,Write}

-- | Read data from an 'Fd' and convert it to a 'String' using the locale encoding.
-- Throws an exception if this is an invalid descriptor, or EOF has been
-- reached.
fdRead :: Fd
       -> ByteCount -- ^How many bytes to read
       -> IO (String, ByteCount) -- ^The bytes read, how many bytes were read.
fdRead _fd 0 = return ("", 0)
fdRead fd nbytes = do
    allocaBytes (fromIntegral nbytes) $ \ buf -> do
    rc <- fdReadBuf fd buf nbytes
    case rc of
      0 -> ioError (ioeSetErrorString (mkIOError EOF "fdRead" Nothing Nothing) "EOF")
      n -> do
       s <- peekCStringLen (castPtr buf, fromIntegral n)
       return (s, n)

-- | Read data from an 'Fd' into memory.  This is exactly equivalent
-- to the POSIX @read@ function.
fdReadBuf :: Fd
          -> Ptr Word8 -- ^ Memory in which to put the data
          -> ByteCount -- ^ Maximum number of bytes to read
          -> IO ByteCount -- ^ Number of bytes read (zero for EOF)
fdReadBuf _fd _buf 0 = return 0
fdReadBuf fd buf nbytes =
  fmap fromIntegral $
    throwErrnoIfMinus1Retry "fdReadBuf" $
      c_safe_read (fromIntegral fd) (castPtr buf) nbytes

foreign import ccall safe "read"
   c_safe_read :: CInt -> Ptr CChar -> CSize -> IO CSsize

Allocating in C

getValue.c:

#define _GNU_SOURCE
#include <stdio.h>

#include "getValue.h"

char * getValue(int key) {
  char * value;
  asprintf(&value, "%d", key); // TODO: No error handling!
  // If memory allocation wasn't possible, or some other error occurs,  these  functions  will
  // return -1, and the contents of strp is undefined.
  return value;
}

GetValue.hs (here I call free explicitly once the conversion is actually done):

{-# LANGUAGE ForeignFunctionInterface #-}

import Foreign hiding (unsafePerformIO)
import Foreign.Ptr
import Foreign.C.Types

import Foreign.C.String(peekCString)

import System.IO.Unsafe

getValue :: Int -> IO String
getValue key = do
  valptr <- c_safe_getValue (fromIntegral key)
  value <- peekCString valptr
  c_safe_free valptr
  return value

foreign import ccall safe "getValue.h getValue" c_safe_getValue :: CInt -> IO (Ptr CChar)
foreign import ccall safe "stdlib.h free" c_safe_free :: Ptr a -> IO ()

value :: Int -> String
value = unsafePerformIO . getValue -- getValue has no side-effects, so we wrap it.

{- A simple test: -}
main1 = putStrLn (value 2)

{- A test with an infinite list, which employs laziness: -}
keys = [-5..]
results = map value keys

main = foldr (>>) 
             (return ())
             (map putStrLn (take 20 results))

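Without the (inefficient) conversion and copying step, I would need to rely on the garbage collector for the freeing, but I have no idea how to define such things in Haskell.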

chi*_*chi 2

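The ForeignPtr type acts as a Ptr with an attached finalizer. When the ForeignPtr is garbage-collected, the finalizer runs, and it can call into the C side to free the pointer with the appropriate function.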
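As a minimal sketch of this, assuming a POSIX system: strdup stands in for the question's getValue (both return a malloc'ed string), and ownCString is a hypothetical helper name, not a library function. The finalizer used is finalizerFree from Foreign.Marshal.Alloc, which is only appropriate when the buffer really was allocated with malloc.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.String (CString, peekCString, withCString)
import Foreign.C.Types (CChar)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree)
import Foreign.Ptr (nullPtr)

-- Assumption: strdup is used here as a stand-in for the question's getValue.
-- Like asprintf, it hands back memory obtained from malloc.
foreign import ccall unsafe "string.h strdup"
  c_strdup :: CString -> IO CString

-- | Hypothetical helper: take ownership of a malloc'ed C string.
-- The attached finalizer calls C free once the ForeignPtr is unreachable.
ownCString :: CString -> IO (ForeignPtr CChar)
ownCString p
  | p == nullPtr = ioError (userError "ownCString: NULL pointer")
  | otherwise    = newForeignPtr finalizerFree p

main :: IO ()
main = do
  -- The temporary buffer made by withCString is released on return;
  -- the copy made by strdup is now owned by the ForeignPtr.
  fp <- withCString "42" c_strdup >>= ownCString
  -- The raw pointer is only valid inside withForeignPtr.
  s <- withForeignPtr fp peekCString
  putStrLn s
```

The peekCString call in main copies the data only so that it can be printed. Code that wants to avoid the conversion can read the bytes in place (for example with peek from Foreign.Storable) inside withForeignPtr. Finalizers are not guaranteed to run promptly, so this suits memory better than scarce resources.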

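Once the ForeignPtr has been collected, the pointer can no longer be reached from Haskell, so that is usually the best moment to free it.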
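For the other allocation route in the question (a buffer allocated on the Haskell side and filled by C), a sketch along the same lines could use mallocForeignPtrBytes, which returns a ForeignPtr whose memory is reclaimed together with it, so no explicit finalizer is needed. Here memset is only an assumed stand-in for a C function that fills a buffer, and filledBuffer and byteAt are hypothetical helper names.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Data.Word (Word8)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff)

-- Assumption: memset stands in for any C function that writes into a
-- caller-supplied buffer (as read does in fdReadBuf).
foreign import ccall unsafe "string.h memset"
  c_memset :: Ptr Word8 -> CInt -> CSize -> IO (Ptr Word8)

-- | Hypothetical helper: allocate n bytes and let C fill them.
-- The buffer lives as long as the returned ForeignPtr is reachable.
filledBuffer :: Int -> Word8 -> IO (ForeignPtr Word8)
filledBuffer n byte = do
  fp <- mallocForeignPtrBytes n
  _ <- withForeignPtr fp $ \p ->
         c_memset p (fromIntegral byte) (fromIntegral n)
  return fp

-- | Read one byte in place, without copying the buffer.
-- No bounds check: the caller must keep i within the allocated size.
byteAt :: ForeignPtr Word8 -> Int -> IO Word8
byteAt fp i = withForeignPtr fp (\p -> peekElemOff p i)

main :: IO ()
main = do
  fp <- filledBuffer 16 0x41
  b <- byteAt fp 3
  print b
```

Compared with allocaBytes, the buffer here is not tied to a lexical scope: it can be returned from the function and is released only after the last reference to the ForeignPtr is gone.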