Decompressing GZip in Haskell

Question

I'm having a hard time figuring this out. Here's what I'm trying:

ghci> :m +System.FileArchive.GZip  -- From the "MissingH" package
ghci> fmap decompress $ readFile "test.html.gz"
*** Exception: test.html.gz: hGetContents: invalid argument (invalid byte sequence)

Why am I getting that exception?

I also tried Codec.Compression.GZip.decompress from the zlib package, but I can't get it to produce a String instead of a ByteString.

Answer 1

ham*_*mar 8

The conversion from ByteString to String depends on the character encoding of the compressed file, but assuming it's ASCII or Latin-1, this should work:

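import Codec.Compression.GZip (decompress)  -- from the zlib package
import qualified Data.ByteString.Lazy as LBS
import Data.ByteString.Lazy.Char8 (unpack)

-- Read the file as raw bytes (no text decoding), decompress,
-- then map each byte to the Char with the same code point.
readGZipFile :: FilePath -> IO String
readGZipFile path = fmap (unpack . decompress) $ LBS.readFile path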

If you need to work with some other encoding, such as UTF-8, replace unpack with an appropriate decoding function, e.g. Data.ByteString.Lazy.UTF8.toString.
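
As a rough sketch of that substitution, assuming the file holds UTF-8 text and that the utf8-string package is installed alongside zlib (the name readGZipFileUtf8 is made up for illustration):

```haskell
import qualified Codec.Compression.GZip as GZip     -- zlib package
import qualified Data.ByteString.Lazy as LBS
import qualified Data.ByteString.Lazy.UTF8 as UTF8  -- utf8-string package

-- Hypothetical helper: read a gzip-compressed file as raw bytes,
-- decompress it, and decode the result as UTF-8.
-- Assumption: the uncompressed content really is UTF-8 text.
readGZipFileUtf8 :: FilePath -> IO String
readGZipFileUtf8 path = fmap (UTF8.toString . GZip.decompress) (LBS.readFile path)
```

In GHCi, something like `readGZipFileUtf8 "test.html.gz" >>= putStr` should then print the decoded page, provided the terminal's encoding can display the characters.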

Of course, if the file you're decompressing isn't a text file, it's better to keep it as a ByteString.

And if it is text, decompress and then decode to Text (2 upvotes)
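
One possible reading of that comment, sketched with the text package. It assumes the content is UTF-8; the name readGZipText is illustrative, and lenient decoding is a choice made here, not something the comment specifies:

```haskell
import qualified Codec.Compression.GZip as GZip       -- zlib package
import qualified Data.ByteString.Lazy as LBS
import qualified Data.Text.Lazy as TL                 -- text package
import qualified Data.Text.Lazy.Encoding as TLE
import Data.Text.Encoding.Error (lenientDecode)

-- Hypothetical helper: decompress a gzip file and decode it to lazy Text.
-- Assumption: the content is UTF-8. lenientDecode replaces invalid byte
-- sequences with U+FFFD instead of throwing an exception.
readGZipText :: FilePath -> IO TL.Text
readGZipText path =
  fmap (TLE.decodeUtf8With lenientDecode . GZip.decompress) (LBS.readFile path)
```

If malformed input should be an error, not silently patched, swapping lenientDecode for strictDecode from the same module is one way to get that behaviour.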

Asked:	13 years, 10 months ago
Viewed:	791 times
Last activity:	12 years, 1 month ago