What is the best way to convert String to ByteString?

Tho*_*ing 31 string haskell bytestring

What is the best way to convert a String to a ByteString in Haskell?

My gut reaction to the problem is

import qualified Data.ByteString as B
import Data.Char (ord)

packStr = B.pack . map (fromIntegral . ord)

but this doesn't seem satisfactory.

Pea*_*ker 26

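Data.ByteString.UTF8.fromString is also useful. The Char8 version loses the Unicode-ness (each character is truncated to a single byte), whereas the UTF8 version produces a UTF-8 encoded ByteString. You have to choose one or the other.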


the*_*utz 16

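Here is my cheat sheet for Haskell String/Text/ByteString strict/lazy conversions, assuming the desired encoding is UTF-8. The Data.Text.Encoding library has other encodings available.

Please make sure NOT to write (using OverloadedStrings):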

lazyByteString :: BL.ByteString
lazyByteString = "lazyByteString ä ß" -- BAD!

This will get encoded in an unexpected way. Try

lazyByteString = BLU.fromString "lazyByteString ä ß" -- good

instead.

String literals of type "Text" work fine with respect to encoding.

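To make the difference concrete, here is a minimal sketch that prints the bytes produced by each kind of literal. It uses only the `bytestring` and `text` packages; the names `truncated` and `encoded` are made up for this illustration, and the behaviour shown for the ByteString literal is that of the `IsString` instance as currently documented (it truncates each character to 8 bits), which could change in future versions.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- ByteString literal: every Char is cut down to its low 8 bits,
-- so the result is not valid UTF-8.
truncated :: BL.ByteString
truncated = "ä ß"

-- Text literal: code points are kept intact, and the UTF-8
-- encoding happens as an explicit, separate step.
encoded :: BL.ByteString
encoded = TLE.encodeUtf8 ("ä ß" :: TL.Text)

main :: IO ()
main = do
  print (BL.unpack truncated)  -- [228,32,223]
  print (BL.unpack encoded)    -- [195,164,32,195,159]
```

The first list holds the raw code points of ä (U+00E4) and ß (U+00DF) as single bytes; the second holds their two-byte UTF-8 sequences.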
Cheat sheet:

import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString as BS
import qualified Data.Text as TS
import qualified Data.Text.Lazy as TL
import qualified Data.ByteString.Lazy.UTF8 as BLU -- from utf8-string
import qualified Data.ByteString.UTF8 as BSU      -- from utf8-string
import qualified Data.Text.Encoding as TSE
import qualified Data.Text.Lazy.Encoding as TLE

-- String <-> ByteString

BLU.toString   :: BL.ByteString -> String
BLU.fromString :: String -> BL.ByteString
BSU.toString   :: BS.ByteString -> String
BSU.fromString :: String -> BS.ByteString

-- String <-> Text

TL.unpack :: TL.Text -> String
TL.pack   :: String -> TL.Text
TS.unpack :: TS.Text -> String
TS.pack   :: String -> TS.Text

-- ByteString <-> Text

TLE.encodeUtf8 :: TL.Text -> BL.ByteString
TLE.decodeUtf8 :: BL.ByteString -> TL.Text
TSE.encodeUtf8 :: TS.Text -> BS.ByteString
TSE.decodeUtf8 :: BS.ByteString -> TS.Text

-- Lazy <-> Strict

BL.fromStrict :: BS.ByteString -> BL.ByteString
BL.toStrict   :: BL.ByteString -> BS.ByteString
TL.fromStrict :: TS.Text -> TL.Text
TL.toStrict   :: TL.Text -> TS.Text

Please +1 Peaker's answer, because he deals with encodings correctly.
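As a rough sketch of how the cheat-sheet entries compose, the following chains them into a String to lazy ByteString round trip without needing utf8-string. The helper names `toLazyUtf8` and `fromLazyUtf8` are hypothetical, chosen only for this example. Note that `decodeUtf8` throws an exception on input that is not valid UTF-8, so this assumes the bytes are well-formed.

```haskell
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as TS
import qualified Data.Text.Encoding as TSE

-- String -> strict Text -> strict ByteString (UTF-8) -> lazy ByteString
toLazyUtf8 :: String -> BL.ByteString
toLazyUtf8 = BL.fromStrict . TSE.encodeUtf8 . TS.pack

-- The same path in reverse; fails at runtime on invalid UTF-8.
fromLazyUtf8 :: BL.ByteString -> String
fromLazyUtf8 = TS.unpack . TSE.decodeUtf8 . BL.toStrict

main :: IO ()
main = do
  let s = "lazyByteString ä ß"
  print (fromLazyUtf8 (toLazyUtf8 s) == s)  -- True
  print (BL.length (toLazyUtf8 s))          -- 20 (ä and ß take two bytes each)
```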

  • It took me a while to figure out that `Data.ByteString.UTF8` lives in the [`utf8-string`](https://hackage.haskell.org/package/utf8-string) package. (4 upvotes)

rob*_*obx 12

A safe approach involves encoding the Unicode string:

import qualified Data.ByteString as B
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf8)

packStr'' :: String -> B.ByteString
packStr'' = encodeUtf8 . T.pack

Regarding the other answers: Data.ByteString.Char8.pack is effectively identical to the version in the question, and is unlikely to be what you want:

import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as C
import qualified Data.Text as T
import Data.Text.Encoding (encodeUtf8)
import Data.Char (ord)

packStr, packStr', packStr'' :: String -> B.ByteString
packStr   = B.pack . map (fromIntegral . ord)
packStr'  = C.pack
packStr'' = encodeUtf8 . T.pack

*Main> packStr "hellö♥"
"hell\246e"
*Main> packStr' "hellö♥"
"hell\246e"
*Main> packStr'' "hellö♥"
"hell\195\182\226\153\165"

Data.ByteString.UTF8.fromString is fine too, but it requires the utf8-string package, whereas Data.Text.Encoding comes with the Haskell Platform.

  • Codec.Binary.UTF8.String can also be used. (2 upvotes)