Why doesn't print force the entire lazy IO value?

sev*_*evo 2 haskell conduit lazy-io haskell-pipes http-conduit

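I'm following the http-client tutorial to fetch a response body over a TLS connection. Since I can observe that print is being called by withResponse, why doesn't print force the entire response to be output in the following fragment?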

withResponse request manager $ \response -> do
    putStrLn $ "The status code was: " ++
               show (statusCode $ responseStatus response)
    body <- (responseBody response)
    print body

Instead, I need to write this:

response <- httpLbs request manager

putStrLn $ "The status code was: " ++
           show (statusCode $ responseStatus response)
print $ responseBody response

The body I want to print is a lazy ByteString. I'm still not sure whether I should expect print to print the entire value.

instance Show ByteString where
    showsPrec p ps r = showsPrec p (unpackChars ps) r

Mic*_*ael 5

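This has nothing to do with laziness. It comes from the difference between the Response L.ByteString you get with the Simple module (or from httpLbs, as in your second snippet) and the Response BodyReader that withResponse hands you with the TLS module.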

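You noticed that a BodyReader is an IO ByteString. More specifically, it is an action that can be repeated, and each time it is run it yields the next chunk of bytes. It follows the protocol of never delivering an empty bytestring unless it is at end of file. (BodyReader might well have been called ChunkGetter.) bip below is like what you wrote: after extracting the BodyReader/IO ByteString from the Response, it runs it once to get the first chunk and prints that. It doesn't repeat the action to get more, so in this case we only see the first couple of chapters of Genesis. What you need is a loop that exhausts the chunks, as in bop below, which makes the whole King James Bible spill into the console.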

{-# LANGUAGE OverloadedStrings #-} 
import Network.HTTP.Client
import Network.HTTP.Client.TLS
import qualified Data.ByteString.Char8 as B

main = bip
-- main = bop

bip = do 
  manager <- newManager tlsManagerSettings
  request <- parseRequest "https://raw.githubusercontent.com/michaelt/kjv/master/kjv.txt"
  withResponse request manager $ \response -> do
      putStrLn "The status code was: "  
      print (responseStatus response)
      chunk  <- responseBody response
      B.putStrLn chunk

bop = do 
  manager <- newManager tlsManagerSettings
  request <- parseRequest "https://raw.githubusercontent.com/michaelt/kjv/master/kjv.txt"
  withResponse request manager $ \response -> do
      putStrLn "The status code was: " 
      print (responseStatus response)
      let loop = do 
            chunk <- responseBody response
            if B.null chunk 
              then return () 
              else B.putStr chunk  >> loop 
      loop

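The loop keeps going back for more chunks until it gets an empty string, which represents eof, so in the terminal it prints all the way to the end of Revelation.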
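The chunk protocol can be tried out without a network connection. What follows is only a minimal sketch: mkFakeReader and drain are hypothetical names that are not part of http-client, and the fake reader merely imitates the behavior described above (one chunk per call, then empty bytestrings once the data is exhausted).

```haskell
import qualified Data.ByteString.Char8 as B
import Data.IORef (newIORef, atomicModifyIORef')

-- Hypothetical stand-in for a BodyReader (which is just an IO ByteString).
-- Each run of the returned action yields the next chunk; once the chunks
-- are used up it yields the empty bytestring on every further call.
mkFakeReader :: [B.ByteString] -> IO (IO B.ByteString)
mkFakeReader chunks = do
  -- Drop empty chunks up front so that "empty" can only mean end of input.
  ref <- newIORef (filter (not . B.null) chunks)
  pure $ atomicModifyIORef' ref $ \cs -> case cs of
    []         -> ([], B.empty)
    (c : rest) -> (rest, c)

-- The same recursion as the loop in bop, but collecting the chunks
-- instead of printing them.
drain :: IO B.ByteString -> IO [B.ByteString]
drain reader = do
  chunk <- reader
  if B.null chunk
    then pure []
    else (chunk :) <$> drain reader

main :: IO ()
main = do
  reader <- mkFakeReader
    (map B.pack ["In the beginning ", "God created ", "the heaven and the earth."])
  -- Running the action once, as bip does, gives only the first chunk.
  first <- reader
  B.putStrLn first
  -- Repeating it until the empty chunk arrives gives everything else.
  rest <- drain reader
  B.putStrLn (B.concat rest)
```

With a real response you would not need to write drain yourself: http-client exports brConsume, which strictly collects all the remaining chunks of a BodyReader into a list.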

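This behavior is simple enough, but a bit technical. You can only work with a BodyReader through hand-written recursion. The purpose of the http-client library, though, was to make things like http-conduit possible. There the result of withResponse has the type Response (ConduitM i ByteString m ()). ConduitM i ByteString m () is the conduit type for a stream of bytes; this byte stream would contain the whole file.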

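In the original form of the http-client/http-conduit material, the Response contained a conduit like this. The BodyReader part was later factored out into http-client so that it could be used by different streaming libraries, such as pipes.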

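So, to take a simple example, in the corresponding http material for the streaming and streaming-bytestring libraries, withHTTP gives you a response of type Response (ByteString IO ()). ByteString IO () is, as its name suggests, the type of a stream of bytes arising in IO; ByteString Identity () would be the equivalent of a lazy bytestring (effectively a pure list of chunks). In this case the ByteString IO () represents the whole byte stream, all the way down to Revelation. So with the imports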

 import qualified Data.ByteString.Streaming.HTTP as Bytes -- streaming-utils
 import qualified Data.ByteString.Streaming.Char8 as Bytes -- streaming-bytestring

the program is identical to a lazy bytestring program:

bap = do 
    manager <- newManager tlsManagerSettings
    request <- parseRequest "https://raw.githubusercontent.com/michaelt/kjv/master/kjv.txt"
    Bytes.withHTTP request manager $ \response -> do 
        putStrLn "The status code was: "
        print (responseStatus response)
        Bytes.putStrLn $ responseBody response

Indeed it is slightly simpler, since you don't have the step of "extracting the bytes from IO":

        lazy_bytes <- responseBody response
        Lazy.putStrLn lazy_bytes

but just write

        Bytes.putStrLn $ responseBody response

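You just "print" them more or less directly. If you only want to see a little from the middle of the KJV, you can do what you would do with a lazy bytestring and end with: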

        Bytes.putStrLn $ Bytes.take 1000 $ Bytes.drop 50000 $ responseBody response

Then you will see something about Abraham.
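For comparison, the lazy bytestring version of that take/drop pipeline looks like this. It is only a sketch using the ordinary bytestring package; window is a hypothetical helper name, and the short input string stands in for the downloaded text.

```haskell
import qualified Data.ByteString.Lazy.Char8 as L

-- Hypothetical helper: skip `offset` bytes, then keep the next `n` bytes.
-- L.take and L.drop expect Int64 arguments, hence the conversions.
window :: Int -> Int -> L.ByteString -> L.ByteString
window offset n = L.take (fromIntegral n) . L.drop (fromIntegral offset)

main :: IO ()
main =
  -- A tiny stand-in for the document; prints "56789abcde".
  L.putStrLn $ window 5 10 (L.pack "0123456789abcdefghij")
```

The streaming version has the same shape, except that the bytes it skips and keeps are produced by IO as the stream is consumed rather than taken from a pure value.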

The withHTTP for streaming-bytestring just hides the recursive looping that we needed to use the BodyReader material from http-client directly. It's the same e.g. with the withHTTP you find in pipes-http, which represents a stream of bytestring chunks as Producer ByteString IO (), and the same with http-conduit. In all of these cases, once you have your hands on the byte stream you handle it in the ways typical of the streaming IO framework without handwritten recursion. All of them use the BodyReader from http-client to do this, and this was the main purpose of the library.