有人可以向我解释一下这些 Iterator、Yield monad 类型和函数的含义吗?就像我是 5 一样?

Clu*_*ess 9 haskell

下面是代码。我在一定程度上理解了applicative、functor、traversable和monad。迭代器、yield 类型以及yield 函数是我最难理解的。例如,Iterator 中的 ior 和 Yield 中的 iora 是什么?traverseY 到底做了什么以及签名的含义是什么?而 Monad Cont 在这里的应用方式也让我很困惑。感谢您的阅读,如有任何意见,我们将不胜感激。

import Control.Monad.Cont
import Control.Monad.Random
import System.Random.Shuffle


data Iterator i o r =
    Result r
  | Susp o (i -> Iterator i o r)
--------------------------------------------------------------------------------------------
newtype Yield i o r a = Yield { unY :: Cont (Iterator i o r) a }

instance Functor (Yield i o r) where
  fmap f (Yield m) = Yield (fmap f m)

instance Applicative (Yield i o r) where
  pure a = Yield (pure a)
  (Yield mf) <*> (Yield ma) = Yield (mf <*> ma)

instance Monad (Yield i o r) where
  return a = Yield (return a)
  (Yield m) >>= k = Yield (m >>= \a -> unY (k a))

instance MonadCont (Yield i o r) where
  callCC c = Yield (callCC (\k -> unY (c (\a -> Yield (k a)))))

runYield :: Yield i o r r -> Iterator i o r
runYield (Yield m) = runCont m Result

yield :: o -> Yield i o r i
yield o = callCC (\k -> Yield (cont (\_ -> Susp o (\i -> runYield (k i)))))

-------------------------------------------------------------------------------------------
data Tree a = Empty | Node a (Tree a) (Tree a)

traverseY :: Tree a -> Yield (Tree a) (a,Tree a,Tree a) r ()
traverseY Empty = return ()
traverseY (Node a t1 t2) = do t <- yield (a, t1, t2); traverseY t
Run Code Online (Sandbox Code Playgroud)

HTN*_*TNW 10

An表示重复获取 type 值并输出 type 值Iterator i o r的过程,最终通过返回 a来中断其中一个 s 之后的迭代。例如ioir

accum :: Iterator Int Int Int
accum = go 0
  where go n | n >= 100 = Result n
             | otherwise = Susp n (\i -> go (n + i))
-- potential usage
main :: IO ()
main = go accum -- iterator returns modified copies instead of mutating, so prepare to replace the iterator through iteration/recursion
  where go (Result r) = putStrLn $ "Final total: " ++ show r -- check whether iterator has values/extract values by pattern matching; a finished iterator can return extra data of type r if it likes
        go (Susp o i) = do
          putStrLn $ "Running total: " ++ show o
          putStr $ "Add: "
          n <- readLn
          go (i n) -- bidirectional communication! get "incremented" iterator by feeding in an input value (you could write no-input iterators by setting i = ())
Run Code Online (Sandbox Code Playgroud)

是一个跟踪某个累加器的迭代器n。它将每个输入添加到累加器,并每次输出新的累加器。一旦累加器达到 100,它就会停止接受输入并返回最终值。请注意,由于一切都是不可变的,为了保持状态,每次更改状态时都Iterator必须返回自身的新版本。无论谁依次“遍历”,accum都必须使用返回值Iterator而不是accum它本身。在Python中,你可以写成accum

def accum(): # calling accum() instead creates a new object that mutates itself through iteration
  sum = 0
  while sum < 100: sum = sum + (yield sum)
  return sum

# same usage
def main():
  gen = accum()
  o = next(gen)
  try: # I was hoping the Python version would help understanding... but I think I overestimated how pythonic Python would be...
    while True:
      print(f"Running total: {o}")
      o = gen.send(int(input("Add: ")))
  except StopIteration as e:
    print(f"Final total: {e.value}")
Run Code Online (Sandbox Code Playgroud)

Yield i o r r被用作 的“构建器” Iterator i o r。您可以直接编写 的Yield接口的模拟Iterator

instance Functor (Iterator i o) where
  fmap f (Result x) = Result (f x)
  fmap f (Susp o i) = Susp o (fmap f . i)
instance Applicative (Iterator i o) where
  pure = Result
  liftA2 = liftM2
instance Monad (Iterator i o) where
  Result r >>= f = f r
  Susp o i >>= f = Susp o ((>>= f) . i)

-- yieldI x is the iterator that sends x to the caller and receives a value in return
-- in Python: def yieldI(x): return yield x
yieldI :: o -> Iterator i o i
yieldI x = Susp x Result
Run Code Online (Sandbox Code Playgroud)

例如,DFS 示例中的生成器是

data Tree a = Empty | Node a (Tree a) (Tree a)
dfsI :: Tree a -> Iterator b a (Tree b) -- yield elements of the tree (o = a) *and also* receive new elements to replace them (i = b), building a new tree (r = Tree b)
-- dfs = traverse yieldI
dfsI Empty = Result Empty
dfsI (Node l m r) = do
  l' <- yieldI l
  m' <- dfsI m
  r' <- dfsI r
  return (Node l' m' r')
Run Code Online (Sandbox Code Playgroud)

这里直接使用的问题Iterator i o r是效率低下。(请记住,Haskell 是通过延迟计算来理解以下内容的。)如果您“连接”许多迭代器,例如((x >>= f) >>= g) >>= h,那么当您尝试计算它时就会遇到麻烦。说x评估为Susp o i. 然后,首先评估较大的表达式,执行三个函数调用,进入>>=s,计算结果xSusp o i,然后创建新的挂起函数调用以生成Susp o (\a -> ((i a >>= f) >>= g) >>= h)。当您迭代此迭代器时(即提取 lambda 并使用一些参数调用它),每次迭代都必须遍历所有>>=挂在Iterator. 哎呀。(也许用更熟悉的术语来说,我们已经将迭代器串联实现为另一个“包装器”,当包装器上的包装器上的包装器上的包装器上有包装器时,这会变得很糟糕......)

使用Cont是避免这种情况的“标准技巧”。我们的想法是,x我们不直接处理迭代器,而是处理它的绑定函数(包装在ContYieldnewtypes 中)x' = \f -> x >>= f。请注意,将一元计算转换为其绑定函数是可逆的,因为x' return = x >>= return = x.

newtype Yield i o r a = Yield { unYield :: (a -> Iterator i o r) -> Iterator i o r }
instance Functor (Yield i o r) where
  fmap f (Yield r) = Yield (\k -> r $ k . f)
instance Applicative (Yield i o r) where
  pure x = Yield (\k -> k x)
  liftA2 = liftM2
instance Monad (Yield i o r) where
  Yield r >>= f = Yield (\k -> r $ \x -> unYield (f x) k)
-- newtype Yield i o r a = Yield { unYield :: (a -> Iterator i o r) -> Iterator i o r }
-- compare         (>>=) :: Iterator i o a -> (a -> Iterator i o r) -> Iterator i o r
-- giving
liftIY :: Iterator i o a -> Yield i o r a
liftIY x = Yield (x >>=)
-- and the law x >>= return = x inspires
runYield :: Yield i o r r -> Iterator i o r
runYield (Yield r) = r return
-- e.g.
yield :: o -> Yield i o r i
-- yield = liftIY . yieldI
yield x = Yield (\k -> Susp x (\i -> k i)) -- note this is the same as yours after you drill through newtype baggage
Run Code Online (Sandbox Code Playgroud)

((x >>= f) >>= g) >>= husingYield将使类似的术语出现在运行时,而不是having (((x' >>= liftIY . f) >>= liftIY . g) >>= liftIY . h) return。其计算结果为x' >>= _someFunction,因此之前的所有包装器都已折叠为一个,希望能提高效率。(当您迭代时,这个折叠的包装器将“替换自身”,依次执行fgh指定的行为。这是在Yield's中编码的>>=。)

-- magical free efficiency by replacing Iterator -> Yield, yieldI -> yield
dfs :: Tree a -> Yield b a r (Tree b)
dfs = traverse yield
-- when using Yields as builders, you will treat Yield i o _ r like Iterator i o r
-- the middle parameter should be polymorphic for builders like dfs
-- while the final consumer (particularly the "standard consumer" runYield) fixes it to something (runYield sets it to the final return type)
-- (this behavior is a loosely typed reflection of the Codensity monad)
Run Code Online (Sandbox Code Playgroud)

在最终的使用现场,Yields必须先被做成Iterators才可以迭代。你的dfsDirect

  1. 用于dfs = traverse yieldI构建一个Yield不会引起“意外二次左关联性”(技术术语:))的愤怒
  2. 使用Iterator构建一个。该迭代器遍历并生成/替换其元素。YieldrunYieldtr
  3. 通过 迭代该迭代器loop,其中...
  4. dfs通过线“对话” loop (Susp o i) = loop (i [o]):当它接收到它时,o它会发送[o]回迭代器,迭代器将其放入Tree它正在构建的结构中。
  5. 耗尽迭代器后,接收一个新的Tree,其中每个元素都被替换为单例列表 ( loop (Result r) = _)。
  6. 通过实例将所有列表连接在一起Foldable Tree

这是一种非常愚蠢的做事方式,因为订单不是来自,dfs而是来自Foldable Tree上一步中使用的实例。只是Iterator用作美化fmap功能。如果Foldable实例不同(例如BFS,甚至只是中序与预序),但您保留dfs为预序DFS(因此它将不再被写入traverse yield),dfsDirect则不会按照定义的实际顺序输出dfs!您可以编写一个函数来正确地将 an 转换Iterator为列表。

-- note the usage of () as an input type for "plain" iterators
-- since we cannot know what to pass in otherwise 
iToList :: Iterator () o r -> ([o], r)
iToList (Result r) = ([], r)
iToList (Susp o i) = (o : os, r)
  where ~(os, r) = iToList (i ())
Run Code Online (Sandbox Code Playgroud)

traverseY也有点奇怪。如果它接收到 a Node(作为初始值或作为迭代器输入),它会返回 的字段Node,否则Empty返回。它实际上并不“遍历”其输入;而是“遍历”其输入。您只需将该树作为输入发送即可将其发送到一个全新的树中。我认为这个想法是,当您迭代它时,您会发回Tree它之前返回的其中一个,因此它会迭代到叶子的路径。IMO 写起来会更好

data Direction = L | R
path :: Tree a -> Yield Direction a r () -- outputs root and goes in direction as told until it runs out of tree
path Empty = pure ()
path (Node x l r) = do
  d <- yield x
  case d of
    L -> path l
    R -> path r
-- potential use
elemBST :: Ord a => a -> Tree a -> Bool
elemBST x xs = go (runYield $ path xs)
  where go (Result ()) = False -- iteration ended without success
        go (Susp y i) = case compare x y of
          LT -> go (i L) -- go left
          EQ -> True     -- end
          GT -> go (i R) -- go right
Run Code Online (Sandbox Code Playgroud)