IndentParser example

Nic*_*lev 8 parsing yaml haskell

Could someone post a small example of using IndentParser? I want to parse YAML-like input such as the following:

fruits:
    apples: yummy
    watermelons: not so yummy

vegetables:
    carrots: are orange
    celery raw: good for the jaw

I know there is a YAML package. I want to learn how to use IndentParser.

ste*_*ley 2

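I've sketched out a parser below. For your problem you probably only need the block parser from IndentParser. Note that I haven't tried to run it, so it may well contain elementary errors.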

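The biggest problem for your parser isn't really the indentation; it's that you only have strings and a colon as tokens. You may find the code below needs quite a bit of debugging, because it has to be very careful not to consume too much input, although I have tried to left-factor it carefully. Because you only have two tokens, you won't get much benefit from Parsec's Token module.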
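To make the left-factoring point concrete, here is a minimal sketch in plain Parsec (`Text.Parsec` from parsec 3, not IndentParser). The `Entry` type and the names `key`, `entryNaive` and `entry` are made up for this illustration. Parsec's `<|>` only tries its second alternative if the first one fails without consuming input, so two alternatives that share a prefix need either `try` or a shared prefix that is parsed once.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical type for this illustration only.
data Entry = Leaf String String   -- "key: value"
           | Header String        -- "key:" with nothing after it
  deriving (Show, Eq)

-- The shared prefix: letters and spaces, ended by a colon.
key :: Parser String
key = many1 (letter <|> char ' ') <* char ':'

-- Not left-factored: both alternatives start with 'key'.
-- On "fruits:" the first alternative consumes the key and then fails,
-- so the second alternative is never tried.
entryNaive :: Parser Entry
entryNaive = leaf <|> header
  where
    leaf   = Leaf <$> key <*> (char ' ' *> many1 (noneOf "\n"))
    header = Header <$> key

-- Left-factored: parse the key once, then choose between continuations
-- that fail without consuming input.
entry :: Parser Entry
entry = do
  k <- key
  (Leaf k <$> (char ' ' *> many1 (noneOf "\n"))) <|> pure (Header k)
```

With these definitions, `parse entryNaive "" "fruits:"` is a parse error, while `parse entry "" "fruits:"` succeeds with `Header "fruits"`. The `definition` combinator below is left-factored in the same way.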

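Note that there is a strange truth about parsing: simple-looking formats are often not simple to parse. For learning purposes, writing a parser for simple expressions will teach you far more than a more or less arbitrary text format (which may only leave you frustrated).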

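-- Assumed module names (from the parsec and IndentParser packages).
import Text.ParserCombinators.Parsec
import Text.ParserCombinators.Parsec.IndentParser

data DefinitionTree = Nested String [DefinitionTree]
                    | Def String String
  deriving (Show)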


-- Note - this might need some testing.
--
-- This is a tricky one, the parser has to parse trailing 
-- spaces and tabs but not a new line.
--
category :: IndentCharParser st String
category = do 
    { a <- body 
    ; rest 
    ; return a
    } 
  where
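    -- Parsec has no manyTill1. Neither letter nor space matches ':',
    -- so many1 followed by char ':' does the same job.
    -- Note that 'space' also accepts tabs and newlines.
    body = do { s <- many1 (letter <|> space); char ':'; return s }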
    rest = many (oneOf [' ', '\t'])

-- Because the DefinitionTree data type has two quite 
-- different constructors, both sharing the same prefix
-- 'category' this combinator is a bit more complicated
-- than usual, and has to use an Either type to discriminate
-- between the options. 
-- 
definition :: IndentCharParser st DefinitionTree
definition = do 
    { a <- category
    ; b <- (textL <|> definitionsR)
    ; case b of
        Left ss -> return (Def a ss)
        Right ds -> return (Nested a ds)
    }

-- Note this should parse a string *provided* it is on 
-- the same line as the category.
--
-- However you might find this assumption needs verifying...
--
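-- Fails without consuming input if the next character is a newline,
-- so (<|>) in 'definition' can still try 'definitionsR'.
textL :: IndentCharParser st (Either String a)
textL = do 
    { ss <- many1 (noneOf "\n")
    ; newline
    ; return (Left ss)
    }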

-- Finally this one uses an indent parser.
--
definitionsR :: IndentCharParser st (Either a [DefinitionTree]) 
definitionsR = block body 
  where 
    body = do { a <- many1 definition; return (Right a) }
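For comparison, and as something to check the outline against, here is a self-contained sketch that parses the sample input with plain Parsec (`Text.Parsec` from parsec 3) and counts the indentation by hand. It does not use IndentParser at all. The helper names `blankLines`, `lineEnd`, `definitionAt` and `document` are made up for this sketch, and it assumes that indentation is spaces only and that every key starts with a letter.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

data DefinitionTree = Nested String [DefinitionTree]
                    | Def String String
  deriving (Show, Eq)

-- Skip empty or whitespace-only lines.
blankLines :: Parser ()
blankLines = skipMany (try (many (oneOf " \t") *> newline))

-- End of a line: a newline or the end of the input.
lineEnd :: Parser ()
lineEnd = (() <$ newline) <|> eof

-- A key such as "celery raw": letters and spaces up to the colon,
-- followed by optional spaces or tabs on the same line.
key :: Parser String
key = many1 (letter <|> char ' ') <* char ':' <* many (oneOf " \t")

-- One entry indented by exactly n spaces.
definitionAt :: Int -> Parser DefinitionTree
definitionAt n = do
  -- 'try' so that a line at a different indentation is left untouched.
  _ <- try (blankLines *> count n (char ' ') <* lookAhead letter)
  k <- key
  leaf k <|> nested k
  where
    -- Text on the same line as the key. Fails without consuming input
    -- when the line ends right after the key, so 'nested' can be tried.
    leaf k = Def k <$> many1 (noneOf "\n") <* lineEnd
    -- Otherwise, look ahead at the indentation of the next non-blank line
    -- and parse the children at that depth if it is deeper than n.
    nested k = do
      lineEnd
      m <- lookAhead (blankLines *> (length <$> many (char ' ')))
      kids <- if m > n then many1 (definitionAt m) else pure []
      pure (Nested k kids)

-- A whole document: top-level entries start in column 0.
document :: Parser [DefinitionTree]
document = many (definitionAt 0) <* blankLines <* eof
```

Running `parse document "" input` on the sample from the question should give two `Nested` nodes, "fruits" and "vegetables", each holding two `Def` entries. That is the result the IndentParser outline above is aiming for.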