Parsing text with optional data at the end

och*_*les 6 parsing haskell parsec

Please note: after posting this question I managed to work out a solution myself. See the end of this question for my final answer.


I'm working on a small parser for org-mode documents. In these documents a heading has a name and can optionally end with a list of tags:

* Heading          :foo:bar:baz:

However, I'm having trouble writing a parser for this. Here is what I'm working with at the moment:

import Control.Applicative
import Text.ParserCombinators.Parsec

data Node = Node String [String]
            deriving (Show)

myTest = parse node "" "Some text here :tags:here:"

node = Node <$> (many1 anyChar) <*> tags

tags = (char ':') >> (sepEndBy1 (many1 alphaNum) (char ':'))
   <?> "Tag list"

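While my simple tags parser works on its own, it doesn't work in the context of node, because all of the characters are used up parsing the name of the heading (many1 anyChar). Furthermore, I can't change this parser to use noneOf ":" because : is valid in the name. In fact, : is only special when it is part of the tag list at the very end of the line.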
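
A minimal sketch of the failure described above, assuming Parsec 3 and a recent GHC. The names greedyName, tagList and greedyNode are made up for this illustration; the parser bodies are the ones from the code above.

```haskell
import Text.ParserCombinators.Parsec

-- Heading parser from the question: anyChar never refuses a character,
-- so many1 anyChar runs all the way to the end of the input.
greedyName :: Parser String
greedyName = many1 anyChar

-- Tag list parser from the question.
tagList :: Parser [String]
tagList = char ':' >> sepEndBy1 (many1 alphaNum) (char ':')

-- Name followed by tags, sequenced the same way as in the question.
greedyNode :: Parser (String, [String])
greedyNode = do
  n  <- greedyName
  ts <- tagList
  return (n, ts)

main :: IO ()
main = do
  -- The tag parser on its own accepts a bare tag list.
  print (parse tagList "" ":tags:here:")
  -- The combined parser fails: greedyName has already consumed the tags,
  -- so tagList is run at the end of the input.
  print (parse greedyNode "" "Some text here :tags:here:")
```

The first call should print Right ["tags","here"]; the second returns a Left, because Parsec does not backtrack into many1 anyChar to give characters back.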

Any ideas on how I can parse this optional data?

As an aside, this is my first real Haskell project, so if Parsec is not even the right tool for the job, feel free to point that out and suggest other options!


OK, I now have a complete solution, but it needs refactoring. The following works:

import Control.Applicative hiding (many, optional, (<|>))
import Control.Monad
import Data.Char (isSpace)
import Text.ParserCombinators.Parsec

data Node = Node { level :: Int, keyword :: Maybe String, heading :: String, tags :: Maybe [String] }
   deriving (Show)

parseNode = Node <$> level <*> (optionMaybe keyword) <*> name <*> (optionMaybe tags)
    where level = length <$> many1 (char '*') <* space
          keyword = (try (many1 upper <* space))
          name = noneOf "\n" `manyTill` (eof <|> (lookAhead (try (tags *> eof))))
          tags = char ':' *> many1 alphaNum `sepEndBy1` char ':'

myTest = parse parseNode "org-mode" "** Some : text here :tags: JUST KIDDING     :tags:here:"
myTest2 = parse parseNode "org-mode" "* TODO Just a node"
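
One way the refactoring could go, as a sketch only and assuming Parsec 3 with GHC 7.10 or later (so the Applicative operators come from the Prelude). The names Heading, hLevel, hKeyword, hTitle, hTags and the *P parsers are hypothetical. Trimming trailing whitespace from the title is an extra step that the code above does not perform.

```haskell
import Data.Char (isSpace)
import Text.ParserCombinators.Parsec

data Heading = Heading
  { hLevel   :: Int
  , hKeyword :: Maybe String
  , hTitle   :: String
  , hTags    :: Maybe [String]
  } deriving (Show, Eq)

-- One or more '*' followed by a single whitespace character.
levelP :: Parser Int
levelP = length <$> many1 (char '*') <* space

-- An all-uppercase word such as TODO. 'try' backtracks when the word is
-- not followed by whitespace (for example "Some").
keywordP :: Parser String
keywordP = try (many1 upper <* space)

-- A colon-delimited tag list such as ":foo:bar:".
tagsP :: Parser [String]
tagsP = char ':' *> many1 alphaNum `sepEndBy1` char ':'

-- Characters up to the end of input, or up to a tag list that is itself
-- followed by the end of input. 'lookAhead' leaves that tag list unconsumed
-- so that headingP can parse it afterwards.
titleP :: Parser String
titleP = trimEnd <$> (noneOf "\n" `manyTill` stop)
  where
    stop    = eof <|> lookAhead (try (tagsP *> eof))
    trimEnd = reverse . dropWhile isSpace . reverse

headingP :: Parser Heading
headingP =
  Heading <$> levelP <*> optionMaybe keywordP <*> titleP <*> optionMaybe tagsP
```

With these definitions, parse headingP "org-mode" "* TODO Just a node" should yield a Heading with level 1, keyword Just "TODO", title "Just a node" and no tags. Giving each parser its own type signature also makes it possible to test the pieces separately.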

Bil*_*ill 1

import Control.Applicative hiding (many, optional, (<|>))
import Control.Monad
import Text.ParserCombinators.Parsec

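-- Only needed on old Parsec 2.x, which lacks this instance. Parsec 3 already
-- defines it, so drop these three lines there.
instance Applicative (GenParser s a) where
  pure = return
  (<*>) = ap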

data Node = Node { name :: String, tags :: Maybe [String] }
  deriving (Show)

parseNode = Node <$> name <*> tags
  where tags = optionMaybe $ optional (string " :") *> many (noneOf ":\n") `sepEndBy` (char ':')
        name = noneOf "\n" `manyTill` try (string " :" <|> string "\n")

myTest = parse parseNode "" "Some:text here :tags:here:"
myTest2 = parse parseNode "" "Sometext here :tags:here:"

Results:

*Main> myTest
Right (Node {name = "Some:text here", tags = Just ["tags","here",""]})
*Main> myTest2
Right (Node {name = "Sometext here", tags = Just ["tags","here",""]})
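
Both results end with an empty tag "". This is because many (noneOf ":\n") also succeeds on the empty remainder after the final colon. A possible variant, sketched here under the assumption of Parsec 3 (where the Applicative instance is already provided), uses many1 so that empty tags are rejected. The helper names nameP and tagsP are made up for this sketch.

```haskell
import Text.ParserCombinators.Parsec

data Node = Node { name :: String, tags :: Maybe [String] }
  deriving (Show, Eq)

parseNode :: Parser Node
parseNode = Node <$> nameP <*> tagsP
  where
    -- many1 instead of many: the empty string after the last ':' is no
    -- longer accepted as a tag.
    tagsP = optionMaybe (many1 (noneOf ":\n") `sepEndBy1` char ':')
    -- As in the answer: the name runs up to " :" or a newline, and that
    -- terminator is consumed.
    nameP = noneOf "\n" `manyTill` try (string " :" <|> string "\n")
```

On "Some:text here :tags:here:" this should give the name "Some:text here" and the tags Just ["tags","here"]. Like the answer it is based on, it still requires the line to end in either a tag list or a newline.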