Searching for a pattern with Parsec

Question

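I'm not sure whether this is possible (or advisable), but I'm essentially trying to use Parsec to search a file for a sequence of characters. Example file: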

START (name)

junk
morejunk=junk;
dontcare
    foo ()
    bar

care_about this (stuff in here i dont care about);

don't care about this
or this
foo = bar;

also_care
about_this
(dont care whats in here);
and_this too(only the names
   at the front
   do i care about
);

foobar
may hit something = perhaps maybe (like this);
foobar

END

Here is my attempt at getting it to work:

careAbout :: Parser (String, String)
careAbout = do
    name1 <- many1 (noneOf " \n\r")
    skipMany space
    name2 <- many1 (noneOf " (\r\n")
    skipMany space
    skipMany1 parens
    skipMany space
    char ';'
    return (name1, name2)

parens :: Parser ()
parens = do
    char '('
    many (parens <|> skipMany1 (noneOf "()"))
    char ')'
    return ()

parseFile = do
    manyTill (do
        try careAbout <|>
        anyChar >> return ("", "")) (try $ string "END")

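I'm trying to force the search by looking for careAbout and, if that doesn't work, eating one character and trying again. I could parse all the junk in between (I know what it can be), but I don't care what it is (so why bother parsing it), and it can be complicated.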

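The problem is that my solution doesn't quite work. anyChar ends up consuming everything, so the check for END never gets a chance. Also, somewhere inside careAbout we hit eof, and some exception is thrown because of it.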

This may be entirely the wrong approach. I'd like to know a way to do this or, even better, the right way.

Answer 1

Rom*_*aka 2

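If it weren't for the parens parser, this would be a great fit for a regular-language parser such as regex-applicative. That is because regular-language parsers are "smarter" about "backtracking" (in fact there is no backtracking at all; every possible branch is explored).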
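By contrast, Parsec only backtracks where `try` asks it to. The following is a minimal sketch of the scan from the question written that way, not a definitive fix. It keeps the nested `parens` parser, wraps each step's result in `Maybe`, and treats either the literal `END` or end of input as the terminator. The name `scanFile` and the choice to also stop at `eof` are assumptions made for illustration.

```haskell
import Data.Maybe (catMaybes)
import Text.Parsec
import Text.Parsec.String (Parser)

-- Two names, then one or more balanced parenthesis groups, then ';'.
careAbout :: Parser (String, String)
careAbout = do
    name1 <- many1 (noneOf " \n\r")
    skipMany space
    name2 <- many1 (noneOf " (\r\n")
    skipMany space
    skipMany1 parens
    skipMany space
    _ <- char ';'
    return (name1, name2)

-- A balanced group; the contents are skipped, nested groups included.
parens :: Parser ()
parens = do
    _ <- char '('
    skipMany (parens <|> skipMany1 (noneOf "()"))
    _ <- char ')'
    return ()

-- At each position: stop on END (or end of input, an assumption),
-- otherwise try careAbout, otherwise drop one character and move on.
scanFile :: Parser [(String, String)]
scanFile = catMaybes <$> manyTill step stop
  where
    step = (Just <$> try careAbout) <|> (Nothing <$ anyChar)
    stop = (() <$ try (string "END")) <|> eof
```

Running `parse scanFile "" contents` on the example file from the question should return the three wanted pairs, and also `("perhaps", "maybe")` from the "may hit something" line, since that line has the same shape. Because `careAbout` is wrapped in `try`, running out of input inside it is just a failed alternative rather than an error.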

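However, as you probably know, matching parentheses is not a regular language. If you can relax the grammar so that it becomes regular, give regex-applicative a try.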
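As a rough sketch of what such a relaxed grammar might look like with regex-applicative: the version below assumes parenthesis groups are never nested, which makes the pattern regular. The names `careAboutRE` and `findAll` are hypothetical, and the code relies on `findFirstInfix` returning the text before the match, the result, and the remaining input.

```haskell
import Control.Applicative (many, some)
import Data.Char (isSpace)
import Text.Regex.Applicative

-- Relaxed pattern: name, whitespace, name, flat "(...)" groups, ';'.
-- Assumption: no parentheses inside a group.
careAboutRE :: RE Char (String, String)
careAboutRE =
    (,) <$> name "" <* some ws
        <*> name "(" <* many ws
        <*  some group <* many ws <* sym ';'
  where
    ws = psym isSpace
    -- A run of non-space characters, excluding any listed in 'extra'.
    name extra = some (psym (\c -> not (isSpace c) && c `notElem` extra))
    group = sym '(' *> many (psym (`notElem` "()")) <* sym ')' <* many ws

-- Collect every match by searching again in the text after each hit.
findAll :: String -> [(String, String)]
findAll s = case findFirstInfix careAboutRE s of
    Nothing           -> []
    Just (_, p, rest) -> p : findAll rest
```

Unlike the Parsec version in the question, this scans the whole input rather than stopping at `END`, and a statement with nested parentheses is simply not matched.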
