Sec*_*coe 4 parsing haskell parsec
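I am trying to do the following:
Parse text of the form:
some text#{0,0,0}some text#{0,0,0}#{0,0,0}more text#{0,0,0}
into a list of some data structure:
[Inside "some text", Outside (0,0,0), Inside "some text", Outside (0,0,0), Outside (0,0,0), Inside "more text", Outside (0,0,0)]
So the #{a,b,c} bits should become something different from the rest of the text.
I have this code: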
module ParsecTest where

import Text.ParserCombinators.Parsec

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show)

-- Plain text up to the next reference or the end of input.
text :: Parser Transc
text = do
  x <- manyTill anyChar (lookAhead reference <|> (eof >> return (Inside "")))
  return (Inside x)

transc :: Parser Transc
transc = reference <|> text

alot :: Parser [Transc]
alot = manyTill transc eof

-- A reference of the form #{a,b,c}; try makes a failed match consume no input.
reference :: Parser Transc
reference = try $ do
  char '#'
  char '{'
  a <- number
  char ','
  b <- number
  char ','
  c <- number
  char '}'
  return (Outside (a, b, c))

number :: Parser Int
number = do
  x <- many1 digit
  return (read x)
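This works as expected. You can test it in ghci by typing:
parseTest alot "Some Text#{0,0,0} some text#{0,0,0}#{0,0,0} more text#{0,0,0}"
But I don't think it is nice.
1) Is lookAhead really necessary for my problem?
2) Is the return (Inside "") an ugly hack?
3) Is there, in general, a more concise or smarter way to achieve the same thing?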
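1) I think you do need lookAhead, because you need the result of that parse. It would be nice to avoid running that parser twice by having a Parser (Transc, Maybe Transc) that yields an Inside with an optional following Outside. If performance is an issue, this is worth doing.
2) Yes.
3) Applicatives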
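number2 :: Parser Int
number2 = read <$> many1 digit

-- At least one character, then everything up to the next reference or eof.
text2 :: Parser Transc
text2 = (Inside .) . (:)
    <$> anyChar
    <*> manyTill anyChar (try (lookAhead reference2) *> pure () <|> eof)

reference2 :: Parser Transc
reference2 = ((Outside .) .) . (,,)
    <$> (string "#{" *> number2 <* char ',')
    <*> number2
    <*> (char ',' *> number2 <* char '}')

-- try lets a partial match (for example a lone '#') fall back to text2.
transc2 :: Parser Transc
transc2 = try reference2 <|> text2

alot2 :: Parser [Transc]
alot2 = many transc2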
You may want to rewrite the start of reference2 using a helper aux x y z = Outside (x,y,z).
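For illustration, here is roughly what that rewrite could look like. This is only a sketch: it repeats the type definitions so it loads on its own, and it assumes parsec 3 with GHC 7.10 or later, so that <$>, <*>, *> and <* come from the Prelude.

```haskell
import Text.ParserCombinators.Parsec

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show, Eq)

number2 :: Parser Int
number2 = read <$> many1 digit

-- Named helper standing in for the point-free ((Outside .) .) . (,,)
aux :: Int -> Int -> Int -> Transc
aux x y z = Outside (x, y, z)

-- Same parser as before, with the constructor plumbing spelled out by aux.
reference2 :: Parser Transc
reference2 = aux
    <$> (string "#{" *> number2 <* char ',')
    <*> number2
    <*> (char ',' *> number2 <* char '}')
```

In ghci, parseTest reference2 "#{1,2,3}" is a quick way to try it.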
Edit: Changed text to handle input that does not end with an Outside.
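As for point 1, the Parser (Transc, Maybe Transc) idea could be sketched along these lines. The names chunk and alot3 are made up for this example and are not part of the code above, and the sketch assumes parsec 3 with GHC 7.10 or later. It is one possible shape under those assumptions, not a tuned implementation.

```haskell
import Text.ParserCombinators.Parsec

type Reference = (Int, Int, Int)

data Transc = Inside String | Outside Reference
  deriving (Show, Eq)

number :: Parser Int
number = read <$> many1 digit

-- try makes a failed attempt consume no input.
reference :: Parser Transc
reference = try $ do
  _ <- string "#{"
  a <- number
  _ <- char ','
  b <- number
  _ <- char ','
  c <- number
  _ <- char '}'
  return (Outside (a, b, c))

-- Hypothetical helper: the text up to the next reference or the end of
-- input, paired with that reference if there was one. The reference is
-- consumed here, so a successful match is not parsed a second time.
chunk :: Parser (Transc, Maybe Transc)
chunk = go []
  where
    go acc =  (do r <- reference
                  return (Inside (reverse acc), Just r))
          <|> (do eof
                  return (Inside (reverse acc), Nothing))
          <|> (do c <- anyChar
                  go (c : acc))

-- Hypothetical driver: repeat chunk until it stops at the end of input,
-- dropping the empty Inside "" that appears between adjacent references.
alot3 :: Parser [Transc]
alot3 = do
  (inside, mref) <- chunk
  let keep = [inside | inside /= Inside ""]
  case mref of
    Nothing -> return keep
    Just r  -> ((keep ++ [r]) ++) <$> alot3
```

The recursion in alot3 is written by hand because chunk can succeed without consuming input at the end of the string, and Parsec's many rejects such parsers. Failed attempts at reference still happen at every character, so this only saves the second parse of each reference that actually matches.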