piz*_*zet 11 haskell knuth-morris-pratt aho-corasick
I'm having trouble understanding this implementation of the Knuth-Morris-Pratt algorithm in Haskell.
http://twanvl.nl/blog/haskell/Knuth-Morris-Pratt-in-Haskell
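In particular, I don't understand how the automaton is constructed. I know it uses the "tying the knot" technique to build it, but it isn't clear to me, and I also don't see why it should have the right complexity.
The other thing I'd like to know is whether you think this implementation could easily be generalized to implement the Aho-Corasick algorithm.
Thanks for your answers!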
So here is the algorithm:

    makeTable :: Eq a => [a] -> KMP a
    makeTable xs = table
       where table = makeTable' xs (const table)

    makeTable' []     failure = KMP True failure
    makeTable' (x:xs) failure = KMP False test
       where  test  c = if c == x then success else failure c
              success = makeTable' xs (next (failure x))

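The snippet above uses a `KMP` type and a driver function that are not shown here. The following is a minimal self-contained sketch so the code can be loaded and tried. It assumes that `KMP` is a record with a Boolean "matched" flag (called `done` here) and the transition function `next` used above, and that `isSubstringOf2` simply feeds the text through the automaton one element at a time. Both are assumptions about the linked post, not quotations from it.

```haskell
-- Assumed shape of an automaton state: a flag that says whether the whole
-- pattern has been matched, plus the transition function for the next element.
data KMP a = KMP
  { done :: Bool
  , next :: a -> KMP a
  }

makeTable :: Eq a => [a] -> KMP a
makeTable xs = table
  where
    -- Tying the knot: the failure function of the start state refers
    -- back to the start state itself.
    table = makeTable' xs (const table)

makeTable' :: Eq a => [a] -> (a -> KMP a) -> KMP a
makeTable' []     failure = KMP True failure
makeTable' (x:xs) failure = KMP False test
  where
    -- On a match move forward, otherwise defer to the failure function.
    test c  = if c == x then success else failure c
    -- `success` does not depend on `c`, so it is a single shared thunk:
    -- each state is built at most once, and only when first needed.
    success = makeTable' xs (next (failure x))

-- Assumed driver: run the automaton over the text, stopping at the
-- first state that reports a complete match.
isSubstringOf2 :: Eq a => [a] -> [a] -> Bool
isSubstringOf2 pat = match (makeTable pat)
  where
    match table []     = done table
    match table (b:bs) = done table || match (next table b) bs
```

In GHCi, `isSubstringOf2 "shoeshop" "the shoeshoeshop"` evaluates to `True`, and `isSubstringOf2 "shoeshop" (take 100 (cycle "shoe"))` evaluates to `False`. Because the match stops as soon as `done` is reached, the text may even be infinite as long as it contains the pattern.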
Using it, let's look at the table built for "shoeshop":

    makeTable "shoeshop" = table0

    table0 = makeTable' "shoeshop" (const table0)
           = KMP False test0

    test0  c = if c == 's' then success1 else const table0 c
             = if c == 's' then success1 else table0

    success1 = makeTable' "hoeshop" (next (const table0 's'))
             = makeTable' "hoeshop" (next table0)
             = makeTable' "hoeshop" test0
             = KMP False test1

    test1  c = if c == 'h' then success2 else test0 c

    success2 = makeTable' "oeshop" (next (test0 'h'))
             = makeTable' "oeshop" (next table0)
             = makeTable' "oeshop" test0
             = KMP False test2

    test2  c = if c == 'o' then success3 else test0 c

    success3 = makeTable' "eshop" (next (test0 'o'))
             = makeTable' "eshop" (next table0)
             = makeTable' "eshop" test0
             = KMP False test3

    test3  c = if c == 'e' then success4 else test0 c

    success4 = makeTable' "shop" (next (test0 'e'))
             = makeTable' "shop" (next table0)
             = makeTable' "shop" test0
             = KMP False test4

    test4  c = if c == 's' then success5 else test0 c

    success5 = makeTable' "hop" (next (test0 's'))
             = makeTable' "hop" (next success1)
             = makeTable' "hop" test1
             = KMP False test5

    test5  c = if c == 'h' then success6 else test1 c

    success6 = makeTable' "op" (next (test1 'h'))
             = makeTable' "op" (next success2)
             = makeTable' "op" test2
             = KMP False test6

    test6  c = if c == 'o' then success7 else test2 c

    success7 = makeTable' "p" (next (test2 'o'))
             = makeTable' "p" (next success3)
             = makeTable' "p" test3
             = KMP False test7

    test7  c = if c == 'p' then success8 else test3 c

    success8 = makeTable' "" (next (test3 'p'))
             = makeTable' "" (next (test0 'p'))
             = makeTable' "" (next table0)
             = makeTable' "" test0
             = KMP True test0

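Note how success5 uses the consumed 's' to retrace the initial 's' of the pattern: its failure function is test1, the state reached after a single 's'.
Now walk through what happens when you evaluate isSubstringOf2 "shoeshop" $ cycle "shoe".
When test7 fails to match 'p', it falls back to test3 to try to match 'e', so we cycle through success4, success5, success6 and success7 ad infinitum.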
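The cycle described above can be made visible with a small instrumented variant. This sketch adds a hypothetical `depth` field to each state, recording how many pattern elements that state has matched, and a hypothetical helper `stateTrace` that lists the depth reached after each input element. Neither name comes from the original code, and the `done` field is again an assumption.

```haskell
-- Instrumented automaton state. `depth` is a hypothetical extra field:
-- the number of pattern elements matched on reaching this state.
data KMP a = KMP
  { depth :: Int
  , done  :: Bool
  , next  :: a -> KMP a
  }

makeTable :: Eq a => [a] -> KMP a
makeTable xs = table
  where
    table = makeTable' 0 xs (const table)

-- Same construction as before, threading the depth through.
makeTable' :: Eq a => Int -> [a] -> (a -> KMP a) -> KMP a
makeTable' n []     failure = KMP n True failure
makeTable' n (x:xs) failure = KMP n False test
  where
    test c  = if c == x then success else failure c
    success = makeTable' (n + 1) xs (next (failure x))

-- Hypothetical helper: the depth of the state reached after each
-- input element (the start state itself is dropped).
stateTrace :: Eq a => [a] -> [a] -> [Int]
stateTrace pat = map depth . drop 1 . scanl next (makeTable pat)
```

In GHCi, `stateTrace "shoeshop" (take 12 (cycle "shoe"))` gives `[1,2,3,4,5,6,7,4,5,6,7,4]`: after reaching depth 7, the mismatching 'e' is handled by test3 and lands on depth 4, and the loop 4, 5, 6, 7 repeats. No input element is ever re-read, which is the behaviour the failure functions are there to provide.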