我无法理解下面使用Aho-Corasick alg进行字符串模式匹配的算法.
Procedure AC(y,n,q0)
INPUT: y<-array of m bytes representing the text input
(SQL Query Statement)
n<-integer representing the text length
(SQL Query Length)
q0<-initial state (first character in pattern)
2: State <-q0
3: For i = 1 to n do
4: While g ( State, y[i] = = fail) do
5: State ? f (State)
6: End While
7: State ? g(State,.y[i])
8: If o(State) then
9: Output i
10: Else
11: Output
12: End If
13: End for …Run Code Online (Sandbox Code Playgroud) 在一次采访中,我被要求提出可以容纳数百万个模式的数据结构,并通过它们快速搜索以找到最长的匹配模式.
例如,模式如下:
1- 8876 8893 87 | true
2- 8876 889 | false
3- 8876 8 | false
4- 887 | true
Run Code Online (Sandbox Code Playgroud)
输入是一个至少有2位且最多18位的数字,我们需要从数据结构中找到最长的匹配模式,并在结尾处提取布尔值.
例如,8876 8893 9943 53将匹配1并true返回.8876 8397 5430 74将匹配3并false返回.
我的回答是使用一棵树并key value在每个级别都有一个对列表.作为数字和值的键是null或等于boolean,具体取决于它是否是模式的结尾.喜欢:
# matching 8875
# start the search by first digit
[..., (7, null), (8, null), (9, null)]
^
[..., (7, null), (8, null), (9, null)]
^
[..., (7, true), (8, null), ...]
# at …Run Code Online (Sandbox Code Playgroud)