use*_*838 5 java algorithm parsing cyk
我正在阅读有关CYK算法的内容,并且有一部分我无法理解的伪代码.整个伪代码是:
let the input be a string S consisting of n characters: a1 ... an.
let the grammar contain r nonterminal symbols R1 ... Rr.
This grammar contains the subset Rs which is the set of start symbols.
let P[n,n,r] be an array of booleans. Initialize all elements of P to false.
for each i = 1 to n
for each unit production Rj -> ai
set P[i,1,j] = true
for each i = 2 to n -- Length of span
for each j = 1 to n-i+1 -- Start of span
for each k = 1 to i-1 -- Partition of span
for each production RA -> RB RC
if P[j,k,B] and P[j+k,i-k,C] then set P[j,i,A] = true
if any of P[1,n,x] is true (x is iterated over the set s, where s are all the indices for Rs) then
S is member of language
else
S is not member of language
Run Code Online (Sandbox Code Playgroud)
这些部分让我很困惑:
for each production RA -> RB RC
if P[j,k,B] and P[j+k,i-k,C] then set P[j,i,A] = true
Run Code Online (Sandbox Code Playgroud)
有人会给出一些关于这些伪代码的提示吗?
伪代码
对于每个产生式 R A → R B R C:
如果 P[j,k,B] 和 P[j+k,ik,C] 则设置 P[j,i,A] = true
应按如下方式解释。假设 P[j, k, B] 为真。这意味着从位置 j 开始的 k 个字符形成的字符串可以从非终结符 R B导出。如果 P[j + k, i - k, C] 也为 true,则可以从非终结符 R C 导出从位置 j + k 开始的 i - k 个字符形成的字符串。因此,由于 R A → R B R C是产生式,因此可以从 R A导出从位置 j 开始的 i 个字符组成的字符串。
我认为这可能有助于将该伪代码解释为
对于每个产生式 R A → R B R C:
如果 P[j,k,B] == true 且 P[j+k,ik,C] == true,则设置 P[j,i,A] = true
希望这可以帮助!