Longest common prefix (LCP) of a list of strings

bla*_*ing 19 list prolog prolog-cut prolog-dif logical-purity

lcs([ H|L1],[ H|L2],[H|Lcs]) :-
    !,
    lcs(L1,L2,Lcs).
lcs([H1|L1],[H2|L2],Lcs):-
    lcs(    L1 ,[H2|L2],Lcs1),
    lcs([H1|L1],    L2 ,Lcs2),
    longest(Lcs1,Lcs2,Lcs),
    !.
lcs(_,_,[]).

longest(L1,L2,Longest) :-
    length(L1,Length1),
    length(L2,Length2),
    (  Length1 > Length2
    -> Longest = L1
    ;  Longest = L2
    ).

This is my code so far. How can I optimize it so that it prints the prefix, e.g.:

["interview", "interrupt", "integrate", "intermediate"]

should return "inte"

I'm a bit rusty with Prolog, haven't done it in a while :)

fal*_*lse 12

First, let's start with something related, but much simpler.

:- set_prolog_flag(double_quotes, chars).  % "abc" = [a,b,c]

prefix_of(Prefix, List) :-
   append(Prefix, _, List).

commonprefix(Prefix, Lists) :-
   maplist(prefix_of(Prefix), Lists).

?- commonprefix(Prefix, ["interview", "integrate", "intermediate"]).
   Prefix = []
;  Prefix = "i"
;  Prefix = "in"
;  Prefix = "int"
;  Prefix = "inte"
;  false.

(See this answer for how to print character lists with double quotes.)

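This is the part that is fairly easy in Prolog. The only drawback is that it does not give us the maximum, but rather all possible solutions, including the maximum. Note that not all of the strings need to be known, for example: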

?- commonprefix(Prefix, ["interview", "integrate", Xs]).
   Prefix = []
;  Prefix = "i", Xs = [i|_A]
;  Prefix = "in", Xs = [i, n|_A]
;  Prefix = "int", Xs = [i, n, t|_A]
;  Prefix = "inte", Xs = [i, n, t, e|_A]
;  false.

So we get a partial description of the last, unknown word as the response. Now imagine that later on we realize that Xs = "induce". No problem for Prolog:

?- commonprefix(Prefix, ["interview", "integrate", Xs]), Xs = "induce".
   Prefix = [], Xs = "induce"
;  Prefix = "i", Xs = "induce"
;  Prefix = "in", Xs = "induce"
;  false.

In fact, it makes no difference whether we state this in hindsight or just before the actual query:

?- Xs = "induce", commonprefix(Prefix, ["interview", "integrate", Xs]).
   Xs = "induce", Prefix = []
;  Xs = "induce", Prefix = "i"
;  Xs = "induce", Prefix = "in"
;  false.

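Can we now formulate the maximum based on this? Note that this effectively requires some form of extra quantification, for which Prolog has no direct provision. For this reason we have to restrict ourselves to certain cases that we know are safe. The easiest way out is to insist that the list of words does not contain any variables. I will use iwhen/2 for this purpose.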

maxprefix(Prefix, Lists) :-
   iwhen(ground(Lists), maxprefix_g(Prefix, Lists)).

maxprefix_g(Prefix, Lists_g) :-
   setof(N-IPrefix, ( commonprefix(IPrefix, Lists_g), length(IPrefix, N ) ), Ns),
   append(_,[N-Prefix], Ns).   % the longest one

The downside of this approach is that we get an instantiation error whenever the list of words is not fully known.
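
iwhen/2 is not a built-in and is not defined in this answer. A minimal sketch, assuming that only the ground/1 condition used here has to be supported (this restricted version is an assumption for illustration, not the author's definition):

```prolog
:- meta_predicate iwhen(+, 0).

% Simplified sketch of iwhen/2, restricted to ground/1 conditions.
% If the term is already ground, run the goal. Otherwise raise an
% instantiation error instead of proceeding with incomplete information.
% Conditions other than ground/1 are not handled by this sketch
% (the call simply fails for them).
iwhen(ground(Term), Goal_0) :-
   (  ground(Term)
   -> call(Goal_0)
   ;  throw(error(instantiation_error, iwhen/2))
   ).
```

With the definition of maxprefix/2 above, a query such as ?- maxprefix(P, ["interview", Xs]). would then raise an instantiation error because Xs is unbound, while fully ground word lists are passed on to maxprefix_g/2.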

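Note that we made quite a few assumptions (which I hope really hold). In particular, we assumed that there is exactly one maximum. That holds in this case, but in general there could be several independent values for Prefix. We also assumed that IPrefix is always ground. We could check that as well, just to be sure. Alternatively: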

maxprefix_g(Prefix, Lists_g) :-
   setof(N, IPrefix^ ( commonprefix(IPrefix, Lists_g), length(IPrefix, N ) ), Ns),
   append(_,[N], Ns),
   length(Prefix, N),
   commonprefix(Prefix, Lists_g).

Here, the prefix does not have to be one single prefix (which it is in our situation).

Best of all, however, would be a purer version that does not need to resort to instantiation errors at all.


Fat*_*ize 7

Here is how I would implement this:

:- set_prolog_flag(double_quotes, chars).

:- use_module(library(reif)).   % provides if_/3, used in some_dif/2

longest_common_prefix([], []).
longest_common_prefix([H], H).
longest_common_prefix([H1,H2|T], P) :-
    maplist(append(P), L, [H1,H2|T]),
    (   one_empty_head(L)
    ;   maplist(head, L, Hs),
        not_all_equal(Hs)
    ).

one_empty_head([[]|_]).
one_empty_head([[_|_]|T]) :-
    one_empty_head(T).

head([H|_], H).

not_all_equal([E|Es]) :-
    some_dif(Es, E).

some_dif([X|Xs], E) :-
    if_(diffirst(X,E), true, some_dif(Xs,E)).

diffirst(X, Y, T) :-
    (   X == Y -> T = false
    ;   X \= Y -> T = true
    ;   T = true,  dif(X, Y)
    ;   T = false, X = Y
    ).

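The implementation of not_all_equal/1 is taken from an answer by @repeat (you can find my own implementation in the edit history).

We use append/3 and maplist/3 to split each string in the list into a prefix and a suffix, where the prefix is the same for all strings. For this prefix to be the longest one, we need to state that the first characters of at least two of the suffixes are different.

This is why we use head/2, one_empty_head/1 and not_all_equal/1. head/2 retrieves the first character of a string. one_empty_head/1 states that if one of the suffixes is empty, then this is automatically the longest prefix. not_all_equal/1 checks or states that at least two characters are different.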
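
To make the splitting step concrete, here is a small sketch that isolates it. The helper prefix_suffixes/3 is a hypothetical name introduced only for this illustration; it is not part of the answer's code.

```prolog
:- set_prolog_flag(double_quotes, chars).   % "abc" = [a,b,c]

% Hypothetical helper, for illustration only: strip one shared prefix
% from every word and collect the remaining suffixes.
prefix_suffixes(Prefix, Words, Suffixes) :-
   maplist(append(Prefix), Suffixes, Words).

% ?- prefix_suffixes("inte", ["interview","integrate","intermediate"], Ss).
%    Ss is bound to the char lists of "rview", "grate" and "rmediate".
%    Their heads are r, g, r: not all equal, so "inte" cannot be extended.
%
% ?- prefix_suffixes("int", ["interview","integrate","intermediate"], Ss).
%    Ss is bound to the char lists of "erview", "egrate" and "ermediate".
%    Every suffix starts with e, so "int" is a common prefix, but not the longest.
```

longest_common_prefix/2 performs the same split and then accepts only those prefixes for which a suffix is empty or the heads of the suffixes are not all equal.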

Examples

?- longest_common_prefix(["interview", "integrate", "intermediate"], Z).
Z = [i, n, t, e] ;
false.

?- longest_common_prefix(["interview", X, "intermediate"], "inte").
X = [i, n, t, e] ;
X = [i, n, t, e, _156|_158],
dif(_156, r) ;
false.

?- longest_common_prefix(["interview", "integrate", X], Z).
X = Z, Z = [] ;
X = [_246|_248],
Z = [],
dif(_246, i) ;
X = Z, Z = [i] ;
X = [i, _260|_262],
Z = [i],
dif(_260, n) ;
X = Z, Z = [i, n] ;
X = [i, n, _272|_274],
Z = [i, n],
dif(_272, t) ;
X = Z, Z = [i, n, t] ;
X = [i, n, t, _284|_286],
Z = [i, n, t],
dif(_284, e) ;
X = Z, Z = [i, n, t, e] ;
X = [i, n, t, e, _216|_224],
Z = [i, n, t, e] ;
false.

?- longest_common_prefix([X,Y], "abc").
X = [a, b, c],
Y = [a, b, c|_60] ;
X = [a, b, c, _84|_86],
Y = [a, b, c] ;
X = [a, b, c, _218|_220],
Y = [a, b, c, _242|_244],
dif(_218, _242) ;
false.

?- longest_common_prefix(L, "abc").
L = [[a, b, c]] ;
L = [[a, b, c], [a, b, c|_88]] ;
L = [[a, b, c, _112|_114], [a, b, c]] ;
L = [[a, b, c, _248|_250], [a, b, c, _278|_280]],
dif(_248, _278) ;
L = [[a, b, c], [a, b, c|_76], [a, b, c|_100]] ;
L = [[a, b, c, _130|_132], [a, b, c], [a, b, c|_100]];
…

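  • Up to `not_all_equal_/1` this is a highly Prologish approach! (3 upvotes)
  • Note that with the Prolog flag set as above, `[i,n,t,e] = "inte"`! So they are the same. See my answer for how to get "inte" printed as shown above! (2 upvotes)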

rep*_*eat 7

Here is a purified variant of the code proposed (and subsequently retracted) by @CapelliC:

:- set_prolog_flag(double_quotes, chars).

:- use_module(library(reif)).

lists_lcp([], []).
lists_lcp([Es|Ess], Ls) :-
   if_((maplist_t(list_first_rest_t, [Es|Ess], [X|Xs], Ess0),
        maplist_t(=(X), Xs))
       , (Ls = [X|Ls0], lists_lcp(Ess0, Ls0))
       , Ls = []).

list_first_rest_t([], _, _, false).
list_first_rest_t([X|Xs], X, Xs, true).

maplist_t/3 above is a variant of maplist/2 that works with reified term equality/inequality; maplist_t/5 is the same thing at higher arity:

maplist_t(P_2, Xs, T) :-
   i_maplist_t(Xs, P_2, T).

i_maplist_t([], _P_2, true).
i_maplist_t([X|Xs], P_2, T) :-
   if_(call(P_2, X), i_maplist_t(Xs, P_2, T), T = false).

maplist_t(P_4, Xs, Ys, Zs, T) :-
   i_maplist_t(Xs, Ys, Zs, P_4, T).

i_maplist_t([], [], [], _P_4, true).
i_maplist_t([X|Xs], [Y|Ys], [Z|Zs], P_4, T) :-
   if_(call(P_4, X, Y, Z), i_maplist_t(Xs, Ys, Zs, P_4, T), T = false).
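
For readers who do not have library(reif) installed, the predicates this code relies on can be sketched roughly as follows. This follows the commonly published definitions of if_/3, (=)/3 and (',')/3 (the last one is what allows a conjunction as the condition in lists_lcp/2). Treat it as an approximation rather than the library's exact source, and load it in place of library(reif), not alongside it.

```prolog
:- meta_predicate if_(1, 0, 0).
:- meta_predicate ','(1, 1, ?).

% if_(If_1, Then_0, Else_0): call If_1 with one extra argument T that
% reifies its truth value, then commit to a branch only once T is known.
if_(If_1, Then_0, Else_0) :-
   call(If_1, T),
   (  T == true  -> call(Then_0)
   ;  T == false -> call(Else_0)
   ;  nonvar(T)  -> throw(error(type_error(boolean, T), if_/3))
   ;  throw(error(instantiation_error, if_/3))
   ).

% =(X, Y, T): reified term equality. If neither equality nor inequality
% is decided yet, both outcomes are offered: X = Y with T = true,
% and dif(X, Y) with T = false.
=(X, Y, T) :-
   (  X == Y -> T = true
   ;  X \= Y -> T = false
   ;  T = true,  X = Y
   ;  T = false, dif(X, Y)
   ).

% Reified conjunction: true only if both conditions are true.
','(A_1, B_1, T) :-
   if_(A_1, call(B_1, T), T = false).
```

The point of the design is that no branch is chosen on the basis of insufficient instantiation: with an unbound variable, both alternatives remain available on backtracking, which is what produces the dif/2 constraints in the answers shown in this thread.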

First, here is a ground query:

?- lists_lcp(["a","ab"], []).
false.                                % fails (as expected)

Here are the queries shown in the fine answer by @Fatalize.

?- lists_lcp(["interview",X,"intermediate"], "inte").
   X = [i,n,t,e]
;  X = [i,n,t,e,_A|_B], dif(_A,r)
;  false.

?- lists_lcp(["interview","integrate",X], Z).
   X = Z, Z = []
;  X = Z, Z = [i]
;  X = Z, Z = [i,n]
;  X = Z, Z = [i,n,t]
;  X = Z, Z = [i,n,t,e]
;  X = [i,n,t,e,_A|_B], Z = [i,n,t,e]
;  X = [i,n,t,_A|_B]  , Z = [i,n,t]  , dif(_A,e)
;  X = [i,n,_A|_B]    , Z = [i,n]    , dif(_A,t)
;  X = [i,_A|_B]      , Z = [i]      , dif(_A,n)
;  X = [_A|_B]        , Z = []       , dif(_A,i).

?- lists_lcp([X,Y], "abc").
   X = [a,b,c]      , Y = [a,b,c|_A]
;  X = [a,b,c,_A|_B], Y = [a,b,c]
;  X = [a,b,c,_A|_B], Y = [a,b,c,_C|_D], dif(_A,_C)
;  false.

?- lists_lcp(L, "abc").
   L = [[a,b,c]]
;  L = [[a,b,c],[a,b,c|_A]]
;  L = [[a,b,c,_A|_B],[a,b,c]]
;  L = [[a,b,c,_A|_B],[a,b,c,_C|_D]], dif(_A,_C)
;  L = [[a,b,c],[a,b,c|_A],[a,b,c|_B]]
;  L = [[a,b,c,_A|_B],[a,b,c],[a,b,c|_C]]
;  L = [[a,b,c,_A|_B],[a,b,c,_C|_D],[a,b,c]]
;  L = [[a,b,c,_A|_B],[a,b,c,_C|_D],[a,b,c,_E|_F]], dif(_A,_E) 
…

Finally, here is a query that shows the improved determinism:

?- lists_lcp(["interview","integrate","intermediate"], Z).
Z = [i,n,t,e].                              % succeeds deterministically


rep*_*eat 7

The previous answer presented an implementation based on if_/3.

:- use_module(library(reif)).

Here is a different take on it:

lists_lcp([], []).
lists_lcp([Es|Ess], Xs) :-
   foldl(list_list_lcp, Ess, Es, Xs).                % foldl/4

list_list_lcp([], _, []).
list_list_lcp([X|Xs], Ys0, Zs0) :-
   if_(list_first_rest_t(Ys0, Y, Ys)                 % if_/3
      , ( Zs0 = [X|Zs], list_list_lcp(Xs, Ys, Zs) )
      ,   Zs0 = []
      ).

list_first_rest_t([], _, _, false).
list_first_rest_t([X|Xs], Y, Xs, T) :-
   =(X, Y, T).                                       % =/3

Almost all queries from my previous answer give the same answers with this code, so I do not show them here.

lists_lcp([X,Y], "abc"), however, no longer terminates universally with the new code.

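  • @炽烈. Download and install `library(reif)`. I added a link to the answer. (2 upvotes)
  • @炽烈. Be specific about the error you get! "That error" does not help me locate the problem. (2 upvotes)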