Join strings if they have an overlapping region

use*_*074 5 python while-loop python-2.7

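I'm trying to write a script that finds strings sharing a 5-letter overlapping region at the beginning or end of each string (as in the example below).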

facgakfjeakfjekfzpgghi
                 pgghiaewkfjaekfjkjakjfkj
                                    kjfkjaejfaefkajewf

I'm trying to create a new string that joins all three, so the output would be:

facgakfjeakfjekfzpgghiaewkfjaekfjkjakjfkjaejfaefkajewf

Edit:

Here is the input:

x = ('facgakfjeakfjekfzpgghi', 'kjfkjaejfaefkajewf', 'pgghiaewkfjaekfjkjakjfkj')

**The list is not ordered

What I have written so far *but it is not correct:

def findOverlap(seq)
    i = 0
    while i < len(seq): 
        for x[i]:
        #check if x[0:5] == [:5] elsewhere

x = ('facgakfjeakfjekfzpgghi', 'kjfkjaejfaefkajewf', 'pgghiaewkfjaekfjkjakjfkj')
findOverlap(x)

Sve*_*ach 8

Create a dictionary that maps the first 5 characters of each string to its tail (everything after those 5 characters):

strings = {s[:5]: s[5:] for s in x}
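To make the mapping concrete, here is a small sketch that builds the dictionary for the input from the question and prints it. It assumes that no two strings start with the same 5 characters; if they did, the later string would silently overwrite the earlier one in the dictionary.

```python
# Sample input from the question (an unordered tuple of strings).
x = ('facgakfjeakfjekfzpgghi', 'kjfkjaejfaefkajewf', 'pgghiaewkfjaekfjkjakjfkj')

# Key: first 5 characters. Value: the rest of the string.
strings = {s[:5]: s[5:] for s in x}

for prefix in sorted(strings):
    print(prefix + " -> " + strings[prefix])
# facga -> kfjeakfjekfzpgghi
# kjfkj -> aejfaefkajewf
# pgghi -> aewkfjaekfjkjakjfkj
```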

and a set of all suffixes (the last 5 characters of every string):

suffixes = set(s[-5:] for s in x)
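As a quick check on the sample input, the sketch below compares the set of suffixes with the set of prefixes. The names `prefixes` and `linked` are only illustrative. A prefix that also appears among the suffixes belongs to a string that has a predecessor in the chain.

```python
x = ('facgakfjeakfjekfzpgghi', 'kjfkjaejfaefkajewf', 'pgghiaewkfjaekfjkjakjfkj')

suffixes = set(s[-5:] for s in x)   # last 5 characters of every string
prefixes = set(s[:5] for s in x)    # first 5 characters of every string

# Prefixes that match some suffix: these strings follow another string.
linked = prefixes & suffixes
print(sorted(linked))
# ['kjfkj', 'pgghi']
```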

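Now find the string whose prefix doesn't match any suffix; nothing can come before it, so it is the start of the chain: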

prefix = next(p for p in strings if p not in suffixes)
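One thing to be aware of: `next()` raises `StopIteration` when the generator is empty, which would happen if the strings formed a closed loop with no clear starting string. A possible way to handle that, sketched here with a hypothetical circular input named `ring`, is to pass a default value as the second argument to `next()`.

```python
x = ('facgakfjeakfjekfzpgghi', 'kjfkjaejfaefkajewf', 'pgghiaewkfjaekfjkjakjfkj')
strings = {s[:5]: s[5:] for s in x}
suffixes = set(s[-5:] for s in x)

# The second argument is returned when no prefix qualifies.
start = next((p for p in strings if p not in suffixes), None)
print(start)
# facga

# Hypothetical circular input: each prefix is the other string's suffix.
ring = ('aaaaaXbbbbb', 'bbbbbYaaaaa')
ring_suffixes = set(s[-5:] for s in ring)
ring_start = next((s[:5] for s in ring if s[:5] not in ring_suffixes), None)
print(ring_start)
# None
```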

Now we can follow the chain of strings:

result = [prefix]
while prefix in strings:
    result.append(strings[prefix])
    prefix = strings[prefix][-5:]
print "".join(result)
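The steps above could be wrapped into a single function along the following lines. This is only a sketch: the name `join_overlapping` and the parameter `k` are made up for illustration, and it assumes that every string has a unique k-character prefix and that all strings form exactly one chain. It raises `ValueError` when those assumptions do not hold.

```python
def join_overlapping(seqs, k=5):
    """Join strings that chain together through k-character overlaps."""
    tails = {}
    for s in seqs:
        if len(s) < k:
            raise ValueError("string shorter than overlap: %r" % s)
        if s[:k] in tails:
            raise ValueError("duplicate prefix: %r" % s[:k])
        tails[s[:k]] = s[k:]

    suffixes = set(s[-k:] for s in seqs)

    # The chain starts at the one prefix that is nobody's suffix.
    starts = [p for p in tails if p not in suffixes]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting string, found %d"
                         % len(starts))

    prefix = starts[0]
    result = [prefix]
    while prefix in tails:
        # pop() removes the entry, so a repeated prefix cannot loop forever.
        tail = tails.pop(prefix)
        result.append(tail)
        # Take the suffix from the whole string, so tails shorter than k work.
        prefix = (prefix + tail)[-k:]

    if tails:
        raise ValueError("some strings are not part of the chain")
    return "".join(result)


x = ('facgakfjeakfjekfzpgghi', 'kjfkjaejfaefkajewf', 'pgghiaewkfjaekfjkjakjfkj')
print(join_overlapping(x))
# facgakfjeakfjekfzpgghiaewkfjaekfjkjakjfkjaejfaefkajewf
```

Computing the next prefix from `prefix + tail` rather than from the tail alone is a small deviation from the snippet above. It only matters for strings shorter than `2 * k` characters, where the tail by itself has fewer than `k` characters.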