Detect whether 2 strings are the same but in a different order

nfl*_*l-x 5 python string-matching

My goal is to detect whether 2 strings contain the same words but in a different order.

Example
"hello world my name is foobar" is the same as "my name is foobar world hello"

I have already tried splitting both strings into lists and comparing them in a loop.

text = "hello world my name is foobar"
textSplit = text.split()

pattern = "foobar is my name world hello"
pattern = pattern.split()

count = 0
for substring in pattern:
    if substring in textSplit:
        count += 1

if (count == len(pattern)):
    print ("same string detected")

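It returns what I intended, but is this actually a correct and efficient way to do it? Maybe there is another approach. Any suggestions for journals on this topic would be great.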

Edit 1: duplicate words matter

text = "fish the fish the fish fish fish"
pattern = "the fish" 

It must return false

Eri*_*nil 4

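If you want to check whether 2 sentences have the same words (with the same number of occurrences), you can split the sentences into words and sort them: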

>>> sorted("hello world my name is foobar".split())
['foobar', 'hello', 'is', 'my', 'name', 'world']
>>> sorted("my name is foobar world hello".split())
['foobar', 'hello', 'is', 'my', 'name', 'world']

You can define the check in a function:

def have_same_words(sentence1, sentence2):
    return sorted(sentence1.split()) == sorted(sentence2.split())

print(have_same_words("hello world my name is foobar", "my name is foobar world hello"))
# True

print(have_same_words("hello", "hello hello"))
# False

print(have_same_words("hello", "holle"))
# False
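To connect this with the edit in the question, here is a minimal sketch that runs both approaches on the "fish" example. `membership_check` is a hypothetical name for the loop from the question wrapped in a function; `have_same_words` is the sorted comparison defined above. The membership loop only asks whether each word of the pattern occurs somewhere in the text, so it ignores how often a word occurs:

```python
def membership_check(text, pattern):
    # The loop from the question, wrapped in a function (name is illustrative):
    # it counts the pattern words that occur anywhere in text.
    text_words = text.split()
    pattern_words = pattern.split()
    count = 0
    for word in pattern_words:
        if word in text_words:
            count += 1
    return count == len(pattern_words)


def have_same_words(sentence1, sentence2):
    # Same definition as above: compare the sorted word lists.
    return sorted(sentence1.split()) == sorted(sentence2.split())


text = "fish the fish the fish fish fish"
pattern = "the fish"

print(membership_check(text, pattern))
# True (both "the" and "fish" occur somewhere in text)

print(have_same_words(text, pattern))
# False (the sorted word lists differ in length and counts)
```

Comparing sorted lists takes duplicates into account because every occurrence of a word stays in the list.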

If case does not matter, you can compare the lowercased sentences:

def have_same_words(sentence1, sentence2):
    return sorted(sentence1.lower().split()) == sorted(sentence2.lower().split())

print(have_same_words("Hello world", "World hello"))
# True

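Note: you could also use collections.Counter instead of sorted. The complexity would be O(n) instead of O(n log n), which would not make much of a difference anyway. import collections might take longer than sorting the strings: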

from collections import Counter

def have_same_words(sentence1, sentence2):
    return Counter(sentence1.lower().split()) == Counter(sentence2.lower().split())

print(have_same_words("Hello world", "World hello"))
# True

print(have_same_words("hello world my name is foobar", "my name is foobar world hello"))
# True

print(have_same_words("hello", "hello hello"))
# False

print(have_same_words("hello", "holle"))
# False
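If you want to check the remark about performance on your own machine, a rough sketch with timeit could look like the following. The names `same_words_sorted` and `same_words_counter` are illustrative and not part of the answer above, and the numbers depend on the machine, the Python version and the sentence length, so no timings are shown:

```python
import timeit
from collections import Counter


def same_words_sorted(sentence1, sentence2):
    # O(n log n): sort both word lists and compare them.
    return sorted(sentence1.split()) == sorted(sentence2.split())


def same_words_counter(sentence1, sentence2):
    # O(n): count the occurrences of each word and compare the counts.
    return Counter(sentence1.split()) == Counter(sentence2.split())


a = "hello world my name is foobar"
b = "my name is foobar world hello"

# Both approaches should agree on the result.
print(same_words_sorted(a, b), same_words_counter(a, b))

# Time 10,000 calls of each function; results vary between runs.
for func in (same_words_sorted, same_words_counter):
    seconds = timeit.timeit(lambda: func(a, b), number=10_000)
    print(f"{func.__name__}: {seconds:.4f} s")
```

For sentences this short, the difference is unlikely to matter in practice, so readability is a reasonable basis for choosing between the two.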