Python: Comparing a few thousand strings. Any faster alternatives?

Ric*_*son 2 python string comparison performance string-comparison

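I have a set of about 6,000 packets which, for comparison purposes, I represent as strings (the first 28 bytes). I want to compare them against a second set of roughly as many packets, which I also represent as 28-byte strings.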

I have to match each packet of one set against all the packets of the other. A match is always unique.

I have found that comparing the strings takes some time. Is there any way to speed up the process?

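EDIT1: I don't want to permute the string elements, because I always make sure that the ordering between the list of packets and the corresponding list of strings is preserved.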

EDIT2: Here is my implementation:

# list1, list2: lists of packets (no duplicates present in each list!)
# listOfStrings1, listOfStrings2: corresponding lists of strings. Ordering is preserved.
tmpMatched = []    # [list1Index, list2Index] pairs
tmpUnmatched = []  # list1 indices with no match in list2
alreadyMatchedlist2Indices = []
for list1Index in range(len(listOfStrings1)):
    stringToMatch = listOfStrings1[list1Index]
    # Scan all of list2 for equal strings that have not been matched yet.
    matchinglist2Indices = [i for i, list2Str in enumerate(listOfStrings2)
                            if list2Str == stringToMatch and i not in alreadyMatchedlist2Indices]
    if not matchinglist2Indices:
        tmpUnmatched.append(list1Index)
    elif len(matchinglist2Indices) == 1:
        tmpMatched.append([list1Index, matchinglist2Indices[0]])
        alreadyMatchedlist2Indices.append(matchinglist2Indices[0])
    else:
        list2Index = matchinglist2Indices[0]  # taking first matching element anyway
        tmpMatched.append([list1Index, list2Index])
        alreadyMatchedlist2Indices.append(list2Index)

Lew*_*ond 5

---Here I am assuming that you take every string, one by one, and compare it to all the others.---

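I suggest sorting your list of strings and comparing neighboring strings. This should have a runtime of O(n log n), compared with the O(n^2) of comparing every string against every other one.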
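A minimal sketch of that idea, adapted to the two lists in the question. This is one possible reading, not a definitive implementation: the function name `match_by_sorting` and its parameters are made up for illustration, and "comparing neighboring strings" is interpreted here as a merge-style walk over the two sorted lists. Each string is sorted together with its original index, so the input lists themselves are never reordered.

```python
def match_by_sorting(strings1, strings2):
    """Pair equal strings across two lists.

    Returns (matched, unmatched), where matched holds
    [index in strings1, index in strings2] pairs and unmatched holds
    the indices of strings1 that found no partner.
    """
    # Sort (string, original index) pairs; the inputs stay untouched.
    sorted1 = sorted((s, i) for i, s in enumerate(strings1))
    sorted2 = sorted((s, i) for i, s in enumerate(strings2))

    matched = []
    unmatched = []
    a = b = 0
    # Walk both sorted lists in step, like the merge phase of merge sort.
    while a < len(sorted1) and b < len(sorted2):
        s1, i1 = sorted1[a]
        s2, i2 = sorted2[b]
        if s1 == s2:
            matched.append([i1, i2])
            a += 1
            b += 1
        elif s1 < s2:
            # Nothing in strings2 can equal s1 any more.
            unmatched.append(i1)
            a += 1
        else:
            b += 1
    # Anything left over in sorted1 has no partner.
    unmatched.extend(i for _, i in sorted1[a:])

    # Report results in the order of the first list.
    matched.sort()
    unmatched.sort()
    return matched, unmatched
```

Called as `match_by_sorting(listOfStrings1, listOfStrings2)`, this yields index pairs that can be used to look up the packets in `list1` and `list2`. If equal strings occur more than once, each one is paired with the lowest unused index in the other list, which mirrors the "take the first matching element" rule in the question.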