Time complexity of text analyzer code in Python

Question


Imr*_*fin 2 python algorithm time-complexity

letterList = ["a", 0, "b", 0, "c", 0, "d", 0, "e", 0, "f", 0, "g", 0, "h", 0, "i", 0,  "j", 0, "k", 0, "l", 0, "m", 0, "n", 0, "o", 0, "p", 0, "q", 0, "r", 0, "s", 0, "t", 0, "u", 0, "v", 0, "w", 0, "x", 0, "y", 0, "z", 0]
letterCount = 0
wordList = [None]
wordCount = 0
Count = 0
wordIndex = [0]
itext = cleaner(raw_input("enter itext please")).split()
print itext
for iword in itext:
    if iword in wordList:
        Count += 1
        for word in wordList:
            if iword == word:
                wordList[wordList.index(word)+1][0] += 1
                wordList[wordList.index(word)+1] += [wordCount]
            else:
                pass
    elif iword not in wordList:
        wordList += [iword]
        wordList += [[1, itext.index(iword)]]
    else:
        pass
    wordCount += 1
print wordList


The code looks messy because I am a rank beginner in both Python and programming.

Can anyone help me work out the time complexity of this code?

Answer 1

Pau*_*kin 6

Apart from differences in output formatting, everything after print itext can be replaced with:

import collections
print collections.Counter(itext)


This has complexity O(n), where n is the number of words in itext.
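As a rough illustration of the Counter approach, here is a minimal self-contained sketch in Python 3 syntax (the question itself uses Python 2). The function name count_words and the sample sentence are made up for this example, and the question's cleaner function is not shown in the post, so plain whitespace splitting stands in for it here.

```python
from collections import Counter


def count_words(text):
    """Return a Counter mapping each whitespace-separated word to its frequency.

    Hypothetical helper for illustration; the question's `cleaner` step is
    not reproduced because its definition is not given.
    """
    words = text.split()
    # Counter makes a single pass over the words; each update is a
    # dict operation that takes O(1) time on average.
    return Counter(words)


counts = count_words("the cat and the hat and the bat")
print(counts.most_common(2))  # [('the', 3), ('and', 2)]
```

Because each word is touched once, the total work grows linearly with the number of words.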

Without Counter, you can express the algorithm better by using a dict rather than a list to store the word counts:

word_counter = {}
for word in itext:
    if word in word_counter:
        word_counter[word] += 1
    else:
        word_counter[word] = 1


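A dict is a good fit for storing an association between one thing (here, a word) and another (here, a count). A list of alternating word and count entries has quite a few drawbacks compared with a dict, but the killer is that finding a word in a list is O(N) rather than O(1) (average case for a dict). Since the original loop searches wordList once for every input word, it is O(n^2) in the worst case, when every word is distinct.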
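The question's code also records the positions at which each word occurs, not just a count. One possible way to keep that information while staying linear is a dict of lists. The sketch below uses Python 3 syntax; the name index_words and the sample input are assumptions for illustration, not part of the original post.

```python
from collections import defaultdict


def index_words(words):
    """Map each word to the list of positions where it occurs.

    Hypothetical helper: the count of a word is simply the length
    of its position list.
    """
    positions = defaultdict(list)
    for i, word in enumerate(words):
        # Dict lookup plus list append: O(1) on average per word,
        # so the whole loop is O(n) instead of O(n^2).
        positions[word].append(i)
    return positions


words = "to be or not to be".split()
positions = index_words(words)
for word, where in positions.items():
    print(word, len(where), where)
```

Using enumerate also avoids the itext.index(iword) call in the question, which is itself a linear scan of the input.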
