Unexpected output when reading a CSV file inside a for loop

1 python csv sentiment-analysis

import csv
if __name__ == "__main__":
    words = ["great" , "thanks"]
    with open("data/sentiwordnet.tsv", "r") as f:
        reader = csv.DictReader(f,delimiter='\t')
        for word in xrange(len(words)):
             for row in reader:
                 if row['word_en'] == words[word]:
                    print float(row["positive"])
                    print float(row["negative"])
                    print row["synset"]

Output:

0.75
0.0 
124567

The output above covers only the first word, "great". The loop ends there; it never moves on to the next word.

Ble*_*der 5

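Once you have iterated over the rows of reader, it does not magically restart. On the second pass of the outer loop the reader is already exhausted, so the inner loop body never runs. Swap the for loops so that you iterate over reader only once: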

for row in reader:
    for word in xrange(len(words)):
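To make the exhaustion visible, here is a minimal sketch. It is written for Python 3 (the code in this thread is Python 2), and it reads from an in-memory string instead of data/sentiwordnet.tsv. The sample rows are hypothetical: only the "great" values come from the output in the question, the "thanks" values are made up for illustration.

```python
import csv
import io

# Hypothetical stand-in for data/sentiwordnet.tsv; the "thanks" row is invented.
SAMPLE = (
    "word_en\tpositive\tnegative\tsynset\n"
    "great\t0.75\t0.0\t124567\n"
    "thanks\t0.5\t0.0\t765432\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE), delimiter="\t")

# The first pass consumes every row of the underlying file object.
first_pass = [row["word_en"] for row in reader]
# The second pass finds nothing left to read.
second_pass = [row["word_en"] for row in reader]

print(first_pass)   # ['great', 'thanks']
print(second_pass)  # []

# If the original loop order really is needed, one option is to read the
# rows into a list once; a list can be iterated as many times as you like.
rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter="\t"))
words = ["great", "thanks"]
matches = [row["synset"] for word in words for row in rows if row["word_en"] == word]
print(matches)      # ['124567', '765432']
```

Loading everything into a list costs memory proportional to the file size, so for a large file a single pass over reader is usually the better choice.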

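I would avoid the nested loop altogether by checking whether each row's word is in the set of words you are interested in. That will also run faster: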

import csv

if __name__ == "__main__":
    words = {"great" , "thanks"}  # sets are faster than lists for checking containment

    with open("data/sentiwordnet.tsv", "r") as f:
        reader = csv.DictReader(f, delimiter='\t')

        for row in reader:
            if row['word_en'] in words:
                print float(row["positive"])
                print float(row["negative"])
                print row["synset"]

You may also want to consider using a package like pandas to work with tables; it generally makes life easier.
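As a rough idea of what the pandas route could look like, here is a sketch in Python 3. It assumes pandas is installed and that the file has the column names used in the question; the in-memory sample data is hypothetical (only the "great" row matches the output shown in the question).

```python
import io

import pandas as pd

# Hypothetical stand-in for data/sentiwordnet.tsv; only "great" is from the question.
SAMPLE = (
    "word_en\tpositive\tnegative\tsynset\n"
    "great\t0.75\t0.0\t124567\n"
    "awful\t0.0\t0.875\t222222\n"
    "thanks\t0.5\t0.0\t765432\n"
)

words = {"great", "thanks"}

# sep="\t" tells read_csv that the columns are tab-separated.
df = pd.read_csv(io.StringIO(SAMPLE), sep="\t")

# isin() builds a boolean mask that keeps only rows whose word is in the set.
hits = df[df["word_en"].isin(words)]

print(hits[["word_en", "positive", "negative", "synset"]])
```

With the real file, the io.StringIO(SAMPLE) argument would be replaced by the path "data/sentiwordnet.tsv".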