Extract a specific word and the value that follows it from a text file

San*_*ana 1 python extract cpu-word python-2.7

I have this input file:

1 sentences, 6 words, 1 OOVs
1 zeroprobs, logprob= -21.0085 ppl= 15911.4 ppl1= 178704
6 words, rank1= 0 rank5= 0 rank10= 0
7 words+sents, rank1wSent= 0 rank5wSent= 0 rank10wSent= 0 qloss= 0.925606 absloss= 0.856944

file input.txt : 1 sentences, 6 words, 1 OOVs
1 zeroprobs, logprob= -21.0085 ppl= 15911.4 ppl1= 178704
6 words, rank1= 0 rank5= 0 rank10= 0
7 words+sents, rank1wSent= 0 rank5wSent= 0 rank10wSent= 0 qloss= 0.925606 absloss= 0.856944

I want to extract the word ppl and the value that follows it, in this case: ppl = 15911.4

I am using this code:

with open("input.txt") as openfile:
    for line in openfile:
       for part in line.split():
          if "ppl=" in part:
              print part

However, this extracts only the word ppl and not the value. I would also like to print the file name.

Expected output:

input.txt, ppl=15911.4

How can I solve this?

Avi*_*Raj 6

You can use the enumerate function:

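with open("input.txt") as openfile:
    for line in openfile:
        s = line.split()
        for i, j in enumerate(s):
            # exact match on the token "ppl=" (so "ppl1=" is skipped);
            # the value is the next whitespace-separated token
            if j == "ppl=":
                print s[i], s[i+1]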

Example:

>>> fil = '''1 zeroprobs, logprob= -21.0085 ppl= 15911.4 ppl1= 178704
6 words, rank1= 0 rank5= 0 rank10= 0'''.splitlines()
>>> for line in fil:
        s = line.split()
        for i,j in enumerate(s):
            if j == "ppl=":
                print s[i],s[i+1]


ppl= 15911.4
>>> 
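The expected output in the question also includes the file name, which the snippet above does not print. Here is a minimal sketch of the same enumerate idea that builds that line. The names `values_after`, `report` and `sample` are made up for illustration, and the sample list stands in for the file contents. It is written so that it should run on both Python 2.7 and Python 3.

```python
def values_after(lines, key="ppl="):
    """Yield the token that follows each standalone `key` token."""
    for line in lines:
        tokens = line.split()
        for i, token in enumerate(tokens):
            # Exact comparison, so "ppl1=" is not mistaken for "ppl=".
            # The length check skips a key that ends the line.
            if token == key and i + 1 < len(tokens):
                yield tokens[i + 1]


def report(filename, lines):
    """Build 'filename, ppl=value' strings, the format asked for in the question."""
    return ["%s, ppl=%s" % (filename, value) for value in values_after(lines)]


# In-memory stand-in for the contents of input.txt (hypothetical sample).
sample = [
    "1 sentences, 6 words, 1 OOVs",
    "1 zeroprobs, logprob= -21.0085 ppl= 15911.4 ppl1= 178704",
    "6 words, rank1= 0 rank5= 0 rank10= 0",
]
for entry in report("input.txt", sample):
    print(entry)  # input.txt, ppl=15911.4
```

With a real file, the open file object can be passed in place of the list, since iterating over it also yields lines: open "input.txt" in a with block and call report("input.txt", openfile).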

To print only the first value:

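>>> found = False
>>> for line in fil:
        s = line.split()
        for i,j in enumerate(s):
            if j == "ppl=":
                print s[i],s[i+1]
                found = True
                break
        if found:   # stop after the first match, not after the first line
            break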

ppl= 15911.4
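If only the first value is needed, a regular expression inside a function is another option, because returning from the function ends the scan without a flag or nested break. This is an alternative sketch, not the approach used in the answer above. The names `PPL_PATTERN` and `first_ppl` are hypothetical.

```python
import re

# "ppl=" as a whole token (so "ppl1=" and "logprob=" do not match),
# optional whitespace, then the value up to the next whitespace.
PPL_PATTERN = re.compile(r"(?<!\S)ppl=\s*(\S+)")


def first_ppl(lines):
    """Return the first ppl value found in `lines`, or None if there is none."""
    for line in lines:
        match = PPL_PATTERN.search(line)
        if match:
            return match.group(1)  # returning here stops reading further lines
    return None
```

Because the pattern allows zero or more spaces after the equals sign, it should accept both "ppl= 15911.4" as in the input and "ppl=15911.4" as in the expected output.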