How can I read only the lines of a text file that come after a particular string, using Python?

Bri*_*lip 11 python string file

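Using Python, I'd like to read into a dictionary all of the lines in a text file that come after a particular string. I'd like to do this over thousands of text files.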

I'm able to identify and print out the particular string ('Abstract') using the following code (taken from this Stack Overflow answer):

for files in filepath:
    with open(files, 'r') as f:
        for line in f:
            if 'Abstract' in line:
                print(line)

But how do I tell Python to start reading only the lines that come after the string?

Pad*_*ham 19

When you reach the line you want to start from, just start another loop:

for files in filepath:
    with open(files, 'r') as f:
        for line in f:
            if 'Abstract' in line:                
                for line in f:  # now you are at the lines you want
                    print(line)  # do work

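A file object is its own iterator, so once we reach the line containing Abstract, the inner loop continues from that point in the file and runs until the iterator is exhausted.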

A simple example:

gen = (n for n in range(8))

for x in gen:
    if x == 3:
        print("starting second loop")
        for x in gen:
            print("In second loop",x)
    else:
        print("In first loop", x)

Output:

In first loop 0
In first loop 1
In first loop 2
starting second loop
In second loop 4
In second loop 5
In second loop 6
In second loop 7
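The same behaviour can be checked on a file-like object. This is only a minimal sketch: an in-memory `io.StringIO` stands in for a real file, and the helper name `lines_after` and the sample text are made up for illustration.

```python
import io

def lines_after(handle, marker):
    """Return the lines of `handle` that follow the first line containing `marker`."""
    for line in handle:
        if marker in line:
            # The comprehension resumes from the same iterator, so it only
            # sees what comes after the matching line.
            return [rest.rstrip("\n") for rest in handle]
    return []  # marker never found

sample = io.StringIO("Title\nAbstract\nfirst\nsecond\n")
print(iter(sample) is sample)   # a file object is its own iterator
print(lines_after(sample, "Abstract"))
```

Because the outer and inner loops share one iterator, nothing is read twice and the file is never rewound.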

You can also use itertools.dropwhile to consume lines up to the point you want.

from itertools import dropwhile

for files in filepath:
    with open(files, 'r') as f:
        # skip every line until one contains "Abstract"
        dropped = dropwhile(lambda _line: "Abstract" not in _line, f)
        next(dropped, "")  # discard the "Abstract" line itself
        for line in dropped:
            print(line)
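Since the question asks for the lines to end up in a dictionary, the dropwhile approach might be wrapped roughly as follows. This is a sketch under assumptions: the function name `collect_after` and the choice of keying the dictionary by file path are illustrative, not something the question specifies.

```python
from itertools import dropwhile

def collect_after(paths, marker="Abstract"):
    """Map each path to the list of lines that follow the first `marker` line."""
    result = {}
    for path in paths:
        with open(path, "r") as f:
            dropped = dropwhile(lambda line: marker not in line, f)
            next(dropped, None)  # discard the marker line itself, if any
            # Files without the marker end up with an empty list.
            result[path] = [line.rstrip("\n") for line in dropped]
    return result
```

Each file is read once, line by line, so this should scale to thousands of files as long as the collected lines fit in memory.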


Kro*_*tan 8

Use a boolean to ignore the lines up to that point:

for files in filepath:
    with open(files, 'r') as f:
        found_abstract = False  # reset the flag for every file
        for line in f:
            if found_abstract:
                print(line)  # do whatever you want with the lines after 'Abstract'
            elif 'Abstract' in line:
                found_abstract = True
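The flag can also live inside a generator, which keeps it local to each call so it cannot leak from one file to the next. A minimal sketch, where the name `after_marker` and the sample list are hypothetical:

```python
def after_marker(lines, marker="Abstract"):
    """Yield the lines that come after the first line containing `marker`."""
    found = False
    for line in lines:
        if found:
            yield line
        elif marker in line:
            found = True  # start yielding from the next line onwards

text = ["Title", "Abstract", "first", "Abstract again", "last"]
print(list(after_marker(text)))
```

Any iterable of strings works here, including an open file object.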


Jon*_*nts 8

You can use itertools.dropwhile and itertools.islice here; a pseudo-example:

from itertools import dropwhile, islice

for fname in filepaths:
    with open(fname) as fin:
        start_at = dropwhile(lambda L: 'Abstract' not in L.split(), fin)
        for line in islice(start_at, 1, None): # ignore the line still with Abstract in
            print(line)
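Note that this version tests `L.split()`, so it matches only when Abstract appears as a whitespace-delimited word, whereas a plain `in` on the line is a substring test. The difference can be seen in a small sketch; the two helper names and the sample lines are invented for illustration.

```python
def matches_substring(line, marker="Abstract"):
    """True if `marker` occurs anywhere in the line."""
    return marker in line

def matches_token(line, marker="Abstract"):
    """True only if `marker` is one of the whitespace-separated words."""
    return marker in line.split()

for sample in ("Abstract\n", "Abstract: we study files\n", "Abstractions\n"):
    print(repr(sample), matches_substring(sample), matches_token(sample))
```

Which test is appropriate depends on how the heading is written in your files, for example whether it is followed by a colon.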


egu*_*aio 5

To me, the following code is easier to understand.

with open(file_name, 'r') as f:
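    # skip lines up to and including the first one that contains 'Abstract'
    # (next(f) raises StopIteration if no line contains it)
    while 'Abstract' not in next(f):
        pass
    for line in f:
        # line is now a line after the one that contains 'Abstract'
        print(line)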
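With thousands of files, some may not contain the string at all, and a bare `next(f)` raises StopIteration in that case. One possible variant passes a default to `next` instead. This is a sketch: the helper name `skip_past` is hypothetical, and `io.StringIO` stands in for a real file.

```python
import io

def skip_past(handle, marker="Abstract"):
    """Advance `handle` past the first line containing `marker`.

    Returns True if the marker was found, False if the input ran out first.
    """
    while True:
        line = next(handle, None)  # None signals the end of the input
        if line is None:
            return False
        if marker in line:
            return True

f = io.StringIO("Title\nAbstract\nbody\n")
if skip_past(f):
    for line in f:
        print(line.rstrip("\n"))
```

A caller can then skip or log the files for which `skip_past` returns False rather than crashing partway through a batch.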