Python: fastest way to read a large text file (several GB)

Question

Gia*_*ear 27 python optimization performance line chunking

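I have a large text file (about 7 GB), and I am looking for the fastest way to read it. I have been reading about several approaches, such as reading the file chunk by chunk, to speed up the process.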

For example, effbot suggests

# File: readline-example-3.py

with open("sample.txt") as file:
    while True:
        # the argument is a size hint: complete lines totalling roughly that much are returned
        lines = file.readlines(100000)
        if not lines:
            break
        for line in lines:
            pass  # do something

to process 96,900 lines of text per second. Other authors suggest using islice():

from itertools import islice

with open(...) as f:
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            break
        # process next_n_lines

list(islice(f, n)) will return a list of the next n lines of the file f. Using this inside a loop will give you the file in chunks of n lines.
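
A self-contained sketch of that chunking loop, wrapped in a generator so it can be run as-is. The names `read_in_chunks`, `path`, and `n` are illustrative assumptions and do not come from the posts quoted above; the sketch only shows how `islice` hands back at most `n` lines per pass.

```python
from itertools import islice


def read_in_chunks(path, n):
    """Yield lists of at most n lines from the text file at path.

    Hypothetical helper for illustration; path and n are assumed parameters.
    """
    with open(path) as f:
        while True:
            # islice pulls at most n lines from the file iterator
            next_n_lines = list(islice(f, n))
            if not next_n_lines:
                break  # end of file reached
            yield next_n_lines
```

With this helper, `for chunk in read_in_chunks("sample.txt", 100000): ...` would walk the file 100,000 lines at a time. Whether that beats plain line-by-line iteration depends on the workload and is worth measuring.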

Answer 1

Mor*_*sen 13

with open(<FILE>) as FileObj:
    for line in FileObj:
        print(line)  # or do some other thing with the line...

This reads one line at a time into memory and closes the file when done...
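
As a minimal sketch of that behavior (the function name `count_lines` and its `path` parameter are hypothetical, chosen only for illustration): iterating over the file object hands the loop one line at a time, and the `with` block closes the file on exit.

```python
def count_lines(path):
    """Count lines by iterating lazily over the file object.

    Hypothetical helper; returns the line count and whether the file
    was closed once the with block exited.
    """
    count = 0
    with open(path) as file_obj:
        for _line in file_obj:  # one line at a time (reads are buffered internally)
            count += 1
    # the with statement has already closed the file at this point
    return count, file_obj.closed
```

The loop never builds a list of all lines, so memory use stays small even for a multi-gigabyte file.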

No, read too fast... (6 upvotes)
Morten, line-by-line became too slow. (3 upvotes)

Archived:	13 years ago
Views:	77,769
Last activity:	13 years ago