python - How to correctly use asyncio with pandas read_csv

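I have many CSV files in a directory. I want to read each one with pandas read_csv and then merge all the returned DataFrames with pandas.concat, but I don't think I am using asyncio correctly, because the elapsed time has not decreased.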

import asyncio
import glob2
import os
import time

import pandas as pd

path = r'C:\LRM_STGY_REPO\IB_IN'


# Attempt 1: wrap each read in a coroutine and run them all on the event loop.
async def read_csv(filename):
    df = pd.read_csv(filename, header=None)
    return df


async def read_all(filenames):
    # Schedule one read_csv coroutine per file and collect the results in order.
    return await asyncio.gather(*(read_csv(f) for f in filenames))


start = time.time()
frames = asyncio.run(read_all(glob2.iglob(os.path.join(path, "*.txt"))))
df = pd.concat(frames, ignore_index=True)
# print(df)
print('%.4f' % (time.time() - start))


# Attempt 2: plain synchronous reads, for comparison.
def read_csv2(filename):
    return pd.read_csv(filename, header=None)


start = time.time()
df = pd.concat(map(read_csv2, glob2.iglob(os.path.join(path, "*.txt"))), ignore_index=True)
# print(df)
print('%.4f' % (time.time() - start))


read_csv and read_csv2 take about the same amount of time.
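One likely reason the two timings match: a coroutine that never awaits anything runs from start to finish before the event loop can switch to another task. The sketch below illustrates this with a hypothetical `fake_read` coroutine that uses `time.sleep` as a stand-in for a synchronous `pd.read_csv` call. The names and the 0.1 s delay are assumptions for illustration only.

```python
import asyncio
import time

events = []


async def fake_read(name):
    # Hypothetical stand-in for the question's read_csv coroutine: it
    # contains no await, so it never yields control to the event loop.
    events.append("start " + name)
    time.sleep(0.1)  # blocking call, standing in for a synchronous read
    events.append("end " + name)
    return name


async def main():
    return await asyncio.gather(*(fake_read(n) for n in ["a", "b", "c"]))


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start

print(events)            # each "start" is followed by its own "end"
print('%.4f' % elapsed)  # roughly the sum of the individual delays
```

The start/end pairs never interleave, so the total time is about the same as calling the function in a plain loop.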

Or is there some other way to reduce the concatenation time?
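One option that may be worth trying is to hand each blocking read to a worker thread, so that the event loop can keep several reads in flight at once. This is a minimal sketch, not a definitive fix: `slow_read` and `read_all` are hypothetical names, `time.sleep` stands in for the real file read, and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio
import time


def slow_read(name):
    # Hypothetical stand-in for a blocking reader such as pd.read_csv.
    time.sleep(0.2)
    return [name]


async def read_all(reader, names):
    # asyncio.to_thread runs each blocking call in the default thread pool
    # and returns an awaitable, so the calls can overlap.
    return await asyncio.gather(*(asyncio.to_thread(reader, n) for n in names))


names = ["a.txt", "b.txt", "c.txt", "d.txt"]

start = time.perf_counter()
frames = asyncio.run(read_all(slow_read, names))
elapsed = time.perf_counter() - start

print(frames)
print('%.4f' % elapsed)  # well below the 0.8 s a sequential loop would take
```

With pandas, the reader could be `functools.partial(pd.read_csv, header=None)` and the resulting list could be passed to `pd.concat(frames, ignore_index=True)`. Whether threads actually help with real files depends on how much of the time is spent waiting on disk versus parsing, so it is worth measuring on the actual data.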
