Appending Pickle files in Python

Question

Sha*_*yan 5 python pickle dataframe pandas

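I have 100 DataFrames (all in exactly the same format) saved on disk as 100 pickle files. Each DataFrame is roughly 250,000 rows long. I want to combine all 100 into a single DataFrame and save that to disk as one pickle file.

This is what I am doing so far: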

import glob
import os
import pandas as pd

path = '/Users/srayan/Desktop/MyData/Pickle'
df = pd.DataFrame()
for filename in glob.glob(os.path.join(path, '*.pkl')):
    newDF = pd.read_pickle(filename)
    # DataFrame.append returns a new copy on every call (removed in pandas 2.0)
    df = df.append(newDF)
df.to_pickle("/Users/srayan/Desktop/MyData/Pickle/MergedPickle.pkl")

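I understand that pickle serializes the DataFrame, but do I really have to load each pickle file, deserialize it, append the DataFrame, and then serialize the result again? Or is there a faster way to do this? With the amount of data I have, this keeps getting slower.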

Answer 1

jez*_*ael 1

You can use a list comprehension to read each DataFrame into a list and then call concat only once:

import glob
import pandas as pd
files = glob.glob('files/*.pkl')
df = pd.concat([pd.read_pickle(fp) for fp in files], ignore_index=True)

This is equivalent to:

dfs = []
for filename in glob.glob('files/*.pkl'):
    newDF = pd.read_pickle(filename)
    dfs.append(newDF)
df = pd.concat(dfs, ignore_index=True)
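
To tie this back to the question, here is a minimal end-to-end sketch that also writes the merged pickle. The function name merge_pickles and its parameters are hypothetical, chosen only for illustration. It assumes every file holds a DataFrame with the same columns. Each file still has to be unpickled, since every pickle file is a separate serialized object, but the rows are combined in a single concat call rather than being recopied on every loop iteration. The sketch also skips the output file, because in the question the merged pickle is written into the same directory that is being globbed, so a rerun would otherwise merge the previous result back in.

```python
import glob
import os

import pandas as pd


def merge_pickles(src_dir, out_path, pattern="*.pkl"):
    """Read every pickled DataFrame in src_dir and write one merged pickle.

    Hypothetical helper for illustration; assumes all files share one schema.
    """
    out_abs = os.path.abspath(out_path)
    # Sort for a deterministic row order, and skip the output file itself
    # in case it lives in the same directory as the inputs.
    files = [
        fp
        for fp in sorted(glob.glob(os.path.join(src_dir, pattern)))
        if os.path.abspath(fp) != out_abs
    ]
    if not files:
        raise ValueError(f"no files matching {pattern!r} in {src_dir!r}")
    # Load everything first, then concatenate once.
    frames = [pd.read_pickle(fp) for fp in files]
    merged = pd.concat(frames, ignore_index=True)
    merged.to_pickle(out_path)
    return merged
```

With the paths from the question, this could be called as merge_pickles(path, os.path.join(path, 'MergedPickle.pkl')). Note that all input frames are held in memory at the same time as the merged result, so peak memory use is roughly twice the size of the combined data.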

Archived:	8 years, 3 months ago
Views:	10852
Last recorded:	4 years, 4 months ago