Python Pandas persistent cache

Question

Luc*_* C. 10 persistence caching financial pandas

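Is there a Python pandas implementation that caches data on disk, so that I can avoid re-creating it every time?

In particular, is there a caching method for get_yahoo_data for financial data?

A big plus would be:

very few lines of code
the possibility of integrating the persisted series when new data is downloaded for the same source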

Answer 1

nij*_*ijm 14

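There are many ways to achieve this, but the easiest is probably to use the built-in methods for writing and reading Python pickles. You can use pandas.DataFrame.to_pickle to store a DataFrame on disk and pandas.read_pickle to read the stored DataFrame back from disk.

An example for a pandas.DataFrame: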

import pandas

# Store your DataFrame (df is an existing pandas.DataFrame)
df.to_pickle('cached_dataframe.pkl') # will be stored in current directory

# Read your DataFrame
df = pandas.read_pickle('cached_dataframe.pkl') # read from current directory

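
To avoid re-creating the data on every run, the two calls can be wrapped in a small load-or-compute helper. The following is only a minimal sketch: `load_or_compute` and `expensive_download` are hypothetical names introduced for illustration (they are not part of pandas), and the cache is assumed to live in a temporary directory.

```python
import os
import tempfile

import pandas as pd


def load_or_compute(path, compute):
    """Return the DataFrame pickled at `path`; compute and cache it if missing."""
    if os.path.exists(path):
        return pd.read_pickle(path)
    df = compute()
    df.to_pickle(path)
    return df


def expensive_download():
    # Hypothetical stand-in for a slow download or computation.
    return pd.DataFrame({"close": [1.0, 2.0, 3.0]})


cache_path = os.path.join(tempfile.mkdtemp(), "cached_dataframe.pkl")
first = load_or_compute(cache_path, expensive_download)   # computes and writes the pickle
second = load_or_compute(cache_path, expensive_download)  # reads the pickle from disk
```

The cache never expires in this sketch; deleting the pickle file forces a fresh computation. Pickle files should only be loaded from sources you trust.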

The same to_pickle / read_pickle approach also works for a pandas.Series:

# Store your Series
series.to_pickle('cached_series.pkl') # will be stored in current directory

# Read your Series
series = pandas.read_pickle('cached_series.pkl') # read from current directory

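
The question also asks about integrating a persisted series with newly downloaded data for the same source. One possible way to do that with pickles is sketched below, assuming both series share the same kind of index (dates here) and that newer values should win on overlapping labels. `update_cached_series` is a hypothetical helper, not a pandas function.

```python
import os
import tempfile

import pandas as pd


def update_cached_series(path, new_data):
    """Merge `new_data` into the Series pickled at `path` and save the result."""
    if os.path.exists(path):
        cached = pd.read_pickle(path)
        combined = pd.concat([cached, new_data])
        # Where both series have the same index label, keep the newer value.
        combined = combined[~combined.index.duplicated(keep="last")]
    else:
        combined = new_data
    combined = combined.sort_index()
    combined.to_pickle(path)
    return combined


path = os.path.join(tempfile.mkdtemp(), "cached_series.pkl")
old = pd.Series([1.0, 2.0], index=pd.to_datetime(["2020-01-01", "2020-01-02"]))
new = pd.Series([2.5, 3.0], index=pd.to_datetime(["2020-01-02", "2020-01-03"]))

update_cached_series(path, old)           # first call creates the cache
merged = update_cached_series(path, new)  # second call merges and re-saves
```

Keeping the last duplicate means a re-downloaded value replaces the cached one; use `keep="first"` instead if the cached value should be preserved.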

Answer 2

Eir*_*Lid 9

You can use the data cache package.

from data_cache import pandas_cache

@pandas_cache
def foo():
    ...

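
For readers who prefer not to add a dependency, the decorator idea can be approximated by hand with the pickle functions from the first answer. This is a rough sketch and not how the data cache package is implemented: `pickle_cache` is a hypothetical decorator, and it ignores function arguments, so it only suits functions whose result does not depend on them.

```python
import functools
import os
import tempfile

import pandas as pd


def pickle_cache(path):
    """Hypothetical decorator: cache a function's DataFrame result at `path`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The arguments are not part of the cache key in this sketch.
            if os.path.exists(path):
                return pd.read_pickle(path)
            result = func(*args, **kwargs)
            result.to_pickle(path)
            return result
        return wrapper
    return decorator


calls = []
cache_file = os.path.join(tempfile.mkdtemp(), "foo.pkl")


@pickle_cache(cache_file)
def foo():
    calls.append(1)  # record how often the function body actually runs
    return pd.DataFrame({"x": [1, 2, 3]})


a = foo()  # runs the body and writes the pickle
b = foo()  # served from the pickle; the body is not run again
```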

Archived:	7 years, 8 months ago
Viewed:	11131 times
Last active:	4 years, 9 months ago