Sometimes I get the following array from my HDF5 file:

val1 = {ndarray} [<HDF5 object reference> <HDF5 object reference> <HDF5 object reference>]

If I try to dereference it using the HDF5 file object

f[val1[0]]

I get an error:

Argument 'ref' has incorrect type (expected h5py.h5r.Reference, got numpy.object_)
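For context, here is a minimal sketch of how an object-dtype array of references is usually dereferenced with h5py. The file name `refs_demo.h5`, the dataset names and the helper `deref_all` are made up for illustration. The sketch does not reproduce the error above; it only shows the pattern that works when every element really is an `h5py.Reference`, and it raises a readable error otherwise.

```python
import os
import tempfile

import h5py
import numpy as np


def deref_all(h5file, refs):
    """Resolve each object reference in `refs` against the open file `h5file`."""
    resolved = []
    # ravel() flattens multi-dimensional shapes such as (n, 1) so every element is visited once.
    for ref in np.asarray(refs, dtype=object).ravel():
        if not isinstance(ref, h5py.Reference):
            # Anything else (for example a nested array) cannot be used as h5file[ref].
            raise TypeError(f"expected h5py.Reference, got {type(ref).__name__}")
        resolved.append(h5file[ref])
    return resolved


# Hypothetical demo file: two datasets plus one dataset holding references to them.
path = os.path.join(tempfile.mkdtemp(), "refs_demo.h5")
with h5py.File(path, "w") as f:
    a = f.create_dataset("a", data=np.arange(3))
    b = f.create_dataset("b", data=np.arange(5))
    ref_ds = f.create_dataset("refs", shape=(2,), dtype=h5py.ref_dtype)
    ref_ds[0] = a.ref
    ref_ds[1] = b.ref

with h5py.File(path, "r") as f:
    val1 = f["refs"][()]  # object-dtype ndarray of h5py.Reference
    targets = deref_all(f, val1)
    names = [obj.name for obj in targets]
    shapes = [obj.shape for obj in targets]
```

If the helper raises its `TypeError` on real data, printing `type(val1[0])` and `val1.shape` should show what the elements actually are.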

Score: 6 | Answers: 1 | Views: 2233

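I have a bunch of HDF5 files, and I want to convert some of the data in them to Parquet files. However, I'm struggling to read them into pandas/pyarrow. I think this is related to the way the files were originally created.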

If I open the file using h5py, the data looks exactly how I would expect.

import h5py

file_path = "/data/some_file.hdf5"
hdf = h5py.File(file_path, "r")
print(list(hdf.keys()))

gives me

>>> ['foo', 'bar', 'baz']

In this case I'm interested in the group "bar", which has 3 items in it.

If I try to read the data in using HDFStore, I am unable to access any of the groups.

import pandas as pd

file_path = "/data/some_file.hdf5"
store = pd.HDFStore(file_path, "r")

The HDFStore object then has no keys or groups.

assert not store.groups()
assert not store.keys()

And if I try to access the data, I get the following error:

bar = store.get("/bar")

TypeError: cannot create a storer if the object is not existing nor a value are passed

Similarly, if I try to use pd.read_hdf, it looks like the file is empty.

ValueError: Dataset(s) …
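One possible explanation is that `pd.HDFStore` is built on PyTables and expects the layout pandas itself writes, so a file created directly with h5py can look empty to it. Below is a minimal sketch of a workaround that reads the group with h5py and builds the DataFrame by hand. It assumes every item directly under the group is a 1-D dataset of the same length, which the question does not confirm. The helper `group_to_frame`, the dataset names `x`, `y`, `z` and the temporary demo file are hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np
import pandas as pd


def group_to_frame(file_path, group_name):
    """Read each dataset directly under `group_name` into one DataFrame column."""
    with h5py.File(file_path, "r") as hdf:
        group = hdf[group_name]
        columns = {
            name: item[()]  # item[()] loads the whole dataset into memory
            for name, item in group.items()
            if isinstance(item, h5py.Dataset)  # skip nested groups
        }
    return pd.DataFrame(columns)


# Hypothetical stand-in for the real file: group "bar" with three 1-D datasets.
demo_path = os.path.join(tempfile.mkdtemp(), "some_file.hdf5")
with h5py.File(demo_path, "w") as hdf:
    bar = hdf.create_group("bar")
    bar.create_dataset("x", data=np.arange(4))
    bar.create_dataset("y", data=np.linspace(0.0, 1.0, 4))
    bar.create_dataset("z", data=np.ones(4))

bar_df = group_to_frame(demo_path, "bar")
```

If the assumption holds for the real files, `bar_df.to_parquet("bar.parquet")` would then write the Parquet file, provided pyarrow or fastparquet is installed.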

Score: 4 | Answers: 1 | Views: 4453