I have a large HDF5 file that looks like this:
A/B/dataset1, dataset2
A/C/dataset1, dataset2
A/D/dataset1, dataset2
A/E/dataset1, dataset2
...
I want to create a new file that contains only A/B/dataset1, dataset2 and A/C/dataset1, dataset2.
What is the easiest way to do this in Python?
I tried:
import h5py

fs = h5py.File('source.h5', 'r')
fd = h5py.File('dest.h5', 'w')
fs.copy('A/B', fd)
The problem is that dest.h5 ends up containing:
B/dataset1, dataset2
so part of the tree (the parent group A) is missing.
Yos*_*ian 23
fs.copy('A/B', fd) does not copy the path /A/B/ into fd; it copies only the group B (as you found out!). So you first need to create the rest of the path:
fd.create_group('A')
fs.copy('A/B', fd['/A'])
Or, if you will be using that group a lot:
fd_A = fd.create_group('A')
fs.copy('A/B', fd_A)
This copies the group B from fs['/A/B'] into fd['/A']:
In [1]: fd['A/B'].keys()
Out[1]: [u'dataset1', u'dataset2']
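To see the difference side by side, here is a minimal, self-contained sketch. It assumes a throwaway source file built in a temporary directory; the file names flat.h5 and nested.h5 and the dataset values are made up for illustration and are not from the question.

```python
import os
import tempfile

import h5py

demo_dir = tempfile.mkdtemp()

# Hypothetical stand-in for source.h5 with a single group A/B
with h5py.File(os.path.join(demo_dir, "source.h5"), "w") as fs:
    fs.create_dataset("A/B/dataset1", data=[1, 2, 3])
    fs.create_dataset("A/B/dataset2", data=[4, 5, 6])

    # Copying straight into the file root: only B arrives, A is lost
    with h5py.File(os.path.join(demo_dir, "flat.h5"), "w") as flat:
        fs.copy("A/B", flat)
        flat_top = list(flat.keys())

    # Creating the parent group first keeps the A/B path
    with h5py.File(os.path.join(demo_dir, "nested.h5"), "w") as nested:
        fs.copy("A/B", nested.create_group("A"))
        nested_top = list(nested.keys())
        nested_b = sorted(nested["A/B"].keys())

print(flat_top, nested_top, nested_b)
```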
Here is a way to do this automatically:
# Get the name of the parent for the group we want to copy
group_path = fs['/A/B'].parent.name
# Check that this group exists in the destination file; if it doesn't, create it
# This will create the parents too, if they don't exist
group_id = fd.require_group(group_path)
# Copy fs:/A/B/ to fd:/A/G
fs.copy('/A/B', group_id, name="G")
print(fd['/A/G'].keys())
# [u'dataset1', u'dataset2']
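Applied to the original question (keep only A/B and A/C), the same idea can be wrapped in a small helper. This is a sketch rather than a definitive implementation: the function name copy_groups is hypothetical, and the source file is a small stand-in built in a temporary directory.

```python
import os
import tempfile

import h5py


def copy_groups(src_path, dst_path, group_paths):
    """Copy each listed group from src to dst, keeping its full path."""
    with h5py.File(src_path, "r") as fs, h5py.File(dst_path, "w") as fd:
        for path in group_paths:
            # Name of the parent group, e.g. "/A" for "A/B"
            parent_name = fs[path].parent.name
            # Create the parent in the destination if it is not there yet
            parent = fd.require_group(parent_name)
            # Copy the group under that parent, keeping its own name
            fs.copy(path, parent)


# Build a small stand-in for the source file (hypothetical data)
work_dir = tempfile.mkdtemp()
src = os.path.join(work_dir, "source.h5")
dst = os.path.join(work_dir, "dest.h5")
with h5py.File(src, "w") as f:
    for group in ("B", "C", "D", "E"):
        for ds in ("dataset1", "dataset2"):
            f.create_dataset(f"A/{group}/{ds}", data=[1, 2, 3])

copy_groups(src, dst, ["A/B", "A/C"])

with h5py.File(dst, "r") as f:
    copied = sorted(f["A"].keys())
    b_members = sorted(f["A/B"].keys())
print(copied, b_members)
```

Because require_group returns the existing group when it is already present, the second call for A/C reuses the A group created for A/B.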