PyYAML "include file" and YAML aliases (anchors/references)

ben*_*ams 4 yaml cross-reference pyyaml

I have a large YAML file that makes heavy use of YAML anchors and references, for example:

warehouse:
  obj1: &obj1
    key1: 1
    key2: 2
specific:
  spec1: 
    <<: *obj1
  spec2:
    <<: *obj1
    key1: 10
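
For reference, here is a minimal sketch of how the single-file version above loads with PyYAML's safe_load, which resolves the merge key (<<) on its own. The variable names are illustrative only and not part of the original file.

```python
import yaml

# The single-file document from the question, inlined as a string.
document = """\
warehouse:
  obj1: &obj1
    key1: 1
    key2: 2
specific:
  spec1:
    <<: *obj1
  spec2:
    <<: *obj1
    key1: 10
"""

data = yaml.safe_load(document)

# spec1 takes every key from obj1; spec2 takes them too, then overrides key1.
spec1 = data["specific"]["spec1"]
spec2 = data["specific"]["spec2"]
print(spec1)
print(spec2)
```

As long as the anchor and its aliases live in one document, nothing special is needed; the difficulty only starts once they are split across files.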

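The file became too big, so I looked for a solution that would let me split it into 2 files, warehouse.yaml and specific.yaml, and include warehouse.yaml inside specific.yaml. I read this simple article, which describes how to achieve that with PyYAML, but it also says that the merge key (<<) is not supported.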

I did get an error:

yaml.composer.ComposerError: found undefined alias 'obj1'

when I tried to go that way.

So I started looking for alternatives, but I got confused because I don't know much about PyYAML.

Can I get the merge key support I want? Are there any other solutions?

Ant*_*hon 6

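Crucial for the handling of anchors and aliases in PyYAML is the dict anchors that is part of the Composer. It maps anchors to nodes so that aliases can be looked up. Its existence is limited by the existence of the Composer, which is a composite element of the Loader that you use.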

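That Loader instance only exists during the call to yaml.load(), so there is no trivial way to extract the anchors afterwards: first you would have to make the instance of the Loader() persist, and then make sure that the normal compose_document() method is not called (which, among other things, does self.anchors = {} to be clean for the next document in a single stream).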
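
The reset can be observed directly. The following is a minimal sketch against PyYAML; KeepAnchorsLoader and anchors_after_compose are hypothetical names made up for this illustration, and the overridden method simply repeats what compose_document() does minus the reset.

```python
import yaml

TEXT = "a: &x 1\nb: *x\n"


class KeepAnchorsLoader(yaml.SafeLoader):
    """Hypothetical subclass: compose_document() without the anchor reset."""

    def compose_document(self):
        self.get_event()                       # drop DOCUMENT-START
        node = self.compose_node(None, None)   # compose the root node
        self.get_event()                       # drop DOCUMENT-END
        return node                            # self.anchors is left untouched


def anchors_after_compose(loader_class):
    """Compose TEXT with the given loader class and report surviving anchors."""
    loader = loader_class(TEXT)
    try:
        loader.get_single_node()
        return sorted(loader.anchors)
    finally:
        loader.dispose()


default_anchors = anchors_after_compose(yaml.SafeLoader)
kept_anchors = anchors_after_compose(KeepAnchorsLoader)
print(default_anchors, kept_anchors)
```

With the stock SafeLoader the dict is empty once the document has been composed; with the subclass the anchor x is still there and could be handed to another loader.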

To complicate things further, if you had warehouse.yaml:

warehouse:
  obj1: &obj1
    key1: 1
    key2: 2

and specific.yaml:

warehouse: !include warehouse.yaml
specific:
  spec1:
    <<: *obj1
  spec2:
    <<: *obj1
    key1: 10

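Even if you could preserve, extract, and pass on the anchor information, you could never get this to work with those two files, because the composer handling specific.yaml encounters the undefined alias much earlier than the tag !include gets used for construction (and for filling anchors).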
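
That ordering can be checked with a small sketch. DemoLoader and record_include are hypothetical names used only here; the constructor merely records whether it was ever called and does not open any file.

```python
import yaml

calls = []  # filled only if the !include constructor actually runs


class DemoLoader(yaml.SafeLoader):
    """Hypothetical loader used only for this demonstration."""


def record_include(loader, node):
    calls.append(node.value)
    return {}


DemoLoader.add_constructor("!include", record_include)

specific_text = """\
warehouse: !include warehouse.yaml
specific:
  spec1:
    <<: *obj1
"""

try:
    yaml.load(specific_text, Loader=DemoLoader)
    problem = None
except yaml.composer.ComposerError as exc:
    problem = exc.problem  # the short message, without position marks

print(problem, calls)
```

The whole document is composed into nodes before any constructor runs, so the alias lookup fails while calls is still empty.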

What you can do to circumvent this problem is include specific.yaml:

specific:
  spec1:
    <<: *obj1
  spec2:
    <<: *obj1
    key1: 10

from warehouse.yaml:

warehouse:
  obj1: &obj1
    key1: 1
    key2: 2
specific: !include specific.yaml

Alternatively, include both in a third file. Note that the key specific appears in both files.

With those two files, run:

import sys
from ruamel import yaml

def my_compose_document(self):
    self.get_event()
    node = self.compose_node(None, None)
    self.get_event()
    # self.anchors = {}    # <<<< commented out
    return node

yaml.SafeLoader.compose_document = my_compose_document

# adapted from http://code.activestate.com/recipes/577613-yaml-include-support/
def yaml_include(loader, node):
    with open(node.value) as inputfile:
        return list(my_safe_load(inputfile, master=loader).values())[0]
#              leave out the [0] if your include file drops the key ^^^

yaml.add_constructor("!include", yaml_include, Loader=yaml.SafeLoader)


def my_safe_load(stream, Loader=yaml.SafeLoader, master=None):
    loader = Loader(stream)
    if master is not None:
        loader.anchors = master.anchors
    try:
        return loader.get_single_data()
    finally:
        loader.dispose()

with open('warehouse.yaml') as fp:
    data = my_safe_load(fp)
yaml.safe_dump(data, sys.stdout, default_flow_style=False)

This gives:

specific:
  spec1:
    key1: 1
    key2: 2
  spec2:
    key1: 10
    key2: 2
warehouse:
  obj1:
    key1: 1
    key2: 2

If your specific.yaml does not have the top-level key specific:

spec1:
  <<: *obj1
spec2:
  <<: *obj1
  key1: 10

then replace the last line of yaml_include() with:

return my_safe_load(inputfile, master=loader)

The above was done with ruamel.yaml (disclaimer: I am the author of that package) and tested with Python 2.7 and 3.6. By changing the import, it will also work with PyYAML.
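
As a rough illustration of the PyYAML variant, here is a self-contained sketch that uses a subclass instead of patching SafeLoader globally. IncludeLoader and load_with_anchors are hypothetical names, the two files are written to a temporary directory just for the demonstration, and resolving the include path relative to the including file (through the stream name stored on the loader) is an assumption that goes beyond the code above. It uses the specific.yaml variant without the top-level key.

```python
import os
import tempfile

import yaml


class IncludeLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that keeps anchors after composing."""

    def compose_document(self):
        self.get_event()                       # drop DOCUMENT-START
        node = self.compose_node(None, None)
        self.get_event()                       # drop DOCUMENT-END
        return node                            # no self.anchors = {} here


def load_with_anchors(stream, master=None):
    loader = IncludeLoader(stream)
    if master is not None:
        loader.anchors = master.anchors        # share the including file's anchors
    try:
        return loader.get_single_data()
    finally:
        loader.dispose()


def yaml_include(loader, node):
    # Assumption: resolve the path relative to the including file, using the
    # stream name that PyYAML's reader stores on the loader.
    base = os.path.dirname(loader.name)
    with open(os.path.join(base, node.value)) as inputfile:
        return load_with_anchors(inputfile, master=loader)


IncludeLoader.add_constructor("!include", yaml_include)

WAREHOUSE = """\
warehouse:
  obj1: &obj1
    key1: 1
    key2: 2
specific: !include specific.yaml
"""

SPECIFIC = """\
spec1:
  <<: *obj1
spec2:
  <<: *obj1
  key1: 10
"""

with tempfile.TemporaryDirectory() as tmp:
    for name, text in (("warehouse.yaml", WAREHOUSE), ("specific.yaml", SPECIFIC)):
        with open(os.path.join(tmp, name), "w") as fp:
            fp.write(text)
    with open(os.path.join(tmp, "warehouse.yaml")) as fp:
        data = load_with_anchors(fp)

print(yaml.safe_dump(data, default_flow_style=False))
```

Subclassing keeps the changed compose_document() and the !include constructor away from every other use of yaml.SafeLoader in the same process, at the cost of having to pass the custom loader around explicitly.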


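With the new ruamel.yaml API, the above can be simplified considerably, because the loader handed to the yaml_include() constructor knows about the YAML instance. Of course, you still need an adapted compose_document that does not destroy the anchors. Assuming specific.yaml does not use the top-level key specific, the following gives the same output as before.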

import sys
from ruamel.std.pathlib import Path
from ruamel.yaml import YAML, version_info

yaml = YAML(typ='safe', pure=True)
yaml.default_flow_style = False


def my_compose_document(self):
    self.parser.get_event()
    node = self.compose_node(None, None)
    self.parser.get_event()
    # self.anchors = {}    # <<<< commented out
    return node

yaml.Composer.compose_document = my_compose_document

# adapted from http://code.activestate.com/recipes/577613-yaml-include-support/
def yaml_include(loader, node):
    y = loader.loader
    yaml = YAML(typ=y.typ, pure=y.pure)  # same values as including YAML
    yaml.composer.anchors = loader.composer.anchors
    return yaml.load(Path(node.value))

yaml.Constructor.add_constructor("!include", yaml_include)

data = yaml.load(Path('warehouse.yaml'))
yaml.dump(data, sys.stdout)