puz*_*let 17 python parsing yaml pyyaml
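I'm building a documentation generator from YAML data, and I want it to record which line of the YAML file each item was generated from. What is the best way to do this? So, if the YAML file looks like this: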
- key1: item 1
  key2: item 2
- key1: another item 1
  key2: another item 2
I want something like this:
[
    {'__line__': 1, 'key1': 'item 1', 'key2': 'item 2'},
    {'__line__': 3, 'key1': 'another item 1', 'key2': 'another item 2'},
]
I'm currently using PyYAML, but any other library is fine as long as I can use it from Python.
puz*_*let 10
I got this working by hooking into Composer.compose_node and Constructor.construct_mapping:
import yaml
from yaml.composer import Composer
from yaml.constructor import Constructor

def main():
    # Read the file inside a context manager so the handle is closed
    with open('data.yml') as f:
        loader = yaml.Loader(f.read())

    def compose_node(parent, index):
        # the line number where the previous token has ended (plus empty lines)
        line = loader.line
        node = Composer.compose_node(loader, parent, index)
        node.__line__ = line + 1
        return node

    def construct_mapping(node, deep=False):
        mapping = Constructor.construct_mapping(loader, node, deep=deep)
        mapping['__line__'] = node.__line__
        return mapping

    # Override the hooks on this loader instance only
    loader.compose_node = compose_node
    loader.construct_mapping = construct_mapping
    data = loader.get_single_data()
    print(data)

if __name__ == '__main__':
    main()
The code below builds on the earlier good answer. If you also need the line numbers of leaf attributes, it may help:
from pprint import pprint

from yaml.composer import Composer
from yaml.constructor import Constructor
from yaml.nodes import ScalarNode
from yaml.resolver import BaseResolver
from yaml.loader import Loader

class LineLoader(Loader):
    def __init__(self, stream):
        super(LineLoader, self).__init__(stream)

    def compose_node(self, parent, index):
        # the line number where the previous token has ended (plus empty lines)
        line = self.line
        node = Composer.compose_node(self, parent, index)
        node.__line__ = line + 1
        return node

    def construct_mapping(self, node, deep=False):
        node_pair_lst = node.value
        node_pair_lst_for_appending = []

        # For every key, add a shadow '__line__<key>' entry holding its line
        for key_node, value_node in node_pair_lst:
            shadow_key_node = ScalarNode(tag=BaseResolver.DEFAULT_SCALAR_TAG, value='__line__' + key_node.value)
            shadow_value_node = ScalarNode(tag=BaseResolver.DEFAULT_SCALAR_TAG, value=key_node.__line__)
            node_pair_lst_for_appending.append((shadow_key_node, shadow_value_node))

        node.value = node_pair_lst + node_pair_lst_for_appending
        mapping = Constructor.construct_mapping(self, node, deep=deep)
        return mapping

if __name__ == '__main__':
    stream = """ # The first line
key1: # This is the second line
  key1_1: item1
  key1_2: item1_2
  key1_3:
    - item1_3_1
    - item1_3_2
key2: item 2
key3: another item 1
"""
    loader = LineLoader(stream)
    data = loader.get_single_data()
    pprint(data)
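The output is shown below. Each key gets a companion key at the same level, named "__line__" followed by the original key, that holds its line number.
PS: I have not yet found a way to report the line for list items.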
{'__line__key1': 2,
 '__line__key2': 8,
 '__line__key3': 9,
 'key1': {'__line__key1_1': 3,
          '__line__key1_2': 4,
          '__line__key1_3': 5,
          'key1_1': 'item1',
          'key1_2': 'item1_2',
          'key1_3': ['item1_3_1', 'item1_3_2']},
 'key2': 'item 2',
 'key3': 'another item 1'}
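For list items, one possible workaround (a sketch, not part of the answer above) is to skip construction entirely and walk the node tree returned by `yaml.compose`, reading each node's `start_mark`. The helper name `collect_lines` and the path-tuple format are made up for illustration, and the sketch assumes every mapping key is a plain scalar.

```python
import yaml
from yaml.nodes import MappingNode, SequenceNode

def collect_lines(node, path=()):
    """Yield (path, 1-based line) for every mapping entry and list item.

    `collect_lines` is a hypothetical helper; it assumes scalar mapping keys.
    """
    if isinstance(node, MappingNode):
        for key_node, value_node in node.value:
            child = path + (key_node.value,)
            # For mapping entries, report the line of the key itself
            yield child, key_node.start_mark.line + 1
            yield from collect_lines(value_node, child)
    elif isinstance(node, SequenceNode):
        for index, item_node in enumerate(node.value):
            child = path + (index,)
            # For list items, report the line where the item starts
            yield child, item_node.start_mark.line + 1
            yield from collect_lines(item_node, child)

DOC = """\
key1:
  key1_3:
    - item1_3_1
    - item1_3_2
key2: item 2
"""

root = yaml.compose(DOC, Loader=yaml.SafeLoader)
lines = dict(collect_lines(root))
for item_path, line in lines.items():
    print(line, item_path)
```

Because this works on nodes rather than constructed data, the line numbers live in a separate lookup table keyed by path instead of being mixed into the loaded dictionaries.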
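If you are using ruamel.yaml >= 0.9 (I am the author) with the RoundTripLoader, you can access the lc attribute on collection items to get the line and column where they start in the source YAML: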
def test_item_04(self):
    # `load` is a helper that is not shown here; it presumably dedents the
    # string and loads it with the round-trip loader
    data = load("""
    # testing line and column based on SO
    # http://stackoverflow.com/questions/13319067/
    - key1: item 1
      key2: item 2
    - key3: another item 1
      key4: another item 2
    """)
    assert data[0].lc.line == 2
    assert data[0].lc.col == 2
    assert data[1].lc.line == 4
    assert data[1].lc.col == 2
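(Lines and columns are counted starting from 0.)
This answer shows how to add the lc attribute to string types during loading.
Here is an improved version of puzzlet's answer: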
import yaml
from yaml.loader import SafeLoader

class SafeLineLoader(SafeLoader):
    def construct_mapping(self, node, deep=False):
        mapping = super(SafeLineLoader, self).construct_mapping(node, deep=deep)
        # Add 1 so line numbering starts at 1
        mapping['__line__'] = node.start_mark.line + 1
        return mapping
You can use it like this:
data = yaml.load(whatever, Loader=SafeLineLoader)
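As a rough end-to-end check, the snippet below runs the same SafeLineLoader against the YAML from the question. The `DOC` string and the reporting loop are only illustrative; `whatever` above can be any string or open file.

```python
import yaml
from yaml.loader import SafeLoader

class SafeLineLoader(SafeLoader):
    def construct_mapping(self, node, deep=False):
        mapping = super().construct_mapping(node, deep=deep)
        # Add 1 so line numbering starts at 1
        mapping['__line__'] = node.start_mark.line + 1
        return mapping

# The sample document from the question (illustrative input)
DOC = """\
- key1: item 1
  key2: item 2
- key1: another item 1
  key2: another item 2
"""

data = yaml.load(DOC, Loader=SafeLineLoader)
for item in data:
    print(f"line {item['__line__']}: {item['key1']}")
# prints:
# line 1: item 1
# line 3: another item 1
```

Note that every mapping, including nested ones, receives a `__line__` key, so code that iterates over the loaded keys may need to skip it.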