Jon*_*Jon 53 python validation yaml
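One nice thing about XML is the ability to validate a document against an XSD. YAML has no such feature, so how can I validate that the YAML document I open matches the format my application expects?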
Jac*_*lly 33
Given that JSON and YAML are pretty similar beasts, you can use JSON Schema to validate a sizable subset of YAML. Here is a code snippet (you'll need PyYAML and jsonschema installed):
from jsonschema import validate
import yaml

schema = """
type: object
properties:
  testing:
    type: array
    items:
      enum:
        - this
        - is
        - a
        - test
"""

good_instance = """
testing: ['this', 'is', 'a', 'test']
"""

# safe_load avoids constructing arbitrary Python objects from the YAML input
validate(yaml.safe_load(good_instance), yaml.safe_load(schema))  # passes

# Now let's try a bad instance...
bad_instance = """
testing: ['this', 'is', 'a', 'bad', 'test']
"""

validate(yaml.safe_load(bad_instance), yaml.safe_load(schema))
# Fails with:
# ValidationError: 'bad' is not one of ['this', 'is', 'a', 'test']
#
# Failed validating 'enum' in schema['properties']['testing']['items']:
#     {'enum': ['this', 'is', 'a', 'test']}
#
# On instance['testing'][3]:
#     'bad'
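One problem with this is that if your schema spans multiple files and you use "$ref" to reference the other files, then those other files need to be JSON, I think. There are probably ways around that, though. In my own project, I specify the schema in JSON files while the instances are YAML.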
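The JSON-schema-plus-YAML-instance setup described above might look roughly like the sketch below. The schema text is inlined as a string only to keep the example self-contained; in a real project it would be read from a .json file. The helper name `is_valid` and the `required` entry are illustrative assumptions, not part of the original snippet.

```python
import json

import yaml
from jsonschema import ValidationError, validate

# Hypothetical schema; in practice this text would live in its own .json file.
SCHEMA_JSON = """
{
  "type": "object",
  "properties": {
    "testing": {
      "type": "array",
      "items": {"enum": ["this", "is", "a", "test"]}
    }
  },
  "required": ["testing"]
}
"""


def is_valid(yaml_text, schema):
    """Return True if the YAML text parses to data that satisfies the schema."""
    try:
        validate(yaml.safe_load(yaml_text), schema)
    except ValidationError:
        return False
    return True


schema = json.loads(SCHEMA_JSON)
good = is_valid("testing: [this, is, a, test]", schema)
bad = is_valid("testing: [this, is, a, bad, test]", schema)
```

Since both formats end up as plain dicts and lists, jsonschema does not care which parser produced the schema or the instance.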
Lio*_*ior 12
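Try Rx, which has a Python implementation. It works on JSON and YAML.
From the Rx site:
"When adding an API to your web service, you have to choose how to encode the data you send across the line. XML is one common choice for this, but it can grow arcane and cumbersome pretty quickly. Lots of webservice authors want to avoid thinking about XML, and instead choose formats that provide a few simple data types that correspond to common data structures in modern programming languages. In other words, JSON and YAML. Unfortunately, while these formats make it easy to pass around complex data structures, they lack a system for validation. XML has XML Schemas and RELAX NG, but these are complicated and sometimes confusing standards. They're not very portable to the kind of data structure provided by JSON, and if you wanted to avoid XML as a data encoding, writing more XML to validate the first XML is probably even less appealing.
Rx is meant to provide a system for data validation that matches up with JSON-style data structures and is as easy to work with as JSON itself."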
Una*_*dra 12
Pydantic has not been mentioned yet.
From their example:
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str = 'John Doe'  # annotated so that newer Pydantic versions accept the field
    signup_ts: Optional[datetime] = None
    friends: List[int] = []


# Parse your YAML into a dictionary, then validate against your model.
external_data = {
    'id': '123',
    'signup_ts': '2019-06-01 12:22',
    'friends': [1, 2, '3'],
}
user = User(**external_data)
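To make the "parse your YAML, then validate" step concrete, here is a minimal sketch that feeds the output of PyYAML straight into a model and reports validation failures. The `ServerConfig` model, its fields, and the `load_config` helper are hypothetical and exist only for illustration.

```python
from typing import List

import yaml
from pydantic import BaseModel, ValidationError


class ServerConfig(BaseModel):
    # Hypothetical model describing the expected shape of the YAML document.
    host: str
    port: int = 8080
    tags: List[str] = []


def load_config(yaml_text):
    """Parse YAML text and validate the resulting mapping against ServerConfig."""
    return ServerConfig(**yaml.safe_load(yaml_text))


config = load_config("""
host: example.org
port: '9090'
""")

try:
    load_config("""
host: example.org
port: not-a-port
""")
    failed_fields = []
except ValidationError as exc:
    # Each error records the location of the offending field.
    failed_fields = [error['loc'] for error in exc.errors()]
```

Note that Pydantic coerces compatible values by default, so the quoted '9090' ends up as an integer on the model, while a value that cannot be converted raises ValidationError.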
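Yes, support for validation is vital for lots of important use cases. See e.g. "YAML and the importance of Schema Validation" « Stuart Gunter
As already mentioned, Rx is available for various languages, and Kwalify is available for Ruby and Java.
See also the PyYAML discussion: YAMLSchemaDiscussion.
A related effort is JSON Schema, which even has some IETF standardization activity (draft-zyp-json-schema-03 - A JSON Media Type for Describing the Structure and Meaning of JSON Documents)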
You can load the YAML document as a dict and use the schema library to check it:
import datetime

import yaml
from schema import Schema, And, Use, Optional, SchemaError

schema = Schema(
    {
        'created': And(datetime.datetime),
        'author': And(str),
        'email': And(str),
        'description': And(str),
        Optional('tags'): And(str, lambda s: len(s) >= 0),
        'setup': And(list),
        'steps': And(list, lambda steps: all('=>' in s for s in steps),
                     error='Steps should be array of string '
                           'and contain "=>" to separate '
                           'actions and expectations'),
        'teardown': And(list)
    }
)

# filepath is the path to the YAML document being checked
with open(filepath) as f:
    data = yaml.safe_load(f)

try:
    schema.validate(data)
except SchemaError as e:
    print(e)
These look good. The YAML parser can handle syntax errors, and one of these libraries can validate the data structure.
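That split between syntax and structure could be sketched as two separate try blocks, one around the parser and one around the validator. jsonschema is used here only as one possible validator; the schema and the `check` helper are made up for illustration.

```python
import yaml
from jsonschema import ValidationError, validate

# Hypothetical schema: a mapping with a required integer "port".
SCHEMA = {
    "type": "object",
    "properties": {"port": {"type": "integer"}},
    "required": ["port"],
}


def check(yaml_text):
    """Return 'syntax error', 'schema error', or 'ok' for the given YAML text."""
    try:
        data = yaml.safe_load(yaml_text)  # stage 1: the parser catches bad syntax
    except yaml.YAMLError:
        return "syntax error"
    try:
        validate(data, SCHEMA)  # stage 2: the validator checks the structure
    except ValidationError:
        return "schema error"
    return "ok"


results = [check("port: [80"), check("port: eighty"), check("port: 80")]
```

Keeping the two stages apart makes it easier to tell the user whether the file is malformed YAML or well-formed YAML with the wrong content.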
I am in the same situation: I need to validate the elements of a YAML file.
At first I thought PyYAML tags were the best and simplest way, but I later decided to go with PyKwalify, which actually defines a schema for YAML.
YAML supports tags, which let you enforce basic checks by prefixing a value with its data type, e.g. for an integer: !!int "123"
More on PyYAML tags: http://pyyaml.org/wiki/PyYAMLDocumentation#Tags This works, but if you are going to expose the file to end users, it might cause confusion. I did some research on defining a schema for YAML. The idea is to validate the YAML against a corresponding schema for basic data type checks. Custom validations, such as IP addresses or random strings, can be added as well. That way the schema lives separately and the YAML stays simple and readable.
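For reference, the tag approach behaves roughly as in the sketch below when loaded with PyYAML. The key names are made up, and the exact exception type raised for a bad value is treated as an assumption, which is why two types are caught.

```python
import yaml

# The !!int tag forces the quoted scalar to be constructed as an integer.
doc = yaml.safe_load("""
plain: "123"
tagged: !!int "123"
""")

# A value that cannot be converted to the tagged type is rejected at load time.
try:
    yaml.safe_load('tagged: !!int "abc"')
    rejected = False
except (ValueError, yaml.YAMLError):
    rejected = True
```

This shows why tags can confuse end users: the type check lives inside the data file itself rather than in a separate schema.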
I am unable to post more links. Please google "schema for YAML" to find the schema discussions.
There is a package called PyKwalify which serves this purpose: https://pypi.python.org/pypi/pykwalify
This package best fits my requirements. I tried it with a small example in my local setup, and it works. Here is the sample schema file.
#sample schema
type: map
mapping:
  Emp:
    type: map
    mapping:
      name:
        type: str
        required: yes
      email:
        type: str
      age:
        type: int
      birth:
        type: str
Valid YAML file for this schema
---
Emp:
  name: "abc"
  email: "xyz@gmail.com"
  age: yy
  birth: "xx/xx/xxxx"
Thanks
I found Cerberus to be very reliable, with excellent documentation and easy to use.
Here is a basic implementation example:
my_yaml.yaml:
name: 'my_name'
date: 2017-10-01
metrics:
  percentage:
    value: 87
    trend: stable
Define the validation schema in schema.py:
{
    'name': {
        'required': True,
        'type': 'string'
    },
    'date': {
        'required': True,
        'type': 'date'
    },
    'metrics': {
        'required': True,
        'type': 'dict',
        'schema': {
            'percentage': {
                'required': True,
                'type': 'dict',
                'schema': {
                    'value': {
                        'required': True,
                        'type': 'number',
                        'min': 0,
                        'max': 100
                    },
                    'trend': {
                        'type': 'string',
                        'nullable': True,
                        # inline flag moved to the front; newer Pythons reject it mid-pattern
                        'regex': '(?i)^(down|equal|up)$'
                    }
                }
            }
        }
    }
}
Load the YAML document with PyYAML:
import yaml


def __load_doc():
    # __yaml_path is the path to my_yaml.yaml, defined elsewhere
    with open(__yaml_path, 'r') as stream:
        try:
            return yaml.safe_load(stream)
        except yaml.YAMLError as exception:
            raise exception
Validating the YAML document is then straightforward:
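from ast import literal_eval
from cerberus import Validator

# schema.py holds a plain dict literal, so literal_eval is enough (and safer than eval)
with open('PATH_TO/schema.py', 'r') as f:
    schema = literal_eval(f.read())
v = Validator(schema)
doc = __load_doc()
print(v.validate(doc, schema))
print(v.errors)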
Keep in mind that Cerberus is a format-agnostic data validation tool, which means it can support formats other than YAML, such as JSON, XML and so on.