How to set up local file references in python-jsonschema documents?

Chr*_* W. 14 python json jsonschema python-jsonschema

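I have a set of jsonschema-compliant documents. Some documents contain references to other documents (via the $ref attribute). I do not wish to host these documents so that they are accessible at an HTTP URI, so all references are relative. All documents live in a local folder structure.

How can I get python-jsonschema to correctly use my local filesystem to load the referenced documents?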


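For example, suppose I have a document with the filename defs.json that contains some definitions, and I try to load a different document which references it, like: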

{
  "allOf": [
    {"$ref":"defs.json#/definitions/basic_event"},
    {
      "type": "object",
      "properties": {
        "action": {
          "type": "string",
          "enum": ["page_load"]
        }
      },
      "required": ["action"]
    }
  ]
}

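I get an error: RefResolutionError: <urlopen error [Errno 2] No such file or directory: '/defs.json'>

I'm on a Linux box, in case that matters.


(I'm writing this as a self-answered Q&A because I had a hard time figuring this out and observed other folks having trouble too.)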

Dan*_*der 13

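The hardest part for me was figuring out how to resolve references across a set of schemas that $ref each other (I am new to JSON Schema). It turns out the key is to create the RefResolver with a store, which is a dict that maps from URL to schema. Building on @devin-p's answer: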

import json

from jsonschema import RefResolver, Draft7Validator

base = """
{
  "$id": "base.schema.json",
  "type": "object",
  "properties": {
    "prop": {
      "type": "string"
    }
  },
  "required": ["prop"]
}
"""

extend = """
{  
  "$id": "extend.schema.json",
  "allOf": [
    {"$ref": "base.schema.json#"},
    {
      "properties": {
        "extra": {
          "type": "boolean"
        }
      },
    "required": ["extra"]
    }
  ]
}
"""

extend_extend = """
{
  "$id": "extend_extend.schema.json",
  "allOf": [
    {"$ref": "extend.schema.json#"},
    {
      "properties": {
        "extra2": {
          "type": "boolean"
        }
      },
    "required": ["extra2"]
    }
  ]
}
"""

data = """
{
"prop": "This is the property string",
"extra": true,
"extra2": false
}
"""

schema = json.loads(base)
extendedSchema = json.loads(extend)
extendedExtendSchema = json.loads(extend_extend)
schema_store = {
    schema['$id'] : schema,
    extendedSchema['$id'] : extendedSchema,
    extendedExtendSchema['$id'] : extendedExtendSchema,
}


resolver = RefResolver.from_schema(schema, store=schema_store)
validator = Draft7Validator(extendedExtendSchema, resolver=resolver)

jsonData = json.loads(data)
validator.validate(jsonData)

The above was built with jsonschema==3.2.0.
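When the schemas live in files rather than in string literals, the same store can be assembled from a folder. The following is a minimal sketch, assuming all schemas sit in one directory as *.json files; the helper name load_schema_store is hypothetical, and falling back to the file name when "$id" is absent is an assumption that matches relative refs such as "defs.json#/...".

```python
import json
from pathlib import Path


def load_schema_store(schema_dir):
    """Map each schema's "$id" (or its file name, if "$id" is absent) to the parsed schema."""
    store = {}
    for path in sorted(Path(schema_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fp:
            schema = json.load(fp)
        # Hypothetical convention: a schema without "$id" is keyed by its file name.
        store[schema.get("$id", path.name)] = schema
    return store
```

The resulting dict can then be passed as store=... to RefResolver.from_schema(), in the same way as schema_store above.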


Chr*_* W. 6

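You must build a custom jsonschema.RefResolver for each schema that uses a relative reference, and make sure your resolver knows where on the filesystem the given schema lives.

For example: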

import os
import json
from jsonschema import Draft4Validator, RefResolver # We prefer Draft7, but jsonschema 3.0 is still in alpha as of this writing 


abs_path_to_schema = '/path/to/schema-doc-foobar.json'
with open(abs_path_to_schema, 'r') as fp:
  schema = json.load(fp)

resolver = RefResolver(
  # The key part is here where we build a custom RefResolver 
  # and tell it where *this* schema lives in the filesystem
  # Note that `file:` is for unix systems
  # (RefResolver's parameters are named `base_uri` and `referrer`)
  base_uri='file:{}'.format(abs_path_to_schema),
  referrer=schema
)
Draft4Validator.check_schema(schema) # Unnecessary but a good idea
validator = Draft4Validator(schema, resolver=resolver, format_checker=None)

# Then you can...
data_to_validate = {...}  # placeholder: substitute the parsed JSON instance to validate
validator.validate(data_to_validate)
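The comment above notes that the `file:` prefix is Unix-specific. As a hedged alternative, the standard library can build the base URI in a platform-independent way, and urljoin shows how a relative $ref would then be resolved against it. This is only a sketch: the helper name schema_base_uri is hypothetical, and the path is the placeholder from the answer.

```python
from pathlib import Path
from urllib.parse import urljoin


def schema_base_uri(schema_file):
    """Return a file: URI for a schema file (Path.as_uri() needs an absolute path)."""
    return Path(schema_file).resolve().as_uri()


base_uri = schema_base_uri("/path/to/schema-doc-foobar.json")

# A relative $ref is resolved against the base URI, much like a relative link
# in a web page, so the sibling file next to the schema is what gets loaded.
resolved = urljoin(base_uri, "defs.json#/definitions/basic_event")
print(base_uri)
print(resolved)
```

The value of base_uri could be passed as the first argument of RefResolver in place of the hand-built string.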


Bra*_*ndt 5

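EDIT-1

Fixed a wrong reference ($ref) to the base schema. Updated the example to use the one from the docs: https://json-schema.org/understanding-json-schema/structuring.html

EDIT-2

As pointed out in the comments, the code below uses the following imports: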

from jsonschema import validate, RefResolver 
from jsonschema.validators import validator_for

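This is just another version of @Daniel's answer, which was the correct one for me. Basically, I decided to define $schema in a base schema. That frees the other schemas from declaring it and makes for a clear call when instantiating the resolver.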

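  • RefResolver.from_schema() takes (1) some schema and also (2) a schema store. It was not very clear to me whether the order mattered, or which "some" schema was relevant here. Hence the structure you see below.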
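To make the two roles more concrete: as far as I understand, the schema passed first supplies the base URI (its "$id", or an empty string if it has none), while the store maps document URIs to already-loaded schemas. The sketch below imitates that lookup with the standard library only. It is an illustration under those assumptions, not the library's actual code, and lookup_ref is a hypothetical name.

```python
from urllib.parse import urldefrag, urljoin


def lookup_ref(base_uri, ref, store):
    """Resolve `ref` against `base_uri`, then find the target inside `store`."""
    document_uri, fragment = urldefrag(urljoin(base_uri, ref))
    target = store[document_uri]  # a KeyError means the store lacks that document
    # Walk a JSON Pointer fragment such as "/definitions/basic_event".
    # Simplification: "~0"/"~1" escapes and array indices are not handled.
    for part in fragment.split("/")[1:]:
        target = target[part]
    return target


store = {
    "definitions.schema.json": {"type": "object"},
    "defs.json": {"definitions": {"basic_event": {"type": "object"}}},
}

# With an empty base URI, the ref is looked up in the store as written.
whole_document = lookup_ref("", "definitions.schema.json#", store)
# With a base URI, the ref is resolved relative to it first.
one_definition = lookup_ref("address.schema.json", "defs.json#/definitions/basic_event", store)
```

Under this reading, any schema without an "$id" works equally well as the first argument, as long as the store keys match the refs as written.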

I have the following:

base.schema.json:

{
  "$schema": "http://json-schema.org/draft-07/schema#"
}

definitions.schema.json:

{
  "type": "object",
  "properties": {
    "street_address": { "type": "string" },
    "city":           { "type": "string" },
    "state":          { "type": "string" }
  },
  "required": ["street_address", "city", "state"]
}

address.schema.json:

{
  "type": "object",

  "properties": {
    "billing_address": { "$ref": "definitions.schema.json#" },
    "shipping_address": { "$ref": "definitions.schema.json#" }
  }
}

I like this setup for two reasons:

  1. It makes for a cleaner call to RefResolver.from_schema():

    import json

    base = json.loads(open('base.schema.json').read())
    definitions = json.loads(open('definitions.schema.json').read())
    schema = json.loads(open('address.schema.json').read())

    # Key each schema by its $id, falling back to its file name
    schema_store = {
      base.get('$id','base.schema.json') : base,
      definitions.get('$id','definitions.schema.json') : definitions,
      schema.get('$id','address.schema.json') : schema,
    }

    resolver = RefResolver.from_schema(base, store=schema_store)

  2. Then I benefit from the handy tool the library provides to give you the best validator_for your schema (according to your $schema key):

    Validator = validator_for(base)

  3. And then just put them together to instantiate the validator:

    validator = Validator(schema, resolver=resolver)

Finally, you validate your data:

data = {
  "shipping_address": {
    "street_address": "1600 Pennsylvania Avenue NW",
    "city": "Washington",
    "state": "DC"   
  },
  "billing_address": {
    "street_address": "1st Street SE",
    "city": "Washington",
    "state": 32
  }
}
validator.validate(data)  # raises ValidationError because of "state": 32

  • This one will fail, because of "state": 32. Change it to "DC" and the data will validate.
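With a store-based setup like this, a document that was never added to the store typically only shows up as a RefResolutionError once validation reaches the ref. As a rough pre-flight check, the sketch below lists referenced documents that are absent from the store. The function names are hypothetical, and it assumes every value under a "$ref" key is a reference (a property that is itself named "$ref" would be misread).

```python
from urllib.parse import urldefrag, urljoin


def iter_refs(node):
    """Yield every "$ref" string found anywhere inside a parsed schema."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from iter_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_refs(item)


def missing_documents(store):
    """Return the set of referenced document URIs that are absent from `store`."""
    missing = set()
    for base_uri, schema in store.items():
        for ref in iter_refs(schema):
            document_uri, _fragment = urldefrag(urljoin(base_uri, ref))
            # Same-document refs ("#/...") resolve to base_uri itself or to "".
            if document_uri and document_uri not in store:
                missing.add(document_uri)
    return missing
```

Calling missing_documents(schema_store) before building the resolver should return an empty set when every referenced file has been loaded.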

  • This answer worked really well for me. Just wanted to point out the import dependencies for others trying this too: ```from jsonschema import validate, RefResolver``` ```from jsonschema.validators import validator_for``` (2 upvotes)