Python: convert all dictionary keys to strings

Clo*_*ave 2 python json dictionary

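I have a nested dictionary with some numeric keys. I need to store this dictionary as JSON, and because those keys are numeric, I can't store them as JSON. I wrote the code below, but it fails with an error saying the dictionary changed size (RuntimeError: dictionary changed size during iteration).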

def convert_to_str(dictionary):
    for key in dictionary:
        print (key)
        found = False
        non_str_keys = []
        if not isinstance(key, str):
            print(key, 'is not a string')
            dictionary[str(key)] = dictionary[key]
            non_str_keys.append(key)
        if isinstance(dictionary[str(key)], dict):
            dictionary[str(key)] = convert_to_str(dictionary[str(key)])
            non_str_keys.append(key)
        if non_str_keys:
            for each_non_str_key in non_str_keys:
                del dictionary[each_non_str_key]
    return dictionary

How can I avoid this? My dictionary is:

a = {
  "age": {
    1: 25.0,
    2: 50.25,
    3: 50.0,
    4: 75.0,
    5: 14.580906789680968,
    6: [
      25.0,
      30.0,
      34.800000000000004,
      40.0,
      46.60000000000001,
      50.0,
      56.0,
      61.0,
      65.0,
      69.0,
      75.0
    ],
    "quartiles": [
      38.0,
      64.0
    ],
    "decile_event_rate": [
      0.8125,
      0.7142857142857143,
      0.65625,
      0.42857142857142855,
      0.45161290322580644,
      0.4857142857142857,
      0.5925925925925926,
      0.5,
      0.5142857142857142,
      0.375
    ]
  },
  "income": {
    "min": 10198.0,
    "mean": 55621.78666666667,
    "median": 52880.0,
    "max": 99783.0,
    "std": 24846.911384024643,
    "deciles": [
      10198.0,
      25269.4,
      31325.800000000003,
      37857.0,
      43721.8,
      52880.0,
      63996.0,
      72526.9,
      82388.2,
      89765.90000000001,
      99783.0
    ],
    "quartiles": [
      35088.5,
      78687.25
    ],
    "decile_event_rate": [
      0.6666666666666666,
      0.6,
      0.5333333333333333,
      0.5666666666666667,
      0.5,
      0.6451612903225806,
      0.4827586206896552,
      0.5,
      0.5666666666666667,
      0.5
    ]
  },
  "edu_yrs": {
    "min": 0.0,
    "mean": 12.73,
    "median": 13.0,
    "max": 25.0,
    "std": 7.86234623342895,
    "deciles": [
      0.0,
      2.0,
      4.0,
      7.0,
      9.600000000000009,
      13.0,
      16.0,
      18.0,
      21.200000000000017,
      23.0,
      25.0
    ],
    "quartiles": [
      6.0,
      20.0
    ],
    "decile_event_rate": [
      0.5384615384615384,
      0.6521739130434783,
      0.5151515151515151,
      0.48,
      0.6111111111111112,
      0.5,
      0.5,
      0.6071428571428571,
      0.5151515151515151,
      0.6666666666666666
    ]
  },
  "yrs_since_exercise": {
    "min": 0.0,
    "mean": 18.566666666666666,
    "median": 16.0,
    "max": 60.0,
    "std": 14.417527732194037,
    "deciles": [
      0.0,
      3.0,
      5.0,
      8.0,
      12.0,
      16.0,
      20.0,
      25.0,
      31.0,
      41.0,
      60.0
    ],
    "quartiles": [
      6.0,
      27.0
    ],
    "decile_event_rate": [
      1.0,
      1.0,
      1.0,
      0.9629629629629629,
      0.75,
      0.4857142857142857,
      0.15384615384615385,
      0.06666666666666667,
      0.0,
      0.0
    ]
  },
  "security_label": {
    "event_rate": {
      "A": {
        "1.0": 0.6,
        "0.0": 0.4
      },
      "B": {
        "1.0": 0.57,
        "0.0": 0.43
      },
      "C": {
        "0.0": 0.5,
        "1.0": 0.5
      }
    },
    "freq": {
      "A": 100,
      "B": 100,
      "C": 100
    },
    "var_type": "categorical"
  }
}

Edit

    json.dump(self.entity_data, open(path, 'w'), indent=2, cls=CustomEncoder)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 179, in dump
    for chunk in iterable:
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 430, in _iterencode
    yield from _iterencode_dict(o, _current_indent_level)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
    yield from chunks
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
    yield from chunks
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 404, in _iterencode_dict
    yield from chunks
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 376, in _iterencode_dict
    raise TypeError("key " + repr(key) + " is not a string")
TypeError: key 0 is not a string


Edit 2

I ran into serialization errors with numpy objects before, so I started using this encoder to convert them to Python objects.

import json

import numpy as np


class CustomEncoder(json.JSONEncoder):
    """Convert numpy scalars and arrays to plain Python objects."""

    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return super().default(obj)

I have been using json.dump with cls=CustomEncoder. This is the command I used:

 json.dump(self.entity_data, open(path, 'w'), indent=2, cls=CustomEncoder)

Mar*_*ers 5

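You need to convert all keys recursively. Generate a new dictionary with a dict comprehension; that is much easier than altering the keys in place. You can't add string keys and delete non-string keys in a dictionary you are iterating over, because that mutates the hash table, which could easily change the order in which the keys are listed, so it is not allowed.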
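The difference between the two approaches can be sketched as follows. This is a minimal illustration, not the asker's code: the helper names `mutate_while_iterating` and `stringify_top_level_keys` and the `sample` data are made up for the example, and only the top level of the dictionary is handled.

```python
def mutate_while_iterating(d):
    """Reproduce the failure: insert new keys into d while looping over it."""
    for key in d:
        if not isinstance(key, str):
            d[str(key)] = d[key]  # grows the dict during iteration
    return d


def stringify_top_level_keys(d):
    """Build a new dict with string keys; d itself is left untouched."""
    return {str(k): v for k, v in d.items()}


sample = {1: 25.0, 2: 50.25, "quartiles": [38.0, 64.0]}

try:
    mutate_while_iterating(dict(sample))  # work on a copy of the sample
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

converted = stringify_top_level_keys(sample)
print(error_message)
print(converted)
```

Building a new dictionary sidesteps the problem entirely, because the dictionary being iterated over is never modified.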

Don't forget to handle lists as well; they can contain further dictionaries.

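Whenever I need to transform a nested structure like this, I use the @functools.singledispatch decorator to split the handling of the different container types into separate functions: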

from functools import singledispatch

@singledispatch
def keys_to_strings(ob):
    return ob

@keys_to_strings.register
def _handle_dict(ob: dict):
    return {str(k): keys_to_strings(v) for k, v in ob.items()}

@keys_to_strings.register
def _handle_list(ob: list):
    return [keys_to_strings(v) for v in ob]
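One caveat: registering an implementation through a type annotation, as above, is only supported from Python 3.7 onwards, while the traceback in the question comes from Python 3.6. On older versions the type has to be passed to register() explicitly. A sketch of that variant follows; the names `keys_to_strings_compat`, `_compat_dict` and `_compat_list` are chosen here for illustration so they don't clash with the functions above.

```python
from functools import singledispatch


@singledispatch
def keys_to_strings_compat(ob):
    # Fallback for anything that is not a dict or a list: return it unchanged.
    return ob


@keys_to_strings_compat.register(dict)
def _compat_dict(ob):
    # Stringify every key and recurse into every value.
    return {str(k): keys_to_strings_compat(v) for k, v in ob.items()}


@keys_to_strings_compat.register(list)
def _compat_list(ob):
    # Lists have no keys, but may contain dicts further down.
    return [keys_to_strings_compat(v) for v in ob]


nested = {1: {2: [{3: "x"}]}, "a": 1}
print(keys_to_strings_compat(nested))
```

The behaviour is the same as the annotated version; only the registration syntax differs.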

Then JSON-encode the result of keys_to_strings():

json.dumps(keys_to_strings(a))

Not that any of this is needed here: json.dumps() natively accepts integer keys and converts them to strings. Your sample input works without conversion:

json.dumps(a)

From the json.dumps() documentation:

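Note: Keys in key/value pairs of JSON are always of the type str. When a dictionary is converted into JSON, all the keys of the dictionary are coerced to strings. As a result of this, if a dictionary is converted into JSON and then back into a dictionary, the dictionary may not equal the original one. That is, loads(dumps(x)) != x if x has non-string keys.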
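The round-trip caveat in that note can be seen with a small sketch. The `original` dictionary is made up for illustration; it deliberately avoids using both 1 and True as keys, since those compare equal and would collapse into a single dict entry.

```python
import json

# Keys of the basic types that json.dumps() coerces to strings on its own.
original = {2: "two", 2.5: "two and a half", True: "yes", None: "nothing"}

encoded = json.dumps(original)
decoded = json.loads(encoded)

print(encoded)
print(list(decoded))        # every key comes back as a str
print(decoded == original)  # the round trip does not restore the key types
```

After decoding, the keys are the strings "2", "2.5", "true" and "null", so the decoded dictionary no longer compares equal to the original.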

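This coercion only applies to key types that JSON could otherwise handle, so None, booleans, float and int objects (in addition to str). For anything else, you'll still get your exception. You probably have an object that is displayed as 0 but is not a Python int 0: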

>>> json.dumps({0: 'works'})
'{"0": "works"}'
>>> import numpy
>>> numpy.int32()
0
>>> json.dumps({numpy.int32(): 'fails'})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Users/mjpieters/Development/Library/buildout.python/parts/opt/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
TypeError: keys must be a string

I picked a numpy integer type because that is a common kind of integer value that gets confused with, but is not, a Python int.

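The custom encoder you added to your post is not used for keys; it only applies to the values in dictionaries. So if you have non-standard key objects, you do indeed still need to use the recursive solution above.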
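To illustrate that last point without requiring numpy, here is a sketch that uses fractions.Fraction from the standard library as a stand-in for a non-standard key type. `FractionEncoder` and `stringify_keys` are hypothetical names for this example; `stringify_keys` is a plain isinstance-based equivalent of the recursive conversion shown earlier.

```python
import json
from fractions import Fraction


class FractionEncoder(json.JSONEncoder):
    """Encoder whose default() knows how to serialize Fraction values."""

    def default(self, obj):
        if isinstance(obj, Fraction):
            return float(obj)
        return super().default(obj)


def stringify_keys(ob):
    """Recursively convert dict keys to str, descending into lists too."""
    if isinstance(ob, dict):
        return {str(k): stringify_keys(v) for k, v in ob.items()}
    if isinstance(ob, list):
        return [stringify_keys(v) for v in ob]
    return ob


# As a value, the Fraction goes through default() and is encoded.
as_value = json.dumps({"half": Fraction(1, 2)}, cls=FractionEncoder)

# As a key, default() is never consulted and a TypeError is raised.
try:
    json.dumps({Fraction(1, 2): "half"}, cls=FractionEncoder)
    key_error = None
except TypeError as exc:
    key_error = exc

# Converting the keys first makes the same data serializable.
as_key = json.dumps(stringify_keys({Fraction(1, 2): "half"}), cls=FractionEncoder)

print(as_value)
print(type(key_error).__name__)
print(as_key)
```

The same pattern should carry over to numpy key types: convert the keys first, then let the custom encoder take care of the values.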