How can I type-hint / type-check a dictionary (at runtime) for an arbitrary number of arbitrary key/value pairs?

s-m*_*m-e 8 python dictionary type-hinting python-typing

I define a class as follows:

from numbers import Number
from typing import Dict

from typeguard import typechecked

Data = Dict[str, Number]

@typechecked
class Foo:
    def __init__(self, data: Data):
        self._data = dict(data)
    @property
    def data(self) -> Data:
        return self._data

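I am using typeguard. My intention is to restrict the types that can go into the data dictionary. Evidently, typeguard checks the entire dictionary when it is passed into a function or returned from one. If the dictionary is "exposed" directly, checking types becomes the dictionary's "responsibility", which evidently does not work: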

bar = Foo({'x': 2, 'y': 3}) # ok

bar = Foo({'x': 2, 'y': 3, 'z': 'not allowed'}) # error as expected

bar.data['z'] = 'should also be not allowed but still is ...' # no error, but should cause one

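PEP 589 introduces typed dictionaries, but only for a fixed set of keys (similar to struct-like constructs in other languages). In contrast, I need this for a flexible number of arbitrary keys.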

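My best bad idea is to go "old school": subclass dict, reimplement every bit of the API through which data can go into (or come out of) the dictionary, and add type checks to each of them: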

from typing import Union  # Data, Number and typechecked come from the snippet above

@typechecked
class TypedDict(dict): # just a sketch
    def __init__(
        self,
        other: Union[Data, None] = None,
        **kwargs: Number,
    ):
        pass # TODO
    def __setitem__(self, key: str, value: Number):
        pass # TODO
    # TODO

Is there a viable alternative that does not require the "old school" approach?

Ale*_*ood 6

There seem to be several parts to your question.


(1) Creating a type-checked dictionary at runtime


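As @juanpa.arrivillaga says in the comments, this has everything to do with type-checking, but doesn't seem to have anything to do with type-hinting. However, it's fairly trivial to design your own custom type-checked data structure. You can do it like so, using collections.UserDict: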

from collections import UserDict
from numbers import Number

class StrNumberDict(UserDict):
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError(
                f'Invalid type for dictionary key: '
                f'expected "str", got "{type(key).__name__}"'
            )
        if not isinstance(value, Number):
            raise TypeError(
                f'Invalid type for dictionary value: '
                f'expected "Number", got "{type(value).__name__}"'
            )
        super().__setitem__(key, value)

In usage:

>>> d = StrNumberDict()
>>> d['foo'] = 5
>>> d[5] = 6
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 5, in __setitem__
TypeError: Invalid type for dictionary key: expected "str", got "int"
>>> d['bar'] = 'foo'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 10, in __setitem__
TypeError: Invalid type for dictionary value: expected "Number", got "str"
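
To tie this back to the class in the question, here is a minimal sketch of how Foo might store its data in a StrNumberDict rather than a plain dict, so that writes through the exposed property are checked as well. It assumes that returning a UserDict instead of a real dict is acceptable to callers, and it leaves out typeguard entirely; the shortened error messages are only for brevity.

```python
from collections import UserDict
from numbers import Number


class StrNumberDict(UserDict):
    """Same idea as above: validate every key and value in __setitem__."""

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError(f'expected "str" key, got "{type(key).__name__}"')
        if not isinstance(value, Number):
            raise TypeError(f'expected "Number" value, got "{type(value).__name__}"')
        super().__setitem__(key, value)


class Foo:
    def __init__(self, data):
        # Copy into the checked container; invalid entries raise here.
        self._data = StrNumberDict(data)

    @property
    def data(self):
        # The caller gets the checked container, not a plain dict.
        return self._data


bar = Foo({'x': 2, 'y': 3})
bar.data['w'] = 1.5  # accepted

try:
    bar.data['z'] = 'not allowed'
except TypeError as exc:
    print(f'rejected: {exc}')
```

Note that a StrNumberDict is not an instance of dict, so an `isinstance(bar.data, dict)` check elsewhere in your code would fail.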

If you wanted to generalise this kind of thing, you could do it like this:

from collections import UserDict

class TypeCheckedDict(UserDict):
    def __init__(self, key_type, value_type, initdict=None):
        self._key_type = key_type
        self._value_type = value_type
        super().__init__(initdict)

    def __setitem__(self, key, value):
        if not isinstance(key, self._key_type):
            raise TypeError(
                f'Invalid type for dictionary key: '
                f'expected "{self._key_type.__name__}", '
                f'got "{type(key).__name__}"'
            )
        if not isinstance(value, self._value_type):
            raise TypeError(
                f'Invalid type for dictionary value: '
                f'expected "{self._value_type.__name__}", '
                f'got "{type(value).__name__}"'
            )
        super().__setitem__(key, value)

In usage:

>>> from numbers import Number
>>> d = TypeCheckedDict(key_type=str, value_type=Number, initdict={'baz': 3.14})
>>> d['baz']
3.14
>>> d[5] = 5
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 9, in __setitem__
TypeError: Invalid type for dictionary key: expected "str", got "int"
>>> d['foo'] = 'bar'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 15, in __setitem__
TypeError: Invalid type for dictionary value: expected "Number", got "str"
>>> d['foo'] = 5
>>> d['foo']
5

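Note that you don't need to type-check the dictionary you pass to super().__init__(). UserDict.__init__ calls self.update(), which in turn calls self.__setitem__, which you've already overridden. So if you pass an invalid dictionary to TypeCheckedDict.__init__, an exception is raised in just the same way as if you had tried to add an invalid key or value after the dictionary was constructed: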

>>> from numbers import Number
>>> d = TypeCheckedDict(str, Number, {'foo': 'bar'})
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 5, in __init__
  line 985, in __init__
    self.update(dict)
  line 842, in update
    self[key] = other[key]
  File "<string>", line 16, in __setitem__
TypeError: Invalid type for dictionary value: expected "Number", got "str"

UserDict is specifically designed to be easy to subclass in this way, which is why it is a better base class than dict in this situation.
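
To illustrate the difference, here is a small sketch comparing the two base classes. The names CheckedDict, CheckedUserDict and raises_type_error are made up for this example. It relies on CPython behaviour: the built-in dict constructor and dict.update() do not call an overridden __setitem__, whereas UserDict routes those paths through it.

```python
from collections import UserDict


class CheckedDict(dict):
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError('keys must be str')
        super().__setitem__(key, value)


class CheckedUserDict(UserDict):
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError('keys must be str')
        super().__setitem__(key, value)


def raises_type_error(func):
    """Return True if calling func() raises TypeError."""
    try:
        func()
    except TypeError:
        return True
    return False


# dict subclass: the constructor and update() skip the override.
plain = CheckedDict({1: 'a'})
plain.update({2: 'b'})
dict_bypassed = 1 in plain and 2 in plain

# UserDict subclass: the same paths go through __setitem__.
user_init_checked = raises_type_error(lambda: CheckedUserDict({1: 'a'}))
checked = CheckedUserDict()
user_update_checked = raises_type_error(lambda: checked.update({2: 'b'}))
user_setdefault_checked = raises_type_error(lambda: checked.setdefault(3, 'c'))

# Caveat: the underlying storage is a plain dict and is not protected.
checked.data[4] = 'd'
backdoor_open = 4 in checked

print(dict_bypassed, user_init_checked, user_update_checked,
      user_setdefault_checked, backdoor_open)
```

The last two lines are a reminder that the check is a convention rather than a guarantee: anyone who writes to the `.data` attribute directly bypasses it.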

If you wanted to add type hints to TypeCheckedDict, you'd do it like this:

from collections import UserDict
from collections.abc import Mapping, Hashable
from typing import TypeVar, Optional

K = TypeVar('K', bound=Hashable)
V = TypeVar('V')

class TypeCheckedDict(UserDict[K, V]):
    def __init__(
        self, 
        key_type: type[K], 
        value_type: type[V], 
        initdict: Optional[Mapping[K, V]] = None
    ) -> None:
        self._key_type = key_type
        self._value_type = value_type
        super().__init__(initdict)

    def __setitem__(self, key: K, value: V) -> None:
        if not isinstance(key, self._key_type):
            raise TypeError(
                f'Invalid type for dictionary key: '
                f'expected "{self._key_type.__name__}", '
                f'got "{type(key).__name__}"'
            )
        if not isinstance(value, self._value_type):
            raise TypeError(
                f'Invalid type for dictionary value: '
                f'expected "{self._value_type.__name__}", '
                f'got "{type(value).__name__}"'
            )
        super().__setitem__(key, value)

(The above passes MyPy.)

Note, however, that adding type hints has no bearing at all on how this data structure works at runtime.
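
As a quick illustration of that point, the following sketch annotates a plain dict and then violates the annotation. A static checker such as MyPy should flag the assignment, but the interpreter itself runs it without complaint.

```python
from typing import Dict

d: Dict[str, int] = {}

# Neither the key nor the value matches the annotation,
# yet nothing is raised at runtime.
d[5] = 'foo'

print(d)
```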


(2) Type-hinting dictionaries "for a flexible number of arbitrary keys"


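I'm not quite sure what you mean by this, but if you want MyPy to raise an error when you add a string value to a dictionary that should only hold numeric values, you could do it like this: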

from typing import SupportsFloat

d: dict[str, SupportsFloat] = {}
d['a'] = 5  # passes MyPy 
d['b'] = 4.67 # passes MyPy
d[5] = 6 # fails MyPy
d['baz'] = 'foo' # fails Mypy 

If you want both MyPy static checks and runtime checks, your best bet (in my opinion) is to use the type-hinted version of TypeCheckedDict above:

d = TypeCheckedDict(str, SupportsFloat) # type: ignore[misc]
d['a'] = 5  # passes MyPy 
d['b'] = 4.67  # passes MyPy 
d[5] = 6  # fails Mypy 
d['baz'] = 'foo'  # fails Mypy

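MyPy isn't too happy about us passing an abstract type as a parameter to TypeCheckedDict.__init__, so you have to add a # type: ignore[misc] when instantiating the dictionary. (That feels like a MyPy bug to me.) Other than that, however, it works fine.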

(See my previous answer for caveats about using SupportsFloat to hint numeric types. If you're on Python <= 3.8, use typing.Dict instead of dict for type hints.)
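
The runtime half of TypeCheckedDict(str, SupportsFloat) depends on isinstance() accepting SupportsFloat. It does, because SupportsFloat is a runtime-checkable protocol, but the check is structural: it only asks whether the object has a __float__ method. The sketch below, with sample values chosen purely for illustration, shows which objects pass.

```python
from decimal import Decimal
from fractions import Fraction
from typing import SupportsFloat

samples = (5, 4.67, True, Fraction(1, 3), Decimal('1.5'), 'foo', None, [1.0])

# Map each sample's repr to the result of the structural isinstance() check.
results = {repr(value): isinstance(value, SupportsFloat) for value in samples}

for shown, accepted in results.items():
    print(f'{shown:>16} -> {accepted}')
```

Note that bool passes (it is a subclass of int), which may or may not be what you want in a "numbers only" dictionary.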


(3) Using typeguard


Since you're using typeguard, you could simplify the logic in my StrNumberDict class a little, like so:

from collections import UserDict
from typeguard import typechecked
from typing import SupportsFloat

class StrNumberDict(UserDict[str, SupportsFloat]):
    @typechecked
    def __setitem__(self, key: str, value: SupportsFloat) -> None:
        super().__setitem__(key, value)

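However, if you want a more generic TypeCheckedDict that can be instantiated with arbitrary types to check against, I don't think there's a way of doing that with typeguard. The following does not work: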

### THIS DOES NOT WORK ###

from typing import TypeVar, SupportsFloat
from collections.abc import Hashable
from collections import UserDict
from typeguard import typechecked

K = TypeVar('K', bound=Hashable)
V = TypeVar('V')

class TypeCheckedDict(UserDict[K, V]):
    @typechecked
    def __setitem__(self, key: K, value: V) -> None:
        super().__setitem__(key, value)

d = TypeCheckedDict[str, SupportsFloat]()
d[5] = 'foo'  # typeguard raises no error here.

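It may also be worth noting that, at the time of writing, typeguard is not being maintained, so there is a certain amount of risk involved in using that particular library.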