pandas Immutable DataFrame

san*_*tle 11 python immutability pandas

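I am interested in an immutable DataFrame to use as a reference table in a program, with a read_only property enforced after it has been initially constructed (in my case, in a class's def __init__() method).

I see that Index objects are frozen.

Is there a way to make an entire DataFrame immutable?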

fle*_*one 16

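The StaticFrame package (of which I am an author) implements a pandas-like interface and many common pandas operations, while enforcing immutability in the underlying NumPy arrays and in immutable Series and Frame containers.

You can convert a pandas DataFrame to a StaticFrame Frame with static_frame.Frame.from_pandas(df). You can then use it as a truly read-only table.

See the StaticFrame documentation for this method: https://static-frame.readthedocs.io/en/latest/api_detail/frame.html#frame-constructor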

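  • Nice work! At what level is immutability enforced? I am wondering where we can expect overhead, especially when using StaticFrame to feed NumPy arrays to sklearn without copying. (2 upvotes)
  • Thank you! Immutability is enforced at the NumPy array level: because NumPy arrays implement the buffer protocol (PEP 3118), we can set the "array.flags.writeable" Boolean attribute to enforce immutability. I expect this adds no overhead compared with mutable arrays. StaticFrame takes advantage of immutable arrays to make many operations on "Series" and "Frame" fast and lightweight, because the underlying arrays do not need to be copied. This terminal animation demonstrates the benefit: https://raw.githubusercontent.com/InvestmentSystems/static-frame/master/doc/images/animate-low-memory-ops-verbose.svg (2 upvotes)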
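
A minimal sketch of the mechanism described in the comment above, using only NumPy. The array contents and variable names are illustrative assumptions, not StaticFrame code. Clearing `flags.writeable` makes in-place assignment raise, and a view taken from the array shares its buffer and is read-only as well, which is why no defensive copy is needed.

```python
import numpy as np

# Illustrative data; not taken from StaticFrame.
arr = np.array([[0, 1, 2], [3, 4, 5]])
arr.flags.writeable = False  # mark the buffer read-only

try:
    arr[0, 0] = 99
    write_error = None
except ValueError as exc:
    # NumPy refuses the write and reports a read-only destination.
    write_error = str(exc)

# A view shares the buffer and inherits the read-only flag,
# so slicing can be handed out without copying.
row = arr[0]
row_is_writeable = row.flags.writeable
shares_memory = np.shares_memory(arr, row)
```

Note that this protects the array buffer only; it says nothing about the container object that holds the array.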

Joo*_*oop 5

Try code like this:

import pandas as pd

class Bla(object):
    def __init__(self):
        self._df = pd.DataFrame(index=[1,2,3])

    @property
    def df(self):
        return self._df.copy()

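This lets you get the df back with b.df, but you will not be able to assign to it. In short, the df in your class behaves like an "immutable DataFrame" purely in the sense that it blocks changes to the original. The returned object is still a mutable DataFrame, however, so in other respects it will not behave like an immutable one; for example, you will not be able to use it as a key in a dictionary.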
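
The behaviour described above can be checked with a short sketch. The class name `ReferenceTable` and the column `x` are hypothetical names chosen for illustration; the pattern itself is the copy-returning property from this answer.

```python
import pandas as pd


class ReferenceTable:
    """Hypothetical holder class mirroring the pattern above."""

    def __init__(self):
        self._df = pd.DataFrame({"x": [1, 2, 3]})

    @property
    def df(self):
        # Every access hands out a fresh copy of the private frame.
        return self._df.copy()


table = ReferenceTable()

borrowed = table.df
borrowed.loc[0, "x"] = 99  # mutates only the copy
original_value = int(table.df.loc[0, "x"])

try:
    table.df = pd.DataFrame()  # the property has no setter
    rebind_blocked = False
except AttributeError:
    rebind_blocked = True

try:
    hash(table.df)  # the copy is still an ordinary, unhashable DataFrame
    hashable = True
except TypeError:
    hashable = False
```

The cost of this design is one full copy per access, so it suits small reference tables better than large ones.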


dei*_*aur 5

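If you really want to make a DataFrame behave as immutable, rather than using the copy solution by @Joop (which I would recommend), you could build on the following structure.

Note that it is only a starting point.

It is basically a proxy data object that hides everything that would change state and allows itself to be hashed; all instances wrapping the same original data will have the same hash. There are probably modules that do the following in cooler ways, but I figured it could be educational as an example.

Some warnings:

  • Depending on how the string representation of the proxied object is constructed, two different proxied objects could get the same hash. The implementation is, however, compatible with DataFrames, among other objects.

  • Changes to the original object will affect the proxy object.

  • Equality will lead to some nasty infinite recursion if the other object tosses the equality question back (this is why list has a special case).

  • The DataFrame proxy maker helper is just a start. The problem is that any method that changes the state of the original object must not be allowed, so it either needs to be manually overridden by the helper or masked entirely through the extraFilter parameter when instantiating _ReadOnly. See DataFrameProxy.sort.

  • The proxies will not show up as derived from the proxied type.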
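
Three of the warnings above can be seen in a stripped-down sketch. `ReadOnlyView` and `repr_hash` are hypothetical helpers written for this illustration and are much smaller than the `_ReadOnly` class in this answer; the only thing they share with it is the idea of wrapping an object and hashing its string representation.

```python
import hashlib


class ReadOnlyView:
    """Hypothetical minimal proxy, not the _ReadOnly class from this answer."""

    def __init__(self, obj):
        # Bypass our own __setattr__, which always raises.
        object.__setattr__(self, "_obj", obj)

    def __getitem__(self, key):
        return self._obj[key]

    def __setitem__(self, key, value):
        raise TypeError("read-only proxy")

    def __setattr__(self, name, value):
        raise TypeError("read-only proxy")


def repr_hash(obj):
    # Hash derived from the string representation, as in the answer.
    return int(hashlib.md5(str(obj).encode("utf-8")).hexdigest(), 16)


original = {"a": 1}
proxy = ReadOnlyView(original)

try:
    proxy["a"] = 2
    blocked = False
except TypeError:
    blocked = True

original["a"] = 3  # mutating the wrapped object directly...
leaked = proxy["a"]  # ...is visible through the proxy

is_dict = isinstance(proxy, dict)  # the proxy is not a dict subclass

# Different objects with the same string form collide.
collision = repr_hash(1) == repr_hash("1")
```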

Generic read-only proxy

This can be used on any object.

import hashlib
import warnings

class _ReadOnly(object):

    def __init__(self, obj, extraFilter=tuple()):

        self.__dict__['_obj'] = obj
        self.__dict__['_d'] = None
        self.__dict__['_extraFilter'] = extraFilter
        # The md5 module is Python 2 only; hashlib needs bytes on Python 3
        self.__dict__['_hash'] = int(
            hashlib.md5(str(obj).encode('utf-8')).hexdigest(), 16)

    @staticmethod                                                                                       
    def _cloak(obj):                                                                                    
        try:                                                                                            
            hash(obj)                                                                                   
            return obj                                                                                  
        except TypeError:                                                                               
            return _ReadOnly(obj)                                                                       

    def __getitem__(self, value):                                                                       

        return _ReadOnly._cloak(self._obj[value])                                                       

    def __setitem__(self, key, value):                                                                  

        raise TypeError(                                                                                
            "{0} has a _ReadOnly proxy around it".format(type(self._obj)))                              

    def __delitem__(self, key):                                                                         

        raise TypeError(                                                                                
            "{0} has a _ReadOnly proxy around it".format(type(self._obj)))                              

    def __getattr__(self, value):                                                                       

        if value in self.__dir__():                                                                     
            return _ReadOnly._cloak(getattr(self._obj, value))                                          
        elif value in dir(self._obj):                                                                   
            raise AttributeError("{0} attribute {1} is cloaked".format(                                 
                type(self._obj), value))                                                                
        else:                                                                                           
            raise AttributeError("{0} has no {1}".format(                                               
                type(self._obj), value))                                                                

    def __setattr__(self, key, value):                                                                  

        raise TypeError(                                                                                
            "{0} has a _ReadOnly proxy around it".format(type(self._obj)))                              

    def __delattr__(self, key):                                                                         

        raise TypeError(                                                                                
            "{0} has a _ReadOnly proxy around it".format(type(self._obj)))                              

    def __dir__(self):                                                                                  

        if self._d is None:                                                                             
            self.__dict__['_d'] = [                                                                     
                i for i in dir(self._obj) if not i.startswith('set')                                    
                and i not in self._extraFilter]                                                         
        return self._d                                                                                  

    def __repr__(self):                                                                                 

        return self._obj.__repr__()                                                                     

    def __call__(self, *args, **kwargs):                                                                

        if hasattr(self._obj, "__call__"):                                                              
            return self._obj(*args, **kwargs)                                                           
        else:                                                                                           
            raise TypeError("{0} not callable".format(type(self._obj)))                                 

    def __hash__(self):                                                                                 

        return self._hash                                                                               

    def __eq__(self, other):                                                                            

        try:                                                                                            
            return hash(self) == hash(other)                                                            
        except TypeError:                                                                               
            if isinstance(other, list):                                                                 
                try:                                                                                    
                    return all(a == b for a, b in zip(self, other))
                except Exception:
                    return False                                                                        
            return other == self    

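DataFrame proxy

This really should be extended with more overridden methods, such as sort, and with filtering of all other state-changing methods that are not of interest.

You can either instantiate it with a DataFrame instance as the only argument, or give it the same arguments you would use to create a DataFrame.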

import pandas as pd

class DataFrameProxy(_ReadOnly):                                                                        

    EXTRA_FILTER = ('drop', 'drop_duplicates', 'dropna')                                                

    def __init__(self, *args, **kwargs):                                                                

        if (len(args) == 1 and                                                                          
                not len(kwargs) and                                                                     
                isinstance(args[0], pd.DataFrame)):

            super(DataFrameProxy, self).__init__(args[0],                                               
                DataFrameProxy.EXTRA_FILTER)                                                            

        else:                                                                                           

            super(DataFrameProxy, self).__init__(pd.DataFrame(*args, **kwargs),                         
                DataFrameProxy.EXTRA_FILTER)                                                            



    def sort(self, inplace=False, *args, **kwargs):                                                     

        if inplace:                                                                                     
            warnings.warn("Inplace sorting overridden")                                                 

        # DataFrame.sort was removed in pandas 0.20; sort_values replaces it
        return self._obj.sort_values(*args, **kwargs)

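Finally:

However, fun though it was to build this contraption, why not simply have a DataFrame that you do not alter? If it is only exposed to you, it is better to just make sure you do not alter it...

  • Nice work. I wish I could have an immutable DataFrame, and this is a good start. I added self._obj.values.flags.writeable=False to the mix, and may override __setitem__ on any returned locators (e.g. df.iloc[0, 0]=999). It may not be possible to control mutability completely, but you have a good foundation. (2 upvotes)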

Ass*_*saf 5

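By studying the pandas implementation and making use of pandas' capabilities, one can patch a DataFrame object to achieve this behavior. I implemented a function named make_dataframe_immutable(dataframe) to solve this problem. It was written for pandas==0.25.3.

EDIT: added a solution for pandas==1.0.5 and pandas==1.1.4.

Newer versions of pandas will probably require adjustments; hopefully that will not be too hard to do using the tests below.

This solution is new and has not been thoroughly tested, so any feedback will be appreciated.

It would be great if someone could post an inverse make_dataframe_mutable() method here.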

import functools

import numpy as np
import pandas as pd
from pandas.core.indexing import _NDFrameIndexer


def make_dataframe_immutable(df: pd.DataFrame):
    """
    Makes the given DataFrame immutable.
    I.e. after calling this method - one cannot modify the dataframe using pandas interface.

    Upon a trial to modify an immutable dataframe, an exception of type ImmutablePandas is raised.
    """
    if getattr(df, "_is_immutable", False):
        return
    df._is_immutable = True
    df._set_value = functools.wraps(df._set_value)(_raise_immutable_exception)
    df._setitem_slice = functools.wraps(df._setitem_slice)(_raise_immutable_exception)
    df._setitem_frame = functools.wraps(df._setitem_frame)(_raise_immutable_exception)
    df._setitem_array = functools.wraps(df._setitem_array)(_raise_immutable_exception)
    df._set_item = functools.wraps(df._set_item)(_raise_immutable_exception)
    df._data.delete = functools.wraps(df._data.delete)(_raise_immutable_exception)
    df.update = functools.wraps(df.update)(_raise_immutable_exception)
    df.insert = functools.wraps(df.insert)(_raise_immutable_exception)

    df._get_item_cache = _make_result_immutable(df._get_item_cache)

    # prevent modification through numpy arrays
    df._data.as_array = _make_numpy_result_readonly(df._data.as_array)

    _prevent_inplace_argument_in_function_calls(
        df,
        # This list was obtained by manual inspection +
        #  [attr for attr in dir(d) if hasattr(getattr(pd.DataFrame, attr, None), '__code__') and
        #  'inplace' in getattr(pd.DataFrame, attr).__code__.co_varnames]
        (
            'bfill',
            'clip',
            'clip_lower',
            'clip_upper',
            'drop',
            'drop_duplicates',
            'dropna',
            'eval',
            'ffill',
            'fillna',
            'interpolate',
            'mask',
            'query',
            'replace',
            'reset_index',
            'set_axis',
            'set_index',
            'sort_index',
            'sort_values',
            'where',
            "astype",
            "assign",
            "reindex",
            "rename",
        ),
    )


def make_series_immutable(series: pd.Series):
    """
    Makes the given Series immutable.
    I.e. after calling this method - one cannot modify the series using pandas interface.


    Upon a trial to modify an immutable dataframe, an exception of type ImmutablePandas is raised.
    """
    if getattr(series, "_is_immutable", False):
        return
    series._is_immutable = True
    series._set_with_engine = functools.wraps(series._set_with_engine)(_raise_immutable_exception)
    series._set_with = functools.wraps(series._set_with)(_raise_immutable_exception)
    series.set_value = functools.wraps(series.set_value)(_raise_immutable_exception)

    # prevent modification through numpy arrays
    series._data.external_values = _make_numpy_result_readonly(series._data.external_values)
    series._data.internal_values = _make_numpy_result_readonly(series._data.internal_values)
    series._data.get_values = _make_numpy_result_readonly(series._data.get_values)

    _prevent_inplace_argument_in_function_calls(
        series,
        # This list was obtained by manual inspection +
        #  [attr for attr in dir(d) if hasattr(getattr(pd.Series, attr, None), '__code__') and
        #  'inplace' in getattr(pd.Series, attr).__code__.co_varnames]
        (
            "astype",
            'bfill',
            'clip',
            'clip_lower',
            'clip_upper',
            'drop',
            'drop_duplicates',
            'dropna',
            'ffill',
            'fillna',
            'interpolate',
            'mask',
            'replace',
            'reset_index',
            'set_axis',
            'sort_index',
            'sort_values',
            "valid",
            'where',
            "_set_name",
        ),
    )


class ImmutablePandas(Exception):
    pass


def _raise_immutable_exception(*args, **kwargs):
    raise ImmutablePandas("Cannot modify immutable dataframe. Please use df.copy()")


def _get_df_or_series_from_args(args):
    if len(args) >= 2 and (isinstance(args[1], pd.DataFrame) or isinstance(args[1], pd.Series)):
        return args[1]


def _safe__init__(self, *args, **kwargs):
    super(_NDFrameIndexer, self).__init__(*args, **kwargs)
    df_or_series = _get_df_or_series_from_args(args)
    if df_or_series is not None:
        if getattr(df_or_series, "_is_immutable", False):
            self._get_setitem_indexer = functools.wraps(self._get_setitem_indexer)(_raise_immutable_exception)


# This line is the greatest foul in this module - as it performs a global patch.
# Notice that a reload of this module incurs overriding this variable again and again. It is supported.
_NDFrameIndexer.__init__ = functools.wraps(_NDFrameIndexer.__init__)(_safe__init__)


def _make_numpy_result_readonly(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        res = func(*args, **kwargs)
        if isinstance(res, np.ndarray):
            res.flags.writeable = False
        return res

    return wrapper


def _make_result_immutable(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        res = func(*args, **kwargs)
        if isinstance(res, pd.Series):
            make_series_immutable(res)
        return res

    return wrapper


def _prevent_inplace_operation(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # TODO: here we assume that in-place is not given as a positional.
        #  remove this assumption, either by hard-coding the position for each method or by parsing the
        #  function signature.
        if kwargs.get("inplace", False):
            _raise_immutable_exception()
        return func(*args, **kwargs)

    return wrapper


def _prevent_inplace_argument_in_function_calls(obj, attributes):
    for attr in attributes:
        member = getattr(obj, attr)
        setattr(obj, attr, _prevent_inplace_operation(member))


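
The TODO in _prevent_inplace_operation above notes that inplace is only caught when passed as a keyword, and suggests parsing the function signature. One possible way to do that is sketched below with `inspect.signature`. This is an assumption about how the gap could be closed, not part of the answer's module; `ImmutableError`, `prevent_inplace`, and the toy `sort_values` function are hypothetical names, and no pandas code is involved.

```python
import functools
import inspect


class ImmutableError(Exception):
    """Hypothetical stand-in for the ImmutablePandas exception above."""


def prevent_inplace(func):
    """Reject calls that pass a truthy `inplace`, positionally or by keyword."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Map the actual call onto parameter names without applying defaults.
        bound = signature.bind_partial(*args, **kwargs)
        if bound.arguments.get("inplace", False):
            raise ImmutableError("in-place modification is not allowed")
        return func(*args, **kwargs)

    return wrapper


def sort_values(by, ascending=True, inplace=False):
    """Toy function standing in for a method with an `inplace` flag."""
    return ("sorted", by, ascending, inplace)


guarded = prevent_inplace(sort_values)

allowed = guarded("x")

try:
    guarded("x", True, True)  # inplace given positionally
    positional_blocked = False
except ImmutableError:
    positional_blocked = True

try:
    guarded("x", inplace=True)  # inplace given by keyword
    keyword_blocked = False
except ImmutableError:
    keyword_blocked = True
```

This only works when the wrapped callable declares `inplace` as a named parameter; a function that accepts it through `**kwargs` would need separate handling.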

pytest unit tests

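import importlib
import warnings

import pandas as pd
import pytest

import immutable_pandas
from immutable_pandas import ImmutablePandas
from immutable_pandas import make_dataframe_immutable
from immutable_pandas import make_series_immutable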



def create_immutable_dataframe() -> pd.DataFrame:
    # Cannot be used as a fixture because pytest copies objects transparently, which makes the tests flaky
    immutable_dataframe = pd.DataFrame({"x": [1, 2, 3, 4], "y": [4, 5, 6, 7]})
    make_dataframe_immutable(immutable_dataframe)
    return immutable_dataframe


def test_immutable_dataframe_cannot_change_with_direct_access():
    immutable_dataframe = create_immutable_dataframe()
    immutable_dataframe2 = immutable_dataframe.query("x == 2")
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        immutable_dataframe2["moshe"] = 123
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.x = 2
    with pytest.raises(ImmutablePandas):
        immutable_dataframe["moshe"] = 56
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.insert(0, "z", [1, 2, 3, 4])


def test_immutable_dataframe_cannot_change_with_inplace_operations():
    immutable_dataframe = create_immutable_dataframe()
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.eval("y=x+1", inplace=True)
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.assign(y=2, inplace=True)


def test_immutable_dataframe_cannot_change_with_loc():
    immutable_dataframe = create_immutable_dataframe()
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.loc[2] = 1
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.iloc[1] = 4


def test_immutable_dataframe_cannot_change_with_columns_access():
    immutable_dataframe = create_immutable_dataframe()
    with pytest.raises(ImmutablePandas):
        immutable_dataframe["x"][2] = 123
    with pytest.raises(ImmutablePandas):
        immutable_dataframe["x"].loc[2] = 123


def test_immutable_dataframe_cannot_del_column():
    immutable_dataframe = create_immutable_dataframe()
    with pytest.raises(ImmutablePandas):
        del immutable_dataframe["x"]


def test_immutable_dataframe_cannot_be_modified_through_values():
    immutable_dataframe = create_immutable_dataframe()
    with pytest.raises(ValueError, match="read-only"):
        immutable_dataframe.values[0, 0] = 1
    with pytest.raises(ValueError, match="read-only"):
        immutable_dataframe.as_matrix()[0, 0] = 1


def test_immutable_series_cannot_change_with_loc():
    series = pd.Series([1, 2, 3, 4])
    make_series_immutable(series)
    with pytest.raises(ImmutablePandas):
        series.loc[0] = 1
    with pytest.raises(ImmutablePandas):
        series.iloc[0] = 1


def test_immutable_series_cannot_change_with_inplace_operations():
    series = pd.Series([1, 2, 3, 4])
    make_series_immutable(series)
    with pytest.raises(ImmutablePandas):
        series.sort_index(inplace=True)
    with pytest.raises(ImmutablePandas):
        series.sort_values(inplace=True)
    with pytest.raises(ImmutablePandas):
        series.astype(int, inplace=True)


def test_series_cannot_be_modified_through_values():
    series = pd.Series([1, 2, 3, 4])
    make_series_immutable(series)
    with pytest.raises(ValueError, match="read-only"):
        series.get_values()[0] = 1234
    series = pd.Series([1, 2, 3, 4])
    make_series_immutable(series)
    with pytest.raises(ValueError, match="read-only"):
        series.values[0] = 1234


def test_reloading_module_immutable_pandas_does_not_break_immutability():
    # We need to test the effects of reloading the module, because we modify the global variable
    #       _NDFrameIndexer.__init__ upon every reload of the module.
    df = create_immutable_dataframe()
    df2 = df.copy()
    immutable_pandas2 = importlib.reload(immutable_pandas)
    with pytest.raises(immutable_pandas2.ImmutablePandas):
        df.astype(int, inplace=True)
    df2.astype(int, inplace=True)
    immutable_pandas2.make_dataframe_immutable(df2)
    with pytest.raises(immutable_pandas2.ImmutablePandas):
        df2.astype(int, inplace=True)



EDIT: here is an update tested on pandas==1.0.5 and pandas==1.1.4

"""
Two methods to make pandas objects immutable.
    make_dataframe_immutable()
    make_series_immutable()
"""
import functools

import numpy as np
import pandas as pd
from pandas.core.indexing import _iLocIndexer
from pandas.core.indexing import _LocIndexer
from pandas.core.indexing import IndexingMixin


def make_dataframe_immutable(df: pd.DataFrame):
    """
    Makes the given DataFrame immutable.
    I.e. after calling this method - one cannot modify the dataframe using pandas interface.

    Upon a trial to modify an immutable dataframe, an exception of type ImmutablePandas is raised.
    """
    if getattr(df, "_is_immutable", False):
        return
    df._is_immutable = True
    df._set_value = functools.wraps(df._set_value)(_raise_immutable_exception)
    df._setitem_slice = functools.wraps(df._setitem_slice)(_raise_immutable_exception)
    df._setitem_frame = functools.wraps(df._setitem_frame)(_raise_immutable_exception)
    df._setitem_array = functools.wraps(df._setitem_array)(_raise_immutable_exception)
    df._set_item = functools.wraps(df._set_item)(_raise_immutable_exception)
    if hasattr(df, "_mgr"):
        # pandas==1.1.4
        df._mgr.idelete = functools.wraps(df._mgr.idelete)(_raise_immutable_exception)
    elif hasattr(df, "_data"):
        # pandas==1.0.5
        df._data.delete = functools.wraps(df._data.delete)(_raise_immutable_exception)
    df.update = functools.wraps(df.update)(_raise_immutable_exception)
    df.insert = functools.wraps(df.insert)(_raise_immutable_exception)

    df._get_item_cache = _make_result_immutable(df._get_item_cache)

    # prevent modification through numpy arrays
    df._data.as_array = _make_numpy_result_readonly(df._data.as_array)

    _prevent_inplace_argument_in_function_calls(
        df,
        # This list was obtained by manual inspection +
        #  [attr for attr in dir(d) if hasattr(getattr(pd.DataFrame, attr, None), '__code__') and
        #  'inplace' in getattr(pd.DataFrame, attr).__code__.co_varnames]
        (
            "bfill",
            "clip",
            "drop",
            "drop_duplicates",
            "dropna",
            "eval",
            "ffill",
            "fillna",
            "interpolate",
            "mask",
            "query",
            "replace",
            "reset_index",
            "set_axis",
            "set_index",
            "sort_index",
            "sort_values",
            "where",
            "astype",
            "assign",
            "reindex",
            "rename",
        ),
    )


def make_series_immutable(series: pd.Series):
    """
    Makes the given Series immutable.
    I.e. after calling this method - one cannot modify the series using pandas interface.


    Upon a trial to modify an immutable dataframe, an exception of type ImmutablePandas is raised.
    """
    if getattr(series, "_is_immutable", False):
        return
    series._is_immutable = True
    series._set_with_engine = functools.wraps(series._set_with_engine)(_raise_immutable_exception)
    series._set_with = functools.wraps(series._set_with)(_raise_immutable_exception)

    # prevent modification through numpy arrays
    series._data.external_values = _make_numpy_result_readonly(series._data.external_values)
    series._data.internal_values = _make_numpy_result_readonly(series._data.internal_values)

    _prevent_inplace_argument_in_function_calls(
        series,
        # This list was obtained by manual inspection +
        #  [attr for attr in dir(d) if hasattr(getattr(pd.Series, attr, None), '__code__') and
        #  'inplace' in getattr(pd.Series, attr).__code__.co_varnames]
        (
            "astype",
            "bfill",
            "clip",
            "drop",
            "drop_duplicates",
            "dropna",
            "ffill",
            "fillna",
            "interpolate",
            "mask",
            "replace",
            "reset_index",
            "set_axis",
            "sort_index",
            "sort_values",
            "where",
            "_set_name",
        ),
    )


class ImmutablePandas(Exception):
    pass


def _raise_immutable_exception(*args, **kwargs):
    raise ImmutablePandas("Cannot modify immutable dataframe. Please use df.copy()")


def _get_df_or_series_from_args(args):
    if len(args) >= 2 and (isinstance(args[1], pd.DataFrame) or isinstance(args[1], pd.Series)):
        return args[1]


def _protect_indexer(loc_func):
    @functools.wraps(loc_func)
    def wrapper(*args, **kwargs):
        res = loc_func(*args, **kwargs)
        return res

    return wrapper


def _safe__init__(cls, self, *args, **kwargs):
    super(cls, self).__init__(*args, **kwargs)
    df_or_series = _get_df_or_series_from_args(args)
    if df_or_series is not None:
        if getattr(df_or_series, "_is_immutable", False):
            self._get_setitem_indexer = functools.wraps(self._get_setitem_indexer)(_raise_immutable_exception)


@functools.wraps(IndexingMixin.loc)
def _safe_loc(self):
    loc = _LocIndexer("loc", self)
    if getattr(self, "_is_immutable", False):
        # Edit also loc._setitem_with_indexer
        loc._get_setitem_indexer = functools.wraps(loc._get_setitem_indexer)(_raise_immutable_exception)
    return loc


@functools.wraps(IndexingMixin.iloc)
def _safe_iloc(self):
    iloc = _iLocIndexer("iloc", self)
    if getattr(self, "_is_immutable", False):
        # Edit also iloc._setitem_with_indexer
        iloc._get_setitem_indexer = functools.wraps(iloc._get_setitem_indexer)(_raise_immutable_exception)
    return iloc


# wraps
pd.DataFrame.loc = property(_safe_loc)
pd.Series.loc = property(_safe_loc)
pd.DataFrame.iloc = property(_safe_iloc)
pd.Series.iloc = property(_safe_iloc)


def _make_numpy_result_readonly(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        res = func(*args, **kwargs)
        if isinstance(res, np.ndarray):
            res.flags.writeable = False
        return res

    return wrapper


def _make_result_immutable(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        res = func(*args, **kwargs)
        if isinstance(res, pd.Series):
            make_series_immutable(res)
        return res

    return wrapper


def _prevent_inplace_operation(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # TODO: here we assume that in-place is not given as a positional.
        #  remove this assumption, either by hard-coding the position for each method or by parsing the
        #  function signature.
        if kwargs.get("inplace", False):
            _raise_immutable_exception()
        return func(*args, **kwargs)

    return wrapper


def _prevent_inplace_argument_in_function_calls(obj, attributes):
    for attr in attributes:
        member = getattr(obj, attr)
        setattr(obj, attr, _prevent_inplace_operation(member))



And the pytest file

import importlib
import warnings

import pandas as pd
import pytest

import immutable_pandas
from immutable_pandas import ImmutablePandas
from immutable_pandas import make_dataframe_immutable
from immutable_pandas import make_series_immutable


def create_immutable_dataframe() -> pd.DataFrame:
    # Cannot be used as a fixture because pytest copies objects transparently, which makes the tests flaky
    immutable_dataframe = pd.DataFrame({"x": [1, 2, 3, 4], "y": [4, 5, 6, 7]})
    make_dataframe_immutable(immutable_dataframe)
    return immutable_dataframe


def test_immutable_dataframe_cannot_change_with_direct_access():
    immutable_dataframe = create_immutable_dataframe()
    immutable_dataframe2 = immutable_dataframe.query("x == 2")
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        immutable_dataframe2["moshe"] = 123
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.x = 2
    with pytest.raises(ImmutablePandas):
        immutable_dataframe["moshe"] = 56
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.insert(0, "z", [1, 2, 3, 4])


def test_immutable_dataframe_cannot_change_with_inplace_operations():
    immutable_dataframe = create_immutable_dataframe()
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.eval("y=x+1", inplace=True)
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.assign(y=2, inplace=True)


def test_immutable_dataframe_cannot_change_with_loc():
    immutable_dataframe = create_immutable_dataframe()
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.loc[2] = 1
    with pytest.raises(ImmutablePandas):
        immutable_dataframe.iloc[1] = 4


def test_immutable_dataframe_cannot_change_with_columns_access():
    immutable_dataframe = create_immutable_dataframe()
    with pytest.raises(ImmutablePandas):
        immutable_dataframe["x"][2] = 123
    with pytest.raises(ImmutablePandas):
        immutable_dataframe["x"].loc[2] = 123


def test_immutable_dataframe_cannot_del_column():
    immutable_dataframe = create_immutable_dataframe()
    with pytest.raises(ImmutablePandas):
        del immutable_dataframe["x"]


def test_immutable_dataframe_cannot_be_modified_through_values():
    immutable_dataframe = create_immutable_dataframe()
    wit