Subclassing a pandas DataFrame and setting a field in the constructor

jwi*_*720 5 python dataframe pandas

I am trying to subclass a pandas data structure. If I set a field on the instance after creating it, it works fine.

import seaborn as sns
import pandas as pd
df = sns.load_dataset('iris')

class Results(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        # use the __init__ method from DataFrame to ensure
        # that we're inheriting the correct behavior
        super(Results, self).__init__(*args, **kwargs)

    @property
    def _constructor(self):
        return Results
    
result_object = Results(df)
result_object['scheme'] = 'not_default'
print(result_object.head(5))

>>>
   sepal_length  sepal_width  petal_length  petal_width species       scheme
0           5.1          3.5           1.4          0.2  setosa  not_default
1           4.9          3.0           1.4          0.2  setosa  not_default
2           4.7          3.2           1.3          0.2  setosa  not_default
3           4.6          3.1           1.5          0.2  setosa  not_default
4           5.0          3.6           1.4          0.2  setosa  not_default

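However, setting the field inside the constructor does not work, and I don't understand what the _constructor method does under the hood well enough to say why.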

import seaborn as sns
import pandas as pd
df = sns.load_dataset('iris')

class Results(pd.DataFrame):
    def __init__(self, *args,scheme='default', **kwargs):
        # use the __init__ method from DataFrame to ensure
        # that we're inheriting the correct behavior
        super(Results, self).__init__(*args, **kwargs)
        self['scheme'] = scheme

    @property
    def _constructor(self):
        return Results

result_object = Results(df.copy(),scheme='not_default')
print(result_object.head(5))

>>>
# scheme is still 'default'
   sepal_length  sepal_width  petal_length  petal_width species   scheme
0           5.1          3.5           1.4          0.2  setosa  default
1           4.9          3.0           1.4          0.2  setosa  default
2           4.7          3.2           1.3          0.2  setosa  default
3           4.6          3.1           1.5          0.2  setosa  default
4           5.0          3.6           1.4          0.2  setosa  default

Note that the scheme field still shows the default value.

Is there any way to set a field in the instance constructor?

tdy*_*tdy 4

Your current version creates scheme as an attribute (like .index, .columns):

result_object.scheme

# 0      not_default
# 1      not_default
#           ...     
# 148    not_default
# 149    not_default
# Name: scheme, Length: 150, dtype: object
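
One way to check why head() shows the default is to record every call to __init__. The sketch below is an illustration, not part of the original answer: it uses a small hand-made frame instead of the iris dataset, and the init_calls list is a hypothetical helper added only for tracing. It assumes pandas builds the result of head() through _constructor, which runs __init__ again with scheme left at its default.

```python
import pandas as pd


class Results(pd.DataFrame):
    # Hypothetical tracing helper: the scheme seen by each __init__ call.
    init_calls = []

    def __init__(self, *args, scheme="default", **kwargs):
        super().__init__(*args, **kwargs)
        Results.init_calls.append(scheme)
        self["scheme"] = scheme

    @property
    def _constructor(self):
        return Results


df = pd.DataFrame({"a": [1, 2, 3]})
result_object = Results(df.copy(), scheme="not_default")

# head() slices the frame; the slice is wrapped in a new Results object,
# so __init__ runs again without the scheme keyword.
head = result_object.head(2)

print(Results.init_calls)
print(head["scheme"].tolist())
```

If init_calls contains 'default' after the call to head(), the column was overwritten while the slice was being constructed, not during the original Results(...) call.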

To make it a proper column, you can modify the incoming data before passing it to super():

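import pandas as pd
import seaborn as sns

class Results(pd.DataFrame):
    def __init__(self, data=None, *args, scheme='default', **kwargs):

        # add column to incoming data (note: this modifies the passed-in frame,
        # hence the df.copy() below); internal calls made through _constructor
        # may pass something other than a DataFrame, depending on the pandas
        # version, and then the column is left untouched
        if isinstance(data, pd.DataFrame):
            data['scheme'] = scheme

        # pass data positionally so extra positional args do not collide with it
        super(Results, self).__init__(data, *args, **kwargs)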

    @property
    def _constructor(self):
        return Results

df = sns.load_dataset('iris')
result_object = Results(df.copy(), scheme='not_default')

#    sepal_length  sepal_width  petal_length  petal_width species       scheme
# 0           5.1          3.5           1.4          0.2  setosa  not_default
# 1           4.9          3.0           1.4          0.2  setosa  not_default
# 2           4.7          3.2           1.3          0.2  setosa  not_default
# 3           4.6          3.1           1.5          0.2  setosa  not_default
# ...         ...          ...           ...          ...     ...          ...
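
The isinstance check depends on what pandas hands to _constructor internally, and that may differ between pandas versions. An alternative that avoids the dependency is to leave __init__ alone and add the column in a separate factory method. A minimal sketch, assuming a hypothetical classmethod named with_scheme and a small hand-made frame in place of the iris data:

```python
import pandas as pd


class Results(pd.DataFrame):
    @property
    def _constructor(self):
        return Results

    @classmethod
    def with_scheme(cls, data, scheme="default"):
        # Hypothetical factory (not a pandas API): copy the input so the
        # caller's frame is not modified, then add the column once.
        obj = cls(data, copy=True)
        obj["scheme"] = scheme
        return obj


df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})
result_object = Results.with_scheme(df, scheme="not_default")

# __init__ is not overridden, so objects rebuilt through _constructor
# (for example by head()) keep whatever the column already holds.
head = result_object.head(2)
print(type(head).__name__)
print(head["scheme"].tolist())
```

Because the column is written only in with_scheme, slices and copies made by pandas do not reset it to the default.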