Pandas - Create boolean columns from a categorical column

Hon*_*zaB 6 python dataframe pandas

I have a column Place in a Pandas DataFrame that looks like this:

**Place**
Berlin
Prague
Mexico
Prague
Mexico
...

I would like to turn it into the following:

is_Berlin   is_Prague   is_Mexico
1           0           0
0           1           0
0           0           1
0           1           0
0           0           1   

I know I can create the columns individually:

df['is_Berlin'] = df['Place']
df['is_Prague'] = df['Place']
df['is_Mexico'] = df['Place']

and then create a dictionary for each column and apply map with it.

# Example just for the is_Berlin column
d = {'Berlin': 1,'Prague': 0,'Mexico': 0} 
df['is_Berlin'] = df['is_Berlin'].map(d)

But I find this rather tedious, and I believe there must be a nice Pythonic way to do it.

jez*_*ael 8

You can use str.get_dummies, and if you need to add the new columns to the original DataFrame, use concat:

df1 = df.Place.str.get_dummies()
print(df1)
   Berlin  Mexico  Prague
0       1       0       0
1       0       0       1
2       0       1       0
3       0       0       1
4       0       1       0

df1.columns = ['is_' + col for col in df1.columns]
print(df1)
   is_Berlin  is_Mexico  is_Prague
0          1          0          0
1          0          0          1
2          0          1          0
3          0          0          1
4          0          1          0
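The steps above can be reproduced end to end with the sample data from the question. This is only a minimal sketch, assuming Python 3 and a reasonably recent pandas. The DataFrame construction is added here for illustration, and add_prefix is used in place of the list comprehension as a shorter way to rename the columns.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'Place': ['Berlin', 'Prague', 'Mexico', 'Prague', 'Mexico']})

# One indicator column per distinct value; the columns come out sorted by name
df1 = df['Place'].str.get_dummies()

# Prefix every column name with 'is_'
df1 = df1.add_prefix('is_')
print(df1)
```

Note that the columns are ordered alphabetically rather than by first appearance, which is why is_Mexico comes before is_Prague.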
df = pd.concat([df, df1], axis=1)
print(df)
    Place  is_Berlin  is_Mexico  is_Prague
0  Berlin          1          0          0
1  Prague          0          0          1
2  Mexico          0          1          0
3  Prague          0          0          1
4  Mexico          0          1          0

# if there are other columns, you can drop the Place column
df = df.drop('Place', axis=1)
print(df)
   is_Berlin  is_Mexico  is_Prague
0          1          0          0
1          0          0          1
2          0          1          0
3          0          0          1
4          0          1          0
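As a possible alternative that is not part of the answer above, the top-level pd.get_dummies function can build the prefixed names directly through its prefix and prefix_sep parameters. The sketch below assumes Python 3 and a pandas version that supports the dtype argument. dtype=int is passed because newer pandas versions return boolean columns by default, while the question asks for 0/1. The names dummies and result are chosen here for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({'Place': ['Berlin', 'Prague', 'Mexico', 'Prague', 'Mexico']})

# prefix and prefix_sep produce names of the form 'is_<value>';
# dtype=int forces 0/1 integers instead of booleans
dummies = pd.get_dummies(df['Place'], prefix='is', prefix_sep='_', dtype=int)

# Attach the indicator columns by index, then drop the original column
result = df.join(dummies).drop(columns='Place')
print(result)
```

If True/False values are preferred over 0/1, dtype=bool can be passed instead.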