Nic*_*e.P 5 python dataframe pandas
I have a fruit dataset containing name, colour, weight, size, and seeds.
Fruit dataset
Name Colour Weight Size Seeds Unnamed
Apple Apple Red 10.0 Big Yes
Apple Apple Red 5.0 Small Yes
Pear Pear Green 11.0 Big Yes
Banana Banana Yellow 4.0 Small Yes
Orange Orange Orange 5.0 Small Yes
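The problem is that the Colour column duplicates the Name column, and the values are shifted one column to the right. This creates a useless column (Unnamed) that holds the values belonging to Seeds. Is there a simple way to remove the duplicated values in Colour and shift the remaining column values, starting from Weight, back one column to the left? I hope I'm not confusing anyone here.
Desired result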
Fruit dataset
Name Colour Weight Size Seeds Unnamed(will be dropped)
Apple Red 10.0 Big Yes
Apple Red 5.0 Small Yes
Pear Green 11.0 Big Yes
Banana Yellow 4.0 Small Yes
Orange Orange 5.0 Small Yes
You can do it like this:
In [23]: df
Out[23]:
Name Colour Weight Size Seeds Unnamed
0 Apple Apple Red 10.0 Big Yes
1 Apple Apple Red 5.0 Small Yes
2 Pear Pear Green 11.0 Big Yes
3 Banana Banana Yellow 4.0 Small Yes
4 Orange Orange Orange 5.0 Small Yes
In [24]: cols = df.columns[:-1]
In [25]: cols
Out[25]: Index(['Name', 'Colour', 'Weight', 'Size', 'Seeds'], dtype='object')
In [26]: df = df.drop(columns='Colour')  # the positional form drop('Colour', 1) is rejected by pandas 2.x
In [27]: df.columns = cols
In [28]: df
Out[28]:
Name Colour Weight Size Seeds
0 Apple Red 10.0 Big Yes
1 Apple Red 5.0 Small Yes
2 Pear Green 11.0 Big Yes
3 Banana Yellow 4.0 Small Yes
4 Orange Orange 5.0 Small Yes
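The same idea can be wrapped in a small reusable function. This is only a sketch: the helper name `shift_left` and its `dup_col` and `source_col` parameters are hypothetical, and the sanity check that the duplicated column really matches Name is an extra assumption, not part of the answer above. The sample frame is rebuilt by hand from the question's data.

```python
import pandas as pd

# Rebuild the misaligned frame from the question (values typed in by hand).
df = pd.DataFrame(
    [
        ["Apple", "Apple", "Red", 10.0, "Big", "Yes"],
        ["Apple", "Apple", "Red", 5.0, "Small", "Yes"],
        ["Pear", "Pear", "Green", 11.0, "Big", "Yes"],
        ["Banana", "Banana", "Yellow", 4.0, "Small", "Yes"],
        ["Orange", "Orange", "Orange", 5.0, "Small", "Yes"],
    ],
    columns=["Name", "Colour", "Weight", "Size", "Seeds", "Unnamed"],
)


def shift_left(frame, dup_col="Colour", source_col="Name"):
    """Drop the duplicated column and re-label the rest one position to the left.

    Hypothetical helper: returns a new frame and leaves the input untouched.
    """
    # Extra guard (an assumption): only proceed if dup_col truly repeats source_col.
    if not (frame[dup_col] == frame[source_col]).all():
        raise ValueError(f"{dup_col!r} is not a copy of {source_col!r}")

    new_names = frame.columns[:-1]        # every header except the trailing junk one
    fixed = frame.drop(columns=dup_col)   # remove the duplicated values
    fixed.columns = new_names             # 5 remaining columns get the 5 kept names
    return fixed


fixed = shift_left(df)
print(fixed)
```

Because the headers are reassigned by position, this only works when exactly one column is duplicated and exactly one junk column trails at the end, as in the question.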