I have a DataFrame whose columns are all of dtype int64.
City Val ...
0 3 1
1 2 43
2 0 32
3 1 54
I also have a list of category names:
names = ['Sydney', 'Tokyo', 'Vancouver', 'Toronto']
I want to fill the City column with city names, using each value as an index into the names list (i.e. 0 = 'Sydney', 1 = 'Tokyo').
Desired result:
City Val ...
0 Toronto 1
1 Vancouver 43
2 Sydney 32
3 Tokyo 54
I tried: df['City'].loc[df['City'].isin(names), df['City']]=names.index(df['City']), but got an error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I would also like to convert the City column to a categorical dtype.
df['City'] = df['City'].astype('category')
df['City'] = df['City'].cat.set_categories(names, ordered=True)
Use Series.map with a dictionary created by enumerate:
names = ['Sydney', 'Tokyo', 'Vancouver', 'Toronto']
df['City'] = df['City'].map(dict(enumerate(names)))
print (df)
City Val
0 Toronto 1
1 Vancouver 43
2 Sydney 32
3 Tokyo 54
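To make the snippet above reproducible end to end, here is a minimal self-contained sketch. The DataFrame is rebuilt from the question's sample data, and the extra row with code 7 is a made-up value added only to show what map does with a code that has no entry in the dictionary (it becomes NaN rather than raising).

```python
import pandas as pd

# Sample data from the question, plus one hypothetical row (code 7)
# that has no matching entry in `names`.
df = pd.DataFrame({"City": [3, 2, 0, 1, 7], "Val": [1, 43, 32, 54, 5]})
names = ["Sydney", "Tokyo", "Vancouver", "Toronto"]

# Position in `names` -> city name, e.g. {0: 'Sydney', 1: 'Tokyo', ...}
code_to_name = dict(enumerate(names))

# Series.map looks each value up in the dict; missing keys become NaN.
mapped = df["City"].map(code_to_name)
print(mapped.tolist())
```

If unmatched codes should be treated as an error instead, checking mapped.isna().any() after the call is one simple way to detect them.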
Details:
print (dict(enumerate(names)))
{0: 'Sydney', 1: 'Tokyo', 2: 'Vancouver', 3: 'Toronto'}
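As for the ValueError in the question, a likely explanation is that list.index expects a single value: it compares each list element against the whole Series, gets a boolean Series back, and then tries to interpret that Series as one True/False. The sketch below reproduces this with a standalone Series (the variable names are illustrative) and contrasts it with an element-wise lookup.

```python
import pandas as pd

names = ["Sydney", "Tokyo", "Vancouver", "Toronto"]
codes = pd.Series([3, 2, 0, 1])

# list.index() takes one scalar; passing a whole Series forces Python to
# evaluate `element == codes` as a single boolean, which pandas refuses.
try:
    names.index(codes)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# An element-wise lookup works: each integer code indexes into the list.
looked_up = codes.map(lambda code: names[code])
print(looked_up.tolist())
```

The dictionary form used in the answer does the same lookup without a Python-level lambda, and returns NaN instead of raising IndexError for codes outside the list.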
Then, for the categorical:
df['City'] = pd.CategoricalIndex(df['City'].map(dict(enumerate(names))),
                                 ordered=True,
                                 categories=names)
Or:
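# Recent pandas versions reject categories=/ordered= keywords in astype; pass a CategoricalDtype instead
df['City'] = (df['City'].map(dict(enumerate(names)))
                        .astype(pd.CategoricalDtype(categories=names, ordered=True)))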
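Because the City column already holds integer positions into names, another option (not part of the answer above, offered here only as an alternative sketch) is pd.Categorical.from_codes, which builds the ordered categorical directly from those integers without an intermediate string column. It assumes every code lies between 0 and len(names) - 1, or is -1 for missing.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"City": [3, 2, 0, 1], "Val": [1, 43, 32, 54]})
names = ["Sydney", "Tokyo", "Vancouver", "Toronto"]

# Treat the existing integers as category codes into `names`.
df["City"] = pd.Categorical.from_codes(
    df["City"].to_numpy(), categories=names, ordered=True
)

print(df["City"].dtype)
# With ordered=True, sorting follows the order of `names`, not the alphabet.
print(df.sort_values("City")["City"].tolist())
```

Unlike the map approach, from_codes raises on an out-of-range code rather than producing NaN, which may or may not be the behavior you want.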