use*_*628 1 python dataframe pandas
Below is a sample of the data I am working with.
userID | preference
-------------------
user1 | NaN
user1 | NaN
user1 | coffee
user2 | NaN
user2 | tea
user2 | NaN
user3 | NaN
user3 | NaN
user3 | NaN
.
.
.
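Some users have missing (null) preferences. I want to fill the preference column with the first non-null string that exists for each user. My final DataFrame output should look like this: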
userID | preference
-------------------
user1 | coffee
user1 | coffee
user1 | coffee
user2 | tea
user2 | tea
user2 | tea
.
.
.
Use groupby transform together with first. first returns the first valid (non-null) value of each group, if one exists:
df["preference"] = df.groupby("userID")["preference"].transform('first')
df:
  userID preference
0  user1     coffee
1  user1     coffee
2  user1     coffee
3  user2        tea
4  user2        tea
5  user2        tea
6  user3       None
7  user3       None
8  user3       None
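Note that transform('first') overwrites every row in a group, not only the missing ones. If a user could have more than one distinct non-null preference and the existing values should be kept, one possible variant is to pass the transformed column to fillna instead. The sketch below is only an illustration under that assumption: the user4 rows and the preference_filled column name are hypothetical and not part of the original data.

```python
import numpy as np
import pandas as pd

# Sample data in the spirit of the question. The user4 rows are hypothetical,
# added to show a user with two different non-null preferences.
df = pd.DataFrame({
    "userID": ["user1", "user1", "user1",
               "user3", "user3",
               "user4", "user4", "user4"],
    "preference": [np.nan, np.nan, "coffee",
                   np.nan, np.nan,
                   "tea", np.nan, "milk"],
})

# First non-null preference per user, broadcast back to every row of the group.
first_pref = df.groupby("userID")["preference"].transform("first")

# Fill only the missing entries; rows that already hold a value are untouched.
# "preference_filled" is a hypothetical column name chosen for this example.
df["preference_filled"] = df["preference"].fillna(first_pref)

print(df)
```

Here user4 keeps its existing milk row and only the missing row is filled with tea, whereas assigning first_pref directly would set all three rows to tea. Users with no valid value at all (user3) stay missing in both variants.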
DataFrame and imports:
import pandas as pd
from numpy import nan
df = pd.DataFrame({
    'userID': {0: 'user1', 1: 'user1', 2: 'user1', 3: 'user2', 4: 'user2',
               5: 'user2', 6: 'user3', 7: 'user3', 8: 'user3'},
    'preference': {0: nan, 1: nan, 2: 'coffee', 3: nan, 4: 'tea', 5: nan,
                   6: nan, 7: nan, 8: nan}
})