Pandas: Fill a column's values for each user with the first non-null value

use*_*628 1 python dataframe pandas

Below is a sample of the data I am working with.

userID | preference
------------------- 
user1  | NaN
user1  | NaN
user1  | coffee
user2  | NaN
user2  | tea
user2  | NaN 
user3  | NaN 
user3  | NaN 
user3  | NaN 
.
.
.

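Some users have null preferences. I want to fill the preference column with the first non-null string that exists for each user. My final DataFrame output should look like this: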

userID | preference 
-------------------
user1  | coffee
user1  | coffee
user1  | coffee
user2  | tea
user2  | tea
user2  | tea 
.
.
.

Hen*_*ker 5

Use groupby transform with 'first'. For each group, first returns the first valid (non-null) value, if one exists:

df["preference"] = df.groupby("userID")["preference"].transform('first')

df

  userID preference
0  user1     coffee
1  user1     coffee
2  user1     coffee
3  user2        tea
4  user2        tea
5  user2        tea
6  user3       None
7  user3       None
8  user3       None

DataFrame and imports:

import pandas as pd
from numpy import nan

df = pd.DataFrame({
    'userID': {0: 'user1', 1: 'user1', 2: 'user1', 3: 'user2', 4: 'user2',
               5: 'user2', 6: 'user3', 7: 'user3', 8: 'user3'},
    'preference': {0: nan, 1: nan, 2: 'coffee', 3: nan, 4: 'tea', 5: nan,
                   6: nan, 7: nan, 8: nan}
})
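
For anyone who wants to run the whole thing in one go, here is a minimal end-to-end sketch of the approach above. The names `filled` and `result` are mine, chosen only for illustration, and the frame is built from lists rather than the index-keyed dicts shown earlier. A user with no non-null preference at all (user3 here) stays null; whether that shows up as None or NaN may depend on your pandas version, so the check uses isna() instead of comparing against a specific missing marker.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question, built from plain lists.
df = pd.DataFrame({
    "userID": ["user1"] * 3 + ["user2"] * 3 + ["user3"] * 3,
    "preference": [np.nan, np.nan, "coffee",
                   np.nan, "tea", np.nan,
                   np.nan, np.nan, np.nan],
})

# For each userID group, 'first' picks the first non-null preference;
# transform broadcasts that value back onto every row of the group.
filled = df.groupby("userID")["preference"].transform("first")

# Hypothetical name: a copy of df with the filled column, original left untouched.
result = df.assign(preference=filled)

# Users with no non-null preference remain null.
still_missing = result.loc[result["preference"].isna(), "userID"].unique()
print(result)
print("users with no preference at all:", list(still_missing))
```

Using assign keeps the original df intact, which is handy if you want to compare before and after. Assigning back to df["preference"] as in the answer works just as well.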