How to collapse multiple columns into one in pandas

hed*_*dge 5 python dataframe pandas

I have a pandas DataFrame populated with users and categories, but the values for those categories are spread across multiple columns.

|   user  |       category    | val1 | val2 | val3 |
| ------  | ------------------| -----| ---- | ---- |
| user 1  | c1                |   3  |  NA  | None |
| user 1  | c2                |   NA |  4   | None |
| user 1  | c3                |   NA |  NA  | 7    |
| user 2  | c1                |   5  |  NA  | None |
| user 2  | c2                |   NA |  7   | None |
| user 2  | c3                |   NA |  NA  | 2    |

I would like to condense the values into a single column.

|   user  |       category    | value|
| ------  | ------------------| -----| 
| user 1  | c1                |   3  | 
| user 1  | c2                |   4  | 
| user 1  | c3                |   7  |
| user 2  | c1                |   5  | 
| user 2  | c2                |   7  | 
| user 2  | c3                |   2  |

The end goal is a matrix like this:

np.array([[3, 4, 7], [5, 7, 2]])

jpp*_*jpp 6

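You can use pd.DataFrame.bfill to backfill values across the selected columns. Back-filling along axis=1 pulls each row's first non-null value into the leftmost column, which is then selected with iloc.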

val_cols = ['val1', 'val2', 'val3']

df['value'] = pd.to_numeric(df[val_cols].bfill(axis=1).iloc[:, 0], errors='coerce')

print(df)

     user category  val1  val2  val3  value
0  user 1       c1   3.0   NaN  None    3.0
1  user 1       c2   NaN   4.0  None    4.0
2  user 1       c3   NaN   NaN  7       7.0
3  user 2       c1   5.0   NaN  None    5.0
4  user 2       c2   NaN   7.0  None    7.0
5  user 2       c3   NaN   NaN  2       2.0
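The snippet stops at the single value column, so the matrix asked for in the question needs one more reshape. Here is a minimal sketch, assuming each (user, category) pair occurs exactly once so that DataFrame.pivot applies. The frame is rebuilt by hand purely for illustration, with every missing cell entered as NaN, and the name matrix is my own.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question; all missing cells are NaN here
# (an assumption made to keep the sketch simple and fully numeric).
df = pd.DataFrame({
    "user": ["user 1"] * 3 + ["user 2"] * 3,
    "category": ["c1", "c2", "c3"] * 2,
    "val1": [3, np.nan, np.nan, 5, np.nan, np.nan],
    "val2": [np.nan, 4, np.nan, np.nan, 7, np.nan],
    "val3": [np.nan, np.nan, 7, np.nan, np.nan, 2],
})
val_cols = ["val1", "val2", "val3"]

# Back-fill along each row, then keep the leftmost column: the first non-null value.
df["value"] = pd.to_numeric(df[val_cols].bfill(axis=1).iloc[:, 0], errors="coerce")

# One row per user, one column per category, then convert to a NumPy array.
matrix = df.pivot(index="user", columns="category", values="value").to_numpy()
print(matrix)
```

With the sample data this should give the 2x3 array from the question, as floats. If a (user, category) pair can repeat, pivot raises an error, and pivot_table with an explicit aggregation is probably the better fit.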