Nat*_*ent 2 python numpy pandas
I want to find the first year in which each project has incoming revenue.
Given the following DataFrame:
ID Y1 Y2 Y3
0 NaN 8 4
1 NaN NaN 1
2 NaN NaN NaN
3 5 3 NaN
For each row, I want to return the name of the first column that contains a non-null value.
In this case, I want to return:
['Y2','Y3',NaN,'Y1']
My goal is to add this as a column to the original DataFrame.
The following code mostly works, but it is really clunky.
import pandas as pd
import numpy as np
df = pd.DataFrame({'Y1':[np.nan, np.nan, np.nan, 5],'Y2':[8, np.nan, np.nan, 3], 'Y3':[4, 1, np.nan, np.nan]})
df['first'] = np.nan
for ID in df.index:
    row = df.loc[ID,]
    for i in range(0,len(row)):
        if (~pd.isnull(row[i])):
            df.loc[ID,'first'] = row.index[i]
            break
This yields:
Y1 Y2 Y3 first
0 NaN 8 4 Y2
1 NaN NaN 1 Y3
2 NaN NaN NaN first
3 5 3 NaN Y1
Does anyone know of a more elegant solution?
You can apply first_valid_index to each row of the DataFrame using a lambda expression, with axis=1 to specify that the function runs on rows.
>>> df.apply(lambda row: row.first_valid_index(), axis=1)
ID
0 Y2
1 Y3
2 None
3 Y1
dtype: object
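As a rough, self-contained sketch of what the call above does, the snippet below rebuilds the DataFrame from the question (without the ID index name) and first calls first_valid_index() on single rows before applying it to all of them. The variable names label_row0, label_row2 and first_cols are made up for illustration.

```python
import numpy as np
import pandas as pd

# The DataFrame from the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame({
    'Y1': [np.nan, np.nan, np.nan, 5],
    'Y2': [8, np.nan, np.nan, 3],
    'Y3': [4, 1, np.nan, np.nan],
})

# A single row is a Series indexed by column name, so first_valid_index()
# returns the label of its first non-null entry, or None if every entry is null.
label_row0 = df.loc[0].first_valid_index()
label_row2 = df.loc[2].first_valid_index()

# axis=1 passes each row to the lambda in turn and collects the results
# into a Series aligned with the DataFrame's index.
first_cols = df.apply(lambda row: row.first_valid_index(), axis=1)
print(first_cols)
```

Note that a row with no valid values gives None rather than NaN, which is why the output above shows None for ID 2.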
To apply it to your DataFrame:
>>> df = df.assign(first = df.apply(lambda row: row.first_valid_index(), axis=1))
>>> df
Y1 Y2 Y3 first
ID
0 NaN 8 4 Y2
1 NaN NaN 1 Y3
2 NaN NaN NaN None
3 5 3 NaN Y1
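If the row-wise apply turns out to be slow on a large frame, one possible alternative (not part of the answer above) is to work on the boolean mask of non-null cells. The sketch below assumes the year columns are exactly Y1, Y2 and Y3; the names year_cols and has_value are illustrative only.

```python
import numpy as np
import pandas as pd

# The DataFrame from the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame({
    'Y1': [np.nan, np.nan, np.nan, 5],
    'Y2': [8, np.nan, np.nan, 3],
    'Y3': [4, 1, np.nan, np.nan],
})

# Assumed list of year columns; restricting to these keeps any other
# column (such as a previously added 'first') out of the search.
year_cols = ['Y1', 'Y2', 'Y3']
has_value = df[year_cols].notna()

# idxmax(axis=1) gives the label of the first True in each row. A row with
# no True at all would report the first column, so those rows are masked
# to NaN with where().
df['first'] = has_value.idxmax(axis=1).where(has_value.any(axis=1))
print(df)
```

Unlike first_valid_index, this variant leaves NaN rather than None in rows where every year is missing, which matches the result asked for in the question.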