Name of the first column with a non-null value, by row, in pandas

Nat*_*ent 2 python numpy pandas

I want to know the first year in which each of various items had revenue.

Given the following DataFrame:

ID  Y1      Y2      Y3
0   NaN     8       4
1   NaN     NaN     1
2   NaN     NaN     NaN
3   5       3       NaN

For each row, I want to return the name of the first column that contains a non-null value.

In this case, I would want to get back:

['Y2','Y3',NaN,'Y1']

My goal is to add this as a column to the original DataFrame.

The following code mostly works, but it is really clunky.

import pandas as pd
import numpy as np

df = pd.DataFrame({'Y1':[np.nan, np.nan, np.nan, 5],'Y2':[8, np.nan, np.nan, 3], 'Y3':[4, 1, np.nan, np.nan]})
df['first'] = np.nan

for ID in df.index:
    row = df.loc[ID]
    # Scan the row left to right and stop at the first non-null value.
    for i in range(len(row)):
        if pd.notnull(row.iloc[i]):
            df.loc[ID, 'first'] = row.index[i]
            break

This yields:

   Y1  Y2  Y3  first
0 NaN  8   4   Y2   
1 NaN NaN  1   Y3   
2 NaN NaN NaN  first
3  5   3  NaN  Y1   

Does anyone know of a more elegant solution?

Ale*_*der 8

You can apply first_valid_index to each row of the DataFrame with a lambda expression, passing axis=1 so that the function is applied row by row.

>>> df.apply(lambda row: row.first_valid_index(), axis=1)
ID
0      Y2
1      Y3
2    None
3      Y1
dtype: object

To apply this to your DataFrame:

df = df.assign(first = df.apply(lambda row: row.first_valid_index(), axis=1))

>>> df
    Y1  Y2  Y3 first
ID                  
0  NaN   8   4    Y2
1  NaN NaN   1    Y3
2  NaN NaN NaN  None
3    5   3 NaN    Y1
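
For larger frames, a vectorized variant may be worth trying in place of the row-wise apply. The sketch below is not part of the answer above; it swaps in a different technique: build a boolean mask with notna(), take idxmax along axis=1 to get the label of the first True in each row, and blank out rows that contain no valid value at all. The names mask and first_col are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Y1': [np.nan, np.nan, np.nan, 5],
    'Y2': [8, np.nan, np.nan, 3],
    'Y3': [4, 1, np.nan, np.nan],
})

# True wherever a value is present.
mask = df.notna()

# idxmax returns the label of the first True in each row. For a row that is
# entirely False it would return the first column label, so those rows are
# replaced with NaN via where().
first_col = mask.idxmax(axis=1).where(mask.any(axis=1))

df = df.assign(first=first_col)
```

Unlike first_valid_index, which gives None for an all-null row, this version leaves NaN there, so the two results differ only in how the missing case is represented.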