Python: Fill "NA" in a pandas column using random elements from a list

baz*_*nga 1 python pandas

I am trying to fill the "NA" values in a pandas column by randomly selecting elements from a list.

For example:

import pandas as pd
df = pd.DataFrame()
df['A'] = [1, 2, None, 5, 53, None]
fill_list = [22, 56, 84]

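Is it possible to write a function that takes a pandas DF with a column name as input and replaces every NA with an element randomly selected from the list 'fill_list'?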

fun(df['column_name'], fill_list)

jez*_*ael 5

Create a new Series with numpy.random.choice, then replace the NaNs with fillna or combine_first:

import numpy as np

# draw one random candidate per row, then use it only where 'A' is NaN
df['A'] = df['A'].fillna(pd.Series(np.random.choice(fill_list, size=len(df.index))))
# alternative
# df['A'] = df['A'].combine_first(pd.Series(np.random.choice(fill_list, size=len(df.index))))
print(df)
      A
0   1.0
1   2.0
2  84.0
3   5.0
4  53.0
5  56.0
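One caveat with the fillna / combine_first version: both align the filler Series with the column by index label. A bare pd.Series(...) gets the labels 0..n-1, which lines up with the default index used in the question; for any other index the filler needs index=df.index. A minimal sketch, assuming a made-up string index purely for illustration:

```python
import numpy as np
import pandas as pd

fill_list = [22, 56, 84]
# Hypothetical frame with a non-default index (labels invented for this example)
df = pd.DataFrame({'A': [1, 2, None, 5, 53, None]}, index=list('abcdef'))

# Give the filler the same labels as df; a filler labelled 0..5 would not
# align with 'a'..'f' and the NaNs would stay in place.
filler = pd.Series(np.random.choice(fill_list, size=len(df.index)), index=df.index)
df['A'] = df['A'].fillna(filler)
print(df)
```

The mask-based variant further down assigns by position within the selected rows, so it does not depend on the index labels.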

Or:

# get boolean mask of NaNs
mask = df['A'].isnull()
# count rows with NaNs
n_missing = mask.sum()
# draw one random element per missing row
s = np.random.choice(fill_list, size=n_missing)
# set the NaN positions
df.loc[mask, 'A'] = s
print(df)
      A
0   1.0
1   2.0
2  56.0
3   5.0
4  53.0
5  56.0
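To get the function form the question asks for, the mask approach can be wrapped up roughly as follows. This is only a sketch: the name fill_na_random and the seed parameter are made up here, and it uses NumPy's Generator API (np.random.default_rng) instead of np.random.choice so the draw can be made reproducible.

```python
import numpy as np
import pandas as pd

def fill_na_random(series, fill_list, seed=None):
    """Return a copy of `series` with each NaN replaced by a random pick from `fill_list`.

    `fill_na_random` and `seed` are hypothetical names for this example.
    """
    rng = np.random.default_rng(seed)
    out = series.copy()                 # leave the caller's Series untouched
    mask = out.isna()                   # True where a value is missing
    # one independent draw (with replacement) per missing value
    out.loc[mask] = rng.choice(fill_list, size=mask.sum())
    return out

df = pd.DataFrame({'A': [1, 2, None, 5, 53, None]})
fill_list = [22, 56, 84]
df['A'] = fill_na_random(df['A'], fill_list, seed=0)
print(df)
```

Returning a new Series rather than modifying in place keeps the call close to the fun(df['column_name'], fill_list) shape from the question; assign the result back to the column as shown.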