use*_*157 6 python python-3.x pandas
I currently have a DataFrame like this:
Word    Word2         Word3
Hello   NaN           NaN
My      My Name       NaN
Yellow  Yellow Bee    Yellow Bee Hive
Golden  Golden Gates  NaN
Yellow  NaN           NaN
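What I want is to remove all the NaN cells from my DataFrame, so that in the end it looks like this, with 'Yellow Bee Hive' moved up to row 1 (similar to what happens when you delete cells from a column in Excel):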
   Word    Word2         Word3
1  Hello   My Name       Yellow Bee Hive
2  My      Yellow Bee
3  Yellow  Golden Gates
4  Golden
5  Yellow
Unfortunately, neither of the following works, because they drop entire rows!
df = df[pd.notnull(df['Word','Word2','Word3'])]
or
df = df.dropna()
Does anyone have any suggestions? Should I reindex the table?
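import functools

import numpy as np
import pandas as pd

def drop_and_roll(col, na_position='last', fillvalue=np.nan):
    # Pack the non-null values of `col` together and pad the rest with `fillvalue`.
    result = np.full(len(col), fillvalue, dtype=col.dtype)
    mask = col.notnull()
    N = mask.sum()
    if na_position == 'last':
        # Non-null values first, padding at the end.
        result[:N] = col.loc[mask].to_numpy()
    elif na_position == 'first':
        # Padding first, non-null values at the end. Slice from len(col) - N
        # rather than -N, so that N == 0 selects an empty slice.
        result[len(col) - N:] = col.loc[mask].to_numpy()
    else:
        raise ValueError('na_position {!r} unrecognized'.format(na_position))
    return result

# 'data' is a text file whose columns are separated by two or more spaces.
df = pd.read_table('data', sep=r'\s{2,}', engine='python')
print(df.apply(functools.partial(drop_and_roll, fillvalue='')))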
yields
     Word         Word2            Word3
0   Hello       My Name  Yellow Bee Hive
1      My    Yellow Bee
2  Yellow  Golden Gates
3  Golden
4  Yellow
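For comparison, the same "shift up" effect can probably be had with pandas alone, by dropping the NaNs in each column separately and renumbering what is left. This is only a sketch of an alternative, not the approach used above: the frame is rebuilt by hand from the sample in the question, and the name `compact` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt by hand from the question (NaN marks the empty cells).
df = pd.DataFrame({
    'Word':  ['Hello', 'My', 'Yellow', 'Golden', 'Yellow'],
    'Word2': [np.nan, 'My Name', 'Yellow Bee', 'Golden Gates', np.nan],
    'Word3': [np.nan, np.nan, 'Yellow Bee Hive', np.nan, np.nan],
})

# Drop the NaNs in each column independently, then renumber from 0 so the
# surviving values sit at the top. apply() aligns the shorter columns on the
# new index and pads their tails with NaN.
compact = df.apply(lambda col: col.dropna().reset_index(drop=True))

# Replace the padding with empty strings for display only.
print(compact.fillna(''))
```

Unlike `df.dropna()`, which removes every row containing a NaN, this treats each column on its own, so values from the same original row no longer stay on the same line. That is what the question asks for, but it is worth keeping in mind if the rows are meant to stay related.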