rya*_*les 14 python null dataframe pandas
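I am trying to remove a row from my data frame in which one of the columns has a null value. Most of the help I can find relates to removing NaN values, which hasn't worked for me so far.
Here is where I create the data frame: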
# successfully created data frame
df1 = ut.get_data(symbols, dates) # column heads are 'SPY', 'BBD'
# can't get rid of row containing null val in column BBD
# tried each of these with the others commented out but always had an
# error or sometimes I was able to get a new column of boolean values
# but i just want to drop the row
df1 = pd.notnull(df1['BBD']) # drops rows with null val, not working
df1 = df1.drop(2010-05-04, axis=0)
df1 = df1[df1.'BBD' != null]
df1 = df1.dropna(subset=['BBD'])
df1 = pd.notnull(df1.BBD)
# I know the date to drop but still wasn't able to drop the row
df1.drop([2015-10-30])
df1.drop(['2015-10-30'])
df1.drop([2015-10-30], axis=0)
df1.drop(['2015-10-30'], axis=0)
with pd.option_context('display.max_rows', None):
    print(df1)
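Here is my output: (the output image is not preserved)

Can someone tell me how to drop this row, preferably both by identifying the row through its null value and by its date?
I haven't been working with pandas for long, and I've been stuck on this for an hour. Any advice would be much appreciated.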
Mar*_*erc 20
This should do the job:
df = df.dropna(how='any',axis=0)
It removes every row (axis=0) that contains "any" null value.
Example:
import numpy as np
import pandas as pd
# Recreate a random DataFrame with NaN values
df = pd.DataFrame(index = pd.date_range('2017-01-01', '2017-01-10', freq='1d'))
# Average speed in miles per hour
df['A'] = np.random.randint(low=198, high=205, size=len(df.index))
df['B'] = np.random.random(size=len(df.index))*2
#Create dummy NaN value on 2 cells
df.iloc[2,1]=None
df.iloc[5,0]=None
print(df)
A B
2017-01-01 203.0 1.175224
2017-01-02 199.0 1.338474
2017-01-03 198.0 NaN
2017-01-04 198.0 0.652318
2017-01-05 199.0 1.577577
2017-01-06 NaN 0.234882
2017-01-07 203.0 1.732908
2017-01-08 204.0 1.473146
2017-01-09 198.0 1.109261
2017-01-10 202.0 1.745309
#Delete row with dummy value
df = df.dropna(how='any',axis=0)
print(df)
A B
2017-01-01 203.0 1.175224
2017-01-02 199.0 1.338474
2017-01-04 198.0 0.652318
2017-01-05 199.0 1.577577
2017-01-07 203.0 1.732908
2017-01-08 204.0 1.473146
2017-01-09 198.0 1.109261
2017-01-10 202.0 1.745309
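For contrast, here is a minimal sketch of what the `how` argument changes. The three-row frame is made up for illustration and is not the data from the answer: `how='any'` drops a row as soon as one value is null, while `how='all'` only drops rows in which every value is null.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 is complete, row 2 is partly null, row 3 is all null.
df = pd.DataFrame(
    {"A": [1.0, np.nan, np.nan], "B": [0.5, 0.7, np.nan]},
    index=pd.date_range("2017-01-01", periods=3, freq="D"),
)

any_dropped = df.dropna(how="any", axis=0)  # drop rows with at least one null
all_dropped = df.dropna(how="all", axis=0)  # drop rows only if every value is null

print(len(any_dropped))  # 1
print(len(all_dropped))  # 2
```

If only one column matters, as in the question, `how='any'` across all columns may remove more rows than intended.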
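For more details, see the reference documentation.
If everything is fine with your DataFrame, dropping NaNs should be this simple. If it still doesn't work, make sure the columns have the correct data types (keep pd.to_numeric in mind...)
---- Drop nulls across all columns ----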
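The data type remark can be illustrated with a small sketch. It assumes, purely for illustration, that the missing value arrived as the string "null" in an object column; nothing in the question confirms that this is what `ut.get_data` returns. In that case `dropna` sees an ordinary string and keeps the row, and `pd.to_numeric` with `errors='coerce'` converts it to a real NaN first.

```python
import pandas as pd

# Hypothetical column where the missing value is the string "null",
# so pandas does not treat it as NaN and dropna leaves the row alone.
df = pd.DataFrame({"BBD": ["1.5", "null", "2.0"]})
print(len(df.dropna(subset=["BBD"])))  # 3, nothing was dropped

# errors="coerce" turns anything that cannot be parsed as a number into NaN.
df["BBD"] = pd.to_numeric(df["BBD"], errors="coerce")
cleaned = df.dropna(subset=["BBD"])
print(len(cleaned))  # 2
```

Checking `df.dtypes` before and after is a quick way to see whether a column was stored as `object` rather than `float64`.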
df = df.dropna(how='any',axis=0)
--- If you want to drop NULLs based on a single column ---
df[~df['B'].isnull()]
A B
2017-01-01 203.0 1.175224
2017-01-02 199.0 1.338474
**2017-01-03 198.0 NaN** <- this row is removed
2017-01-04 198.0 0.652318
2017-01-05 199.0 1.577577
2017-01-06 NaN 0.234882
2017-01-07 203.0 1.732908
2017-01-08 204.0 1.473146
2017-01-09 198.0 1.109261
2017-01-10 202.0 1.745309
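The single-column filter can be written in a few equivalent ways. The sketch below uses a made-up three-row frame shaped like the one above and compares the boolean mask from this answer with `notna()` and with `dropna(subset=...)`. All three return a new frame, so the result has to be assigned to keep it.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one row with a null in B, one with a null in A.
df = pd.DataFrame(
    {"A": [198.0, np.nan, 203.0], "B": [np.nan, 0.23, 1.73]},
    index=pd.to_datetime(["2017-01-03", "2017-01-06", "2017-01-07"]),
)

by_mask = df[~df["B"].isnull()]      # boolean mask, as in the answer
by_notna = df[df["B"].notna()]       # same mask without the negation
by_subset = df.dropna(subset=["B"])  # dropna restricted to column B

# All three drop 2017-01-03 and keep the row where only A is null.
print(by_mask.equals(by_notna) and by_mask.equals(by_subset))
```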
Please excuse any mistakes.
To remove all null values, the dropna() method is helpful:
df.dropna(inplace=True)
To drop the rows that have a null value in a specific column, use this code:
df.dropna(subset=['column_name_to_remove'], inplace=True)
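The question also asks how to drop a row by its date, which the answers do not show. Here is a minimal sketch, assuming the frame is indexed by a `DatetimeIndex`; the values are invented stand-ins for the question's `SPY` and `BBD` columns. Two things in the question's attempts are worth noting: an unquoted `2015-10-30` is plain integer subtraction, and `drop` returns a new frame unless the result is assigned or `inplace=True` is passed.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's frame, indexed by date.
df1 = pd.DataFrame(
    {"SPY": [207.9, 208.1, 209.0], "BBD": [5.1, np.nan, 5.3]},
    index=pd.date_range("2015-10-29", periods=3, freq="D"),
)

# Unquoted 2015-10-30 evaluates to the integer 1975 (2015 - 10 - 30),
# so pass the label as a Timestamp and assign the result.
by_date = df1.drop(pd.Timestamp("2015-10-30"))

# Or look up the index labels of the rows whose BBD is null and drop those.
null_dates = df1.index[df1["BBD"].isna()]
by_null = df1.drop(null_dates)

print(by_date.equals(by_null))
```

Identifying the row through its null value is usually the more robust of the two, since it does not depend on knowing the date in advance.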