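I have a data frame that I want to aggregate, after removing the rows where the column I want to aggregate on is NA (or after selecting only the unique rows).
For example, in the data below I want to remove every row that has NA for week and keep the other rows unmodified: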
  OTHER_REV month quarter year       week       date       days daysinmonth
1   2785013     1 2009 Q1 2009 2009-01-05 2009-01-05 2009-01-05          31
2   2785013     1 2009 Q1 2009 2009-01-12 2009-01-05 2009-01-05          31
3   2785013     1 2009 Q1 2009 2009-01-19 2009-01-05 2009-01-05          31
4   2785013     1 2009 Q1 2009 2009-01-26 2009-01-05 2009-01-05          31
5   2785013     1  NA QNA 2009       <NA> 2009-01-16 2009-01-16          31
6   2785013     1  NA QNA 2009       <NA> 2009-01-17 2009-01-17          31
The desired result is:
  OTHER_REV month quarter year       week       date       days daysinmonth
1   2785013     1 2009 Q1 2009 2009-01-05 2009-01-05 2009-01-05          31
2   2785013     1 2009 Q1 2009 2009-01-12 2009-01-05 2009-01-05          31
3   2785013     1 2009 Q1 2009 2009-01-19 2009-01-05 2009-01-05          31
4   2785013     1 2009 Q1 2009 2009-01-26 2009-01-05 2009-01-05          31
I have tried combining grep with unique(data$stuff), and I have tried aggregate, but none of these approaches seems to work.
Here is the str of the data:
'data.frame': 1896 obs. of 34 variables:
$ OTHER_REV : num 2785013 2785013 2785013 2785013 2785013 ...
$ month : num 1 1 1 1 1 1 1 1 1 1 ...
$ quarter :Class 'yearqtr' num [1:1896] 2009 2009 2009 2009 NA ...
$ year : num 2009 2009 2009 2009 2009 ...
$ week : Date, format: "2009-01-05" "2009-01-12" "2009-01-19" "2009-01-26" ...
$ date : Date, format: "2009-01-05" "2009-01-05" "2009-01-05" "2009-01-05" ...
$ days : Date, format: "2009-01-05" "2009-01-05" "2009-01-05" "2009-01-05" ...
$ daysinmonth : int 31 31 31 31 31 31 31 31 31 31 ...
Calling unique on df$week yields:
[1] "2009-01-05" "2009-01-12" "2009-01-19" "2009-01-26" NA "2009-02-02"......
Try this:
data[ ! is.na(data$week), ]
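To see what that line does, here is a minimal sketch on a small made-up data frame. The name df and the three columns are assumptions for illustration, loosely modelled on the first rows in the question rather than the real 34-variable data. is.na(df$week) gives a logical vector, ! negates it, and the trailing comma keeps all columns.

```r
# Hypothetical sample data (illustrative only, not the asker's full data frame).
df <- data.frame(
  OTHER_REV = rep(2785013, 6),
  week = as.Date(c("2009-01-05", "2009-01-12", "2009-01-19",
                   "2009-01-26", NA, NA)),
  date = as.Date(c("2009-01-05", "2009-01-05", "2009-01-05",
                   "2009-01-05", "2009-01-16", "2009-01-17"))
)

# Logical index: TRUE where week is present, FALSE where it is NA.
keep <- !is.na(df$week)

# Row subsetting; the empty slot after the comma means "all columns".
kept <- df[keep, ]

# subset() expresses the same filter without repeating the data frame name.
kept_subset <- subset(df, !is.na(week))
```

Both kept and kept_subset should hold only the four rows with a non-missing week; the other columns are left untouched.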
A similar answer using data.table is slightly simpler:
data[ ! is.na(week) ]
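Note that this form only works if data is a data.table, not a plain data.frame. A minimal sketch, assuming the data.table package is installed and using a made-up two-column table (the name dt and its values are illustrative):

```r
library(data.table)

# Hypothetical sample data (illustrative only).
dt <- data.table(
  OTHER_REV = rep(2785013, 6),
  week = as.Date(c("2009-01-05", "2009-01-12", "2009-01-19",
                   "2009-01-26", NA, NA))
)

# Inside [ ] a data.table evaluates column names directly,
# so neither dt$week nor the trailing comma is needed.
kept <- dt[!is.na(week)]
```

An existing data frame can be converted first with as.data.table(data), or in place with setDT(data).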