I have a data frame with several columns that contain dates:
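col1 <- seq(as.Date("2011-07-01"), by = 20, length.out = 10)
col2 <- seq(as.Date("2011-09-01"), by = 7, length.out = 10)
col3 <- seq(as.Date("2011-08-01"), by = 1, length.out = 10)
# Assigned to `dat`, the name used in the answer below
dat <- data.frame(col1, col2, col3)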
The data frame looks like this:
col1 col2 col3
1 2011-07-01 2011-09-01 2011-08-01
2 2011-07-21 2011-09-08 2011-08-02
3 2011-08-10 2011-09-15 2011-08-03
4 2011-08-30 2011-09-22 2011-08-04
5 2011-09-19 2011-09-29 2011-08-05
6 2011-10-09 2011-10-06 2011-08-06
7 2011-10-29 2011-10-13 2011-08-07
8 2011-11-18 2011-10-20 2011-08-08
9 2011-12-08 2011-10-27 2011-08-09
10 2011-12-28 2011-11-03 2011-08-10
I am trying to combine them into a single column so that:
A. Only the lowest (earliest) date is kept in each row, and the other dates are dropped
1 2011-07-01
2 2011-07-21
3 2011-08-03
4 2011-08-04
5 2011-08-05
6 2011-08-06
7 2011-08-07
8 2011-08-08
9 2011-08-09
10 2011-08-10
B. Only the highest (latest) date is kept in each row
1 2011-09-01
2 2011-09-08
3 2011-09-15
4 2011-09-22
5 2011-09-29
6 2011-10-09
7 2011-10-29
8 2011-11-18
9 2011-12-08
10 2011-12-28
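The real data set contains NAs. An NA should be ignored when it is encountered, unless every column is missing for a particular row, in which case the result for that row should be NA as well.
Any ideas?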
pmin and pmax help here:
do.call(pmin, dat)
# [1] "2011-07-01" "2011-07-21" "2011-08-03" "2011-08-04" "2011-08-05"
# [6] "2011-08-06" "2011-08-07" "2011-08-08" "2011-08-09" "2011-08-10"
do.call(pmax, dat)
# [1] "2011-09-01" "2011-09-08" "2011-09-15" "2011-09-22" "2011-09-29"
# [6] "2011-10-09" "2011-10-29" "2011-11-18" "2011-12-08" "2011-12-28"
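For a self-contained version of the calls above, here is a minimal sketch. It assumes the question's data frame is stored as `dat`; the names `earliest`, `latest`, and `result` are only illustrative. `do.call()` hands each column to `pmin`/`pmax` as a separate argument, so the comparison runs row by row.

```r
# Rebuild the example data from the question (assumed to be named `dat`)
dat <- data.frame(
  col1 = seq(as.Date("2011-07-01"), by = 20, length.out = 10),
  col2 = seq(as.Date("2011-09-01"), by = 7,  length.out = 10),
  col3 = seq(as.Date("2011-08-01"), by = 1,  length.out = 10)
)

# Compute both results before adding anything to `dat`,
# so the new columns are not fed back into pmin/pmax
earliest <- do.call(pmin, dat)  # same as pmin(dat$col1, dat$col2, dat$col3)
latest   <- do.call(pmax, dat)

# Collect the two answers side by side (illustrative name)
result <- data.frame(earliest, latest)
head(result, 3)
```

Both results keep the Date class, so they can be used directly as new columns.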
This also works with NA values, for example:
do.call(pmin, c(dat, na.rm=TRUE) )
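To see what `na.rm = TRUE` changes, here is a small sketch on made-up data. The data frame `dat_na` and its values are hypothetical and not from the question. The third row is missing in every column, which is the case the question asks about.

```r
# Hypothetical data with gaps; row 3 is NA in every column
dat_na <- data.frame(
  col1 = as.Date(c("2011-07-01", NA, NA)),
  col2 = as.Date(c("2011-09-01", "2011-09-08", NA)),
  col3 = as.Date(c(NA, "2011-08-02", NA))
)

# Default (na.rm = FALSE): a single NA in a row makes that row's result NA
strict <- do.call(pmin, dat_na)

# na.rm = TRUE: NAs are skipped; a row that is entirely NA still gives NA
earliest <- do.call(pmin, c(dat_na, na.rm = TRUE))
latest   <- do.call(pmax, c(dat_na, na.rm = TRUE))

data.frame(strict, earliest, latest)
```

Here `c(dat_na, na.rm = TRUE)` builds a plain list of the columns plus the extra named argument, which is what `do.call()` expects.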
You can also restrict the calculation to specific columns, for example:
do.call(pmin, c(dat[c("col1","col2","col3")], na.rm=TRUE) )
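If the real data frame also holds non-date columns, one possible variation is to pick the date columns by class instead of by name. This is only a sketch: `dat_mixed`, its `id` column, and the `is_date` helper are assumptions made for illustration.

```r
# Hypothetical data frame with an extra non-date column
dat_mixed <- data.frame(
  id   = 1:3,
  col1 = as.Date(c("2011-07-01", "2011-07-21", "2011-08-10")),
  col2 = as.Date(c("2011-09-01", "2011-09-08", "2011-09-15")),
  col3 = as.Date(c("2011-08-01", "2011-08-02", "2011-08-03"))
)

# Flag the columns that inherit from class "Date"
is_date <- vapply(dat_mixed, inherits, logical(1), what = "Date")

# Row-wise earliest date across the flagged columns only
dat_mixed$earliest <- do.call(pmin, c(dat_mixed[is_date], na.rm = TRUE))
dat_mixed
```

Selecting by name, as in the line above, is simpler when the column names are known and stable.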