Oma*_*les 10 r subset filter dplyr
I have this data (all of December):
date sessions
1 2014-12-01 1932
2 2014-12-02 1828
3 2014-12-03 2349
4 2014-12-04 8192
5 2014-12-05 3188
6 2014-12-06 3277
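I need to subset/filter it, for example from "2014-12-05" to "2014-12-25".
I know I can create a sequence with the ":" operator.
Example: b <- c(1:5)
But how do I filter by a sequence? I tried this: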
NewDate <- filter(Dates, date("2014-12-05":"2014-12-12"))
But it says:
Error: unexpected symbol in: "NewDate <- filter(Dates, date("2014-12-05":"2014-12-12")NewDate"
jaz*_*rro 19
If you want to use dplyr, you can try something like this.
library(dplyr)
mydf <- structure(list(date = structure(c(16405, 16406, 16407, 16408,
16409, 16410), class = "Date"), sessions = c(1932L, 1828L, 2349L,
8192L, 3188L, 3277L)), .Names = c("date", "sessions"), row.names = c("1",
"2", "3", "4", "5", "6"), class = "data.frame")
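# Make sure the column is of class Date (it already is here, so this changes nothing)
mydf$date <- as.Date(mydf$date)
# Inclusive range filter with between()
filter(mydf, between(date, as.Date("2014-12-02"), as.Date("2014-12-05")))
# Without between(), plain comparisons do the same job; the two calls below are equivalent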
filter(mydf, date >= "2014-12-02", date <= "2014-12-05")
filter(mydf, date >= "2014-12-02" & date <= "2014-12-05")
# date sessions
#1 2014-12-02 1828
#2 2014-12-03 2349
#3 2014-12-04 8192
#4 2014-12-05 3188
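The ":" operator from the question is meant for numbers, so it cannot span two date strings. A date sequence can instead be built with seq() on Date objects and matched with %in%. The following is a minimal base R sketch, assuming the same six rows as above; the names sessions_df, wanted and in_range are made up for illustration.

```r
# Rebuild the six sample rows (sessions_df is an illustrative name)
sessions_df <- data.frame(
  date = as.Date("2014-12-01") + 0:5,
  sessions = c(1932L, 1828L, 2349L, 8192L, 3188L, 3277L)
)

# seq() on Date objects is the date counterpart of 1:5
wanted <- seq(as.Date("2014-12-02"), as.Date("2014-12-05"), by = "day")

# Keep the rows whose date appears in the sequence
in_range <- sessions_df[sessions_df$date %in% wanted, ]
print(in_range)
```

For a contiguous range the comparison operators shown above are simpler. The sequence approach is mostly useful when the wanted dates are irregular, for example every other day with by = "2 days".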
jal*_*pic 17
You can use subset.
Generate the sample data:
temp <- read.table(text = "date sessions
2014-12-01 1932
2014-12-02 1828
2014-12-03 2349
2014-12-04 8192
2014-12-05 3188
2014-12-06 3277", header = TRUE)
Make sure the column is in Date format:
temp$date <- as.Date(temp$date, format= "%Y-%m-%d")
temp
# date sessions
# 1 2014-12-01 1932
# 2 2014-12-02 1828
# 3 2014-12-03 2349
# 4 2014-12-04 8192
# 5 2014-12-05 3188
# 6 2014-12-06 3277
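as.Date() returns NA for strings that do not match the given format, so it can be worth checking the result before filtering. A minimal sketch with made-up input values (raw, parsed and us_style are illustrative names):

```r
# Two ISO-style strings and one month/day/year string (made-up values)
raw <- c("2014-12-01", "2014-12-02", "12/03/2014")

parsed <- as.Date(raw, format = "%Y-%m-%d")
class(parsed)   # "Date"
anyNA(parsed)   # TRUE: the third value did not match the format

# A value in another layout needs its own format string
us_style <- as.Date("12/03/2014", format = "%m/%d/%Y")
format(us_style, "%Y-%m-%d")
```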
Using subset:
subset(temp, date > "2014-12-03" & date < "2014-12-05")
This gives:
# date sessions
# 4 2014-12-04 8192
You can also use []:
temp[temp$date > "2014-12-03" & temp$date < "2014-12-05", ]
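The subset() and [] calls above use strict comparisons (> and <), so both endpoints are dropped and only 2014-12-04 remains. If the endpoints should be kept, >= and <= are needed. One way to make that choice explicit is a small wrapper; the sketch below is only an illustration, and filter_date_range() with its inclusive argument is a hypothetical helper, not a function from base R or any package.

```r
# Hypothetical helper: keep rows of `data` whose `date` column lies between
# `from` and `to`; `inclusive` controls whether the endpoints are kept
filter_date_range <- function(data, from, to, inclusive = TRUE) {
  from <- as.Date(from)
  to <- as.Date(to)
  stopifnot(inherits(data$date, "Date"), from <= to)
  keep <- if (inclusive) {
    data$date >= from & data$date <= to
  } else {
    data$date > from & data$date < to
  }
  data[keep, , drop = FALSE]
}

# Same six rows as the sample data
temp <- data.frame(
  date = as.Date("2014-12-01") + 0:5,
  sessions = c(1932L, 1828L, 2349L, 8192L, 3188L, 3277L)
)

inclusive_rows <- filter_date_range(temp, "2014-12-03", "2014-12-05")
exclusive_rows <- filter_date_range(temp, "2014-12-03", "2014-12-05",
                                    inclusive = FALSE)
print(inclusive_rows)
print(exclusive_rows)
```

With inclusive = FALSE the helper reproduces the single-row result shown above.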
Another option is data.table:
library(data.table)
setDT(df)[date %between% c('2014-12-02', '2014-12-05')]
# date sessions
#1: 2014-12-02 1828
#2: 2014-12-03 2349
#3: 2014-12-04 8192
#4: 2014-12-05 3188
This should also work if "date" is a "character" column:
df$date <- as.character(df$date)
setDT(df)[date %between% c('2014-12-02', '2014-12-05')]
# date sessions
#1: 2014-12-02 1828
#2: 2014-12-03 2349
#3: 2014-12-04 8192
#4: 2014-12-05 3188
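A character column can work here because dates written as zero-padded YYYY-MM-DD strings sort as text in the same order as they do chronologically. The base R sketch below illustrates that property and shows how it breaks for a day-first format; all vectors are made up for illustration.

```r
iso <- c("2014-12-01", "2014-12-02", "2014-12-03", "2014-12-04",
         "2014-12-05", "2014-12-06")

# Plain string comparison selects the same rows as a Date comparison
by_string <- iso >= "2014-12-02" & iso <= "2014-12-05"
by_date <- as.Date(iso) >= as.Date("2014-12-02") &
  as.Date(iso) <= as.Date("2014-12-05")
identical(by_string, by_date)

# Day-first strings lack this property: as text, "01/01/2015" sorts
# before "31/12/2014" although it is the later date
dmy <- c("31/12/2014", "01/01/2015")
string_order <- dmy[1] < dmy[2]
date_order <- as.Date(dmy[1], format = "%d/%m/%Y") <
  as.Date(dmy[2], format = "%d/%m/%Y")
c(string_order = string_order, date_order = date_order)
```

Converting to Date first is therefore the safer habit whenever the text format is anything other than YYYY-MM-DD.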
If we want the subset to exclude the endpoints of the range:
setDT(df)[between(date, '2014-12-02', '2014-12-05', incbounds=FALSE)]
# date sessions
#1: 2014-12-03 2349
#2: 2014-12-04 8192
The df object used above was created with:
df <- structure(list(date = structure(c(16405, 16406, 16407, 16408,
16409, 16410), class = "Date"), sessions = c(1932L, 1828L, 2349L,
8192L, 3188L, 3277L)), .Names = c("date", "sessions"), row.names = c("1",
"2", "3", "4", "5", "6"), class = "data.frame")