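I am trying to count the unique "new" users in each month. A new user is one who has not appeared before (since the start of the data). I am also trying to count the unique users who did not appear in the previous month.
The raw data looks like this: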
library(dplyr)
date <- c("2010-01-10","2010-02-13","2010-03-22","2010-01-11","2010-02-14","2010-03-23","2010-01-12","2010-02-14","2010-03-24")
mth <- rep(c("2010-01","2010-02","2010-03"),3)
user <- c("123","129","145","123","129","180","180","184","145")
dt <- data.frame(date,mth,user)
dt <- dt %>% arrange(date)
dt
date mth user
1 2010-01-10 2010-01 123
2 2010-01-11 2010-01 123
3 2010-01-12 2010-01 180
4 2010-02-13 2010-02 129
5 2010-02-14 2010-02 129
6 2010-02-14 2010-02 184
7 2010-03-22 2010-03 145
8 2010-03-23 2010-03 180
9 2010-03-24 2010-03 145
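One possible way to derive the three counts described at the top from the data, rather than typing them by hand, is sketched here. It assumes the "YYYY-MM" strings sort chronologically and that "previous month" means the preceding month present in the data (true for this sample, where every month occurs). The helper names `mu`, `first_seen`, `monthly`, and `res` are illustrative only.

```r
library(dplyr)

# Same sample data as in the question
date <- c("2010-01-10","2010-02-13","2010-03-22","2010-01-11","2010-02-14",
          "2010-03-23","2010-01-12","2010-02-14","2010-03-24")
mth  <- rep(c("2010-01","2010-02","2010-03"), 3)
user <- c("123","129","145","123","129","180","180","184","145")
dt <- data.frame(date, mth, user, stringsAsFactors = FALSE) %>% arrange(date)

# One row per (month, user) pair
mu <- dt %>% distinct(mth, user)

# First month in which each user appears
first_seen <- mu %>%
  group_by(user) %>%
  summarise(first_mth = min(mth), .groups = "drop")

months  <- sort(unique(mu$mth))
monthly <- data.frame(mth = months, stringsAsFactors = FALSE)

# new: users whose first appearance falls in this month
monthly$new <- vapply(months, function(m) sum(first_seen$first_mth == m),
                      integer(1), USE.NAMES = FALSE)

# totNew: running total of new users
monthly$totNew <- cumsum(monthly$new)

# notLastMonth: users seen this month but not in the preceding month
# (assumption: the preceding entry in `months` is the previous calendar month)
monthly$notLastMonth <- vapply(seq_along(months), function(i) {
  cur  <- mu$user[mu$mth == months[i]]
  prev <- if (i > 1) mu$user[mu$mth == months[i - 1]] else character(0)
  length(setdiff(cur, prev))
}, integer(1))

# Attach the per-month counts to every row of the original data
res <- left_join(dt, monthly, by = "mth")
res
```

On this sample the joined columns agree with the hand-built `new`, `totNew`, and `notLastMonth` vectors in the question. If months can be missing from the data, the "previous month" step would need real date arithmetic instead of positional lookup.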
The answer should look like this:
new <- c(2,2,2,2,2,2,1,1,1)
totNew <- c(2,2,2,4,4,4,5,5,5)
notLastMonth <- c(2,2,2,2,2,2,2,2,2)
tmp <- cbind(dt,new,totNew,notLastMonth)
tmp
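date mth user new totNew notLastMonth …

I have a large data file in which all dates were loaded as character. I want to convert all of the date columns to Date format. Most of the dates are in "%y%m%d" format and some are in "%Y%m%d" format. There are 25 date columns, so changing each column individually is inefficient.
I could do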
df$DATE1 <- as.Date(df$DATE1, format ="%y%m%d")
df$DATE2 <- as.Date(df$DATE2, format ="%y%m%d")
and so on, but that is very poor coding.
I tried the following code, but it did not work. It assumes all dates are in "%y%m%d" format. Using grep("DATE", names(df)) picks out all of the date columns.
df[ , grep("DATE", names(df))] <- as.Date(df[ , grep("DATE", names(df))], "%y%m%d")
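A likely reason the attempt above fails is that as.Date() expects a vector, while selecting several columns hands it a whole data frame. A minimal sketch of a column-by-column conversion follows. The example data and the helper `parse_ymd` are hypothetical, and picking the format by string length (8 characters for "%Y%m%d", otherwise "%y%m%d") is an assumption about how the two formats can be told apart.

```r
# Hypothetical example data: DATE columns stored as character
df <- data.frame(ID    = 1:2,
                 DATE1 = c("100115", "111231"),      # "%y%m%d"
                 DATE2 = c("20100115", "20111231"),  # "%Y%m%d"
                 stringsAsFactors = FALSE)

# Hypothetical helper: choose the format from the string length
parse_ymd <- function(x) {
  x   <- as.character(x)
  out <- rep(as.Date(NA), length(x))
  is8 <- !is.na(x) & nchar(x) == 8
  out[is8]  <- as.Date(x[is8],  format = "%Y%m%d")
  out[!is8] <- as.Date(x[!is8], format = "%y%m%d")
  out
}

# Apply the helper to every column whose name contains "DATE"
date_cols <- grep("DATE", names(df))
df[date_cols] <- lapply(df[date_cols], parse_ymd)
str(df)
```

lapply() converts each column separately, and assigning the resulting list back to `df[date_cols]` replaces the columns in place.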

I want to rename the column that group_by creates in dplyr. The generated name, format(date2, "%Y-%m"), is not very helpful. I have tried several things. I would like the new name to be "yrMth".
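One approach that may help: group_by() accepts named expressions, so the grouping column can be named at the point where it is created. The sketch below uses the question's sample data. It parses the dates with base as.Date() as a stand-in for lubridate's mdy() so that only dplyr is needed, and `res` is an illustrative name.

```r
library(dplyr)

df <- data.frame(Person  = c(rep("abc", 3), rep("eee", 5)),
                 date    = c("4/1/2016", "4/3/2016", "4/12/2016", "5/3/2016",
                             "5/4/2016", "5/10/2016", "5/6/2016", "5/11/2016"),
                 account = c("123","123","123","222","222","333","222","333"),
                 stringsAsFactors = FALSE)

# Base-R stand-in for lubridate::mdy()
df$date2 <- as.Date(df$date, format = "%m/%d/%Y")

# Naming the expression inside group_by() sets the column name directly
res <- df %>%
  group_by(yrMth = format(date2, "%Y-%m"))

names(res)       # includes "yrMth" instead of the raw expression text
group_vars(res)  # the grouping variable is "yrMth"
```

An equivalent two-step version would add the column with mutate() first and then call group_by(yrMth).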
library(dplyr)
library(lubridate)  # provides mdy()
df <- data.frame(Person = c(rep("abc",3), rep("eee", 5)),
date = c("4/1/2016", "4/3/2016", "4/12/2016", "5/3/2016", "5/4/2016","5/10/2016","5/6/2016", "5/11/2016"),
account = c("123","123","123","222","222","333","222","333"), stringsAsFactors = FALSE)
df$date2 <- mdy(df$date)
df %>%
group_by(format(date2, "%Y-%m"))
Person date account date2 `format(date2, "%Y-%m")`
<chr> <chr> <chr> <date> <chr>
1 abc 4/1/2016 123 2016-04-01 2016-04
2 abc 4/3/2016 123 2016-04-03 2016-04
3 abc 4/12/2016 123 2016-04-12 2016-04
4 eee 5/3/2016 222 2016-05-03 2016-05
5 eee 5/4/2016 222 2016-05-04 2016-05
6 eee 5/10/2016 …

I want to create a new variable named "X" that is the sum of "B" and "D".
type <- c( "A", "B","C","D","E")
cnt <- c(2,5,3,7,8)
df <- data.frame(type,cnt)
> df
type cnt
1 A 2
2 B 5
3 C 3
4 D 7
5 E 8
The desired output is
> df
type cnt
1 A 2
2 B 5
3 C 3
4 D 7
5 E 8
6 X 12
How can this be extended if we add another grouping variable (for example, date)? I want to add X for each day.
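A minimal sketch of one way to get the desired output for the single-table case: summarise the "B" and "D" rows into a one-row data frame and append it with bind_rows(). The names `x_row` and `res` are illustrative.

```r
library(dplyr)

type <- c("A", "B", "C", "D", "E")
cnt  <- c(2, 5, 3, 7, 8)
df   <- data.frame(type, cnt, stringsAsFactors = FALSE)

# Collapse the B and D rows into a single row labelled "X"
x_row <- df %>%
  filter(type %in% c("B", "D")) %>%
  summarise(type = "X", cnt = sum(cnt))

# Append the new row to the original data
res <- bind_rows(df, x_row)
res
```

Here the appended row has cnt equal to 5 + 7 = 12, as in the desired output.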
date <- c("2022-01-01","2022-01-01","2022-01-01","2022-01-01","2022-01-01","2022-01-02","2022-01-02","2022-01-02","2022-01-02","2022-01-02")
type <- c("A", "B","C","D","E","A", "B","C","D","E")
cnt <- c(2,5,3,7,8, 1,9,8,2,5)
df <- data.frame(date,type,cnt)
df
date type cnt
1 2022-01-01 A 2
2 …
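For the per-day case, a similar sketch may work: group by date before summarising, so that one "X" row is produced for each day. The names `x_rows` and `res` are illustrative, and the final sort is only there to place each X row next to its own day.

```r
library(dplyr)

date <- c(rep("2022-01-01", 5), rep("2022-01-02", 5))
type <- rep(c("A", "B", "C", "D", "E"), 2)
cnt  <- c(2, 5, 3, 7, 8, 1, 9, 8, 2, 5)
df   <- data.frame(date, type, cnt, stringsAsFactors = FALSE)

# One "X" row per date, summing that day's B and D counts
x_rows <- df %>%
  filter(type %in% c("B", "D")) %>%
  group_by(date) %>%
  summarise(type = "X", cnt = sum(cnt), .groups = "drop")

# Append the X rows and sort so each one follows its day's other rows
res <- bind_rows(df, x_rows) %>%
  arrange(date, type)
res
```

Adding further grouping variables should only require listing them in group_by() and arrange().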