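I want to add a column to a data table that contains each value of y divided by the mean of y for the corresponding level of x (1 or 2), where that mean is taken only over rows with x2 = 1. For the data below, y should be divided by 1.4 where x = 1 and by 1 where x = 2.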
library(data.table)
library(dplyr)
dt1 <- data.table(x=c("1","1","1","1","1","1","1","1","1","1","2","2","2","2","2","2","2","2","2","2"),
                  x2=c("1","1","2","2","2","2","3","3","3","3","1","1","2","2","2","2","3","3","3","3"),
                  y=c(1.41,1.39,1.9,2.1,0.9,1.1,3.1,2.9,3.9,4.1,0.9,1.1,1.9,2.1,0.9,1.1,3.1,2.9,3.9,4.1))
I can write the mean of y for each x where x2 = 1 to a new object.
mean <- dt1 %>% filter(x2==1) %>% group_by(x) %>% summarise(mean(y))
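But I can't figure out how to make the mutate call pick up the right value, i.e. something like dt1 %>% mutate(z = y / <reference to 'mean'>).
I also tried creating a new column filled with the value I want to divide by, but again I can't figure out how to refer to the grouping criteria from inside the command.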
t <- dt1 %>% mutate(T=ifelse(x==1,(filter(x2==1) %>% group_by(x=1) %>%
summarise(mean(y))),ifelse(x==1,(filter(x2==2) %>% group_by(x=2) %>%
summarise(mean(y))),NA)
I don't use dplyr exclusively, but I have been using it a lot lately. I'm open to whatever solution is simplest.
akr*_*run 19
Try
left_join(dt1,
dt1 %>%
filter(x2==1) %>%
group_by(x) %>%
summarise(a=mean(y)), by='x') %>%
mutate(z=y/a)%>%
head()
# x x2 y a z
#1 1 1 1.41 1.4 1.0071429
#2 1 1 1.39 1.4 0.9928571
#3 1 2 1.90 1.4 1.3571429
#4 1 2 2.10 1.4 1.5000000
#5 1 2 0.90 1.4 0.6428571
#6 1 2 1.10 1.4 0.7857143
Or using data.table
library(data.table)
dt2 <- dt1[x2==1,list(a=mean(y)) , by=x]
setkey(dt1, x)
res <- dt1[dt2][,z:=y/a]
head(res)
# x x2 y a z
#1: 1 1 1.41 1.4 1.0071429
#2: 1 1 1.39 1.4 0.9928571
#3: 1 2 1.90 1.4 1.3571429
#4: 1 2 2.10 1.4 1.5000000
#5: 1 2 0.90 1.4 0.6428571
#6: 1 2 1.10 1.4 0.7857143
A more compact dplyr option, suggested by @aosmith, is
dt1 %>%
group_by(x) %>%
mutate(a=mean(y[x2==1]), z=y/a)
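The same grouped idea can probably be written in data.table without a join. The sketch below is not part of the original answer. It rebuilds the question's data with rep() for brevity and works on a copy named res2 (a name chosen here for illustration) so that dt1 is not modified by reference.

```r
library(data.table)

# Same data as in the question, built with rep() to keep it short
dt1 <- data.table(
  x  = rep(c("1", "2"), each = 10),
  x2 = rep(c("1", "1", "2", "2", "2", "2", "3", "3", "3", "3"), times = 2),
  y  = c(1.41, 1.39, 1.9, 2.1, 0.9, 1.1, 3.1, 2.9, 3.9, 4.1,
         0.9, 1.1, 1.9, 2.1, 0.9, 1.1, 3.1, 2.9, 3.9, 4.1)
)

# Work on a copy so that dt1 itself stays unchanged
res2 <- copy(dt1)

# Within each x group, take the mean of y over the rows where x2 is "1"
res2[, a := mean(y[x2 == "1"]), by = x]

# Divide every y by its group's reference mean
res2[, z := y / a]

head(res2)
```

Note that if some level of x has no rows with x2 equal to "1", mean() is applied to an empty vector and returns NaN, so a and z would be NaN for that group.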