nac*_*cab 10 r dataframe dplyr
I have this data frame:
x <- data.frame(
  name = rep(letters[1:4], each = 2),
  condition = rep(c("A", "B"), times = 4),
  value = c(2, 10, 4, 20, 8, 40, 20, 100)
)
# name condition value
# 1 a A 2
# 2 a B 10
# 3 b A 4
# 4 b B 20
# 5 c A 8
# 6 c B 40
# 7 d A 20
# 8 d B 100
I want to group by name and divide the value where condition == "B" by the value where condition == "A", to get this:
data.frame(
name = letters[1:4],
value = c(5,5,5,5)
)
# name value
# 1 a 5
# 2 b 5
# 3 c 5
# 4 d 5
I know that something like this gets me very close:
x$value[which(x$condition == "B")]/x$value[which(x$condition == "A")]
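But I was wondering whether there is a simple way to do this with dplyr (my data frame is a toy example; I arrived at it by chaining several group_by and summarise calls).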
Ste*_*pré 11
Try:
library(dplyr)

x %>%
  group_by(name) %>%
  summarise(value = value[condition == "B"] / value[condition == "A"])
This gives:
#Source: local data frame [4 x 2]
#
# name value
# (fctr) (dbl)
#1 a 5
#2 b 5
#3 c 5
#4 d 5
I would use spread from tidyr.
library(dplyr)
library(tidyr)

x %>%
  spread(condition, value) %>%
  mutate(value = B / A)
#   name  A   B value
# 1    a  2  10     5
# 2    b  4  20     5
# 3    c  8  40     5
# 4    d 20 100     5
You can then use select(-A, -B) to drop the extra columns.
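For readers on tidyr 1.0.0 or later, where spread() is superseded by pivot_wider(), the same reshape-then-divide idea can be sketched as follows. This is only a minimal sketch: it assumes every name has exactly one "A" row and one "B" row, and the variable name result is chosen here for illustration.

```r
library(dplyr)
library(tidyr)

# Toy data from the question
x <- data.frame(
  name = rep(letters[1:4], each = 2),
  condition = rep(c("A", "B"), times = 4),
  value = c(2, 10, 4, 20, 8, 40, 20, 100)
)

result <- x %>%
  # one column per condition: A and B
  pivot_wider(names_from = condition, values_from = value) %>%
  # ratio of the B value to the A value for each name
  mutate(value = B / A) %>%
  # drop the helper columns, keeping name and value
  select(-A, -B)

print(result)
```

If a name were missing one of the two conditions, pivot_wider() would fill the gap with NA, so the ratio for that name would come out as NA and the problem would be visible in the result.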