Don*_*ong 14 r data-manipulation dplyr tidyr
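My data are ordered observations, and I want to preserve that order as much as possible while manipulating them.
Take the answer to this question as an example: I put "B" before "A" in the data frame. The resulting wide data is sorted by the "name" column, i.e. "A" first, then "B".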
library(dplyr)
library(tidyr)

df = data.frame(name=c("B","B","A","A"),
                group=c("g1","g2","g1","g2"),
                V1=c(10,40,20,30),
                V2=c(6,3,1,7))
# long format, combine Var and group into one key, then back to wide
gather(df, Var, Val, V1:V2) %>%
  unite(VarG, Var, group) %>%
  spread(VarG, Val)
  name V1_g1 V1_g2 V2_g1 V2_g2
1    A    20    30     1     7
2    B    10    40     6     3
Is there a way to keep the original order? Like this:
  name V1_g1 V1_g2 V2_g1 V2_g2
1    B    10    40     6     3
2    A    20    30     1     7
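Edit 04/02: I just found that dplyr::summarise sorts its output as well. arrange(match(name, df$name)) can still restore the order, but I wonder whether this extra sorting is intended by the design of the packages?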
df %>%
  group_by(name) %>%
  summarise(n())
  name n()
1    A   2
2    B   2
ber*_*ant 11
You can order the result by name according to the order in the original data frame:
gather(df, Var, Val, V1:V2) %>%
  unite(VarG, Var, group) %>%
  spread(VarG, Val) %>%
  # sort key: position of each name's first occurrence in the original df
  arrange(match(name, df$name))
#   name V1_g1 V1_g2 V2_g1 V2_g2
# 1    B    10    40     6     3
# 2    A    20    30     1     7
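To see why match() works as a sort key here, the following base R sketch may help. It uses a hand-built stand-in for the spread() result (the `wide`, `orig_names`, `pos`, and `restored` names are hypothetical, chosen only for illustration) and shows that match() returns the position of each name's first occurrence in the original column, so sorting by those positions should bring back the original order.

```r
# Original order of the name column, as in the question
orig_names <- c("B", "B", "A", "A")

# Hypothetical stand-in for the alphabetically sorted wide result
wide <- data.frame(name  = c("A", "B"),
                   V1_g1 = c(20, 10),
                   V1_g2 = c(30, 40),
                   stringsAsFactors = FALSE)

# Position of the first occurrence of each name in the original data
pos <- match(wide$name, orig_names)

# Reorder the rows by those positions and reset the row names
restored <- wide[order(pos), ]
rownames(restored) <- NULL
print(restored)
```

In base R the positions are passed through order() to get row indices, whereas dplyr::arrange() takes the sort key itself, so there the match() result can be used directly.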
The order is taken from the order of the factor levels.
str(df)
'data.frame': 4 obs. of 4 variables:
$ name : Factor w/ 2 levels "A","B": 2 2 1 1
$ group: Factor w/ 2 levels "g1","g2": 1 2 1 2
$ V1 : num 10 40 20 30
$ V2 : num 6 3 1 7
Note that the levels are "A","B".
So if you set the order of the levels to the order in which the values appear, it will work:
df = data.frame(name=c("B","B","A","A"),
                group=c("g1","g2","g1","g2"),
                V1=c(10,40,20,30),
                V2=c(6,3,1,7))
df %>%
  mutate(name = factor(name,levels=unique(name))) %>%
  mutate(group = factor(group,levels=unique(group))) %>%
  gather(Var, Val, V1:V2) %>%
  unite(VarG, Var, group) %>%
  spread(VarG, Val)
The result is:
  name V1_g1 V1_g2 V2_g1 V2_g2
1    B    10    40     6     3
2    A    20    30     1     7
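The effect of levels=unique(name) can be checked in isolation with base R. This is only a small sketch (the variable names `x`, `default_levels`, `appearance_levels`, `sorted_default`, and `sorted_appearance` are made up for illustration): by default factor() sorts the levels, while passing unique(x) keeps them in order of first appearance, and sorting then follows the level order. Since R 4.0.0, data.frame() keeps strings as character by default, so the str(df) output above would show chr instead of Factor there; the explicit factor() call sets the levels either way.

```r
x <- c("B", "B", "A", "A")

# Default: levels are the sorted unique values
default_levels <- levels(factor(x))

# Levels in order of first appearance
appearance_levels <- levels(factor(x, levels = unique(x)))

# Sorting a factor follows its level order, not the alphabet
sorted_default    <- as.character(sort(factor(x)))
sorted_appearance <- as.character(sort(factor(x, levels = unique(x))))

print(default_levels)
print(appearance_levels)
```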