Kon*_*rad 3 sorting r dataframe dplyr
I am using the code below to generate a simple summary table:
# Data
data("mtcars")
# Lib
require(dplyr)
# Summary
mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1"))
The code produces the expected result:
> head(mt_sum)
Source: local data frame [2 x 10]
am mpg_min cyl_min mpg_mean cyl_mean mpg_median cyl_median mpg_max cyl_max Freq
(chr) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (dbl) (int)
1 0 10.4 4 17.14737 6.947368 17.3 8 24.4 8 19
2 1 15.0 4 24.39231 5.076923 22.8 4 33.9 8 13
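However, I am not happy with the way the columns are ordered. In particular, I would like to:
order the columns by name
achieve this via select() in dplyr
The desired order looks like this: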
> names(mt_sum)[order(names(mt_sum))]
[1] "am" "cyl_max" "cyl_mean" "cyl_median" "cyl_min" "Freq" "mpg_max"
[8] "mpg_mean" "mpg_median" "mpg_min"
Ideally, I would like to use names(mt_sum)[order(names(mt_sum))] to order the columns inside select(). But the code:
mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1")) %>%
  select(names(.)[order(names(.))])
returns the expected error:
Error: All select() inputs must resolve to integer column positions.
The following do not:
*  names(.)[order(names(.))]
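In my real data I am generating a large number of summary columns. Hence my question: how can I dynamically pass the sorted column names to select() in dplyr so that it understands them and applies them to the data.frame at hand?
My focus is on finding a way to pass dynamically generated column names to select(). I know that I can sort the columns in base R or by typing out the names, as discussed here.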
You are definitely on the right track.
mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1")) %>%
  .[, names(.)[order(names(.))]]
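The base subsetting above takes a character vector of names. As a minimal sketch of handing such a vector to select() itself, the following assumes dplyr 1.0.0 or later, where all_of() is available; it is not part of the dplyr version used in the question. The data frame df and the variable wanted are hypothetical stand-ins for mt_sum and its sorted names.

```r
library(dplyr)

# Hypothetical toy data frame standing in for mt_sum
df <- data.frame(mpg_min = 1, cyl_min = 2, am = "0")

# Column names computed at run time
wanted <- sort(names(df))

# Base R subsetting with a character vector, as in the pipeline above
base_sorted <- df[, wanted]

# all_of() tells select() to treat `wanted` as a vector of column names
# (assumes dplyr >= 1.0.0)
dplyr_sorted <- df %>% select(all_of(wanted))
```

Both results should have their columns in the same, alphabetical order; all_of() raises an error if any name in the vector is missing from the data frame.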
All you need is:
mt_sum %>% select(order(names(.)))
#Source: local data frame [2 x 10]
#
# am cyl_max cyl_mean cyl_median cyl_min Freq mpg_max mpg_mean mpg_median mpg_min
# (chr) (dbl) (dbl) (dbl) (dbl) (int) (dbl) (dbl) (dbl) (dbl)
#1 0 8 6.947368 8 4 19 24.4 17.14737 17.3 10.4
#2 1 8 5.076923 4 4 13 33.9 24.39231 22.8 15.0
It works because order() returns integer column positions, which is what select() requires.
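To make the point about positions concrete, here is a small sketch on a hypothetical three-column data frame (df is not from the question). It shows what order() returns for the column names and that select() accepts those positions.

```r
library(dplyr)

# Hypothetical toy data frame with columns deliberately out of order
df <- data.frame(b = 1:2, c = 3:4, a = 5:6)

# order() gives the positions that would sort the names:
# "a" is column 3, "b" is column 1, "c" is column 2
pos <- order(names(df))
print(pos)  # 3 1 2

# select() receives integer positions, so the columns come back sorted
sorted_df <- df %>% select(order(names(.)))
print(names(sorted_df))  # "a" "b" "c"
```

Note that order() on mixed-case names such as "Freq" depends on the session's collation locale, so the exact position of capitalised columns may differ between machines.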