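I'm new to R and am trying to understand how the %>% operator and the "." (dot) placeholder are used. As a simple example, the following code works: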
library(magrittr)
library(ensurer)
ensure_data.frame <- ensures_that(is.data.frame(.))
data.frame(x = 5) %>% ensure_data.frame
However, the following code fails:
ensure_data.frame <- ensures_that(. %>% is.data.frame)
data.frame(x = 5) %>% ensure_data.frame
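Here I am piping the placeholder into the is.data.frame function.
I suspect my understanding of the limitations/interpretation of the dot placeholder is lacking, but can someone clarify?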
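One thing that can be checked directly in magrittr, independent of ensurer: a pipeline that begins with the dot placeholder is not evaluated on the spot but returns a function. The sketch below only demonstrates that behaviour (the name check_df is made up for illustration). How ensures_that treats such a function is not confirmed here, so take this as a possible lead rather than a diagnosis.

```r
library(magrittr)

# A pipeline that starts with `.` is not evaluated immediately;
# magrittr builds a function (a "functional sequence") instead.
# `check_df` is a hypothetical name used only for this illustration.
check_df <- . %>% is.data.frame

# The pipeline itself is a function object, not a logical value.
is.function(check_df)

# Calling it applies is.data.frame to the argument.
check_df(data.frame(x = 5))
```

So is.data.frame(.) is an expression that evaluates to TRUE or FALSE once the dot is bound, while . %>% is.data.frame is a function that still has to be called.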
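While trying to compute a grouped lagged variable (which is not possible using lag alone), the suggested solution was to pull the data out, lag the distinct rows, and then join the result back.
I would prefer to do this without creating an intermediate object, and I would like to do it in the middle of a chain. However, it doesn't seem to work the way I expect, and the problem appears to be some interaction between the . and the use of a nested chain inside left_join.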
require(tidyverse)
#> Loading required package: tidyverse
df <- data.frame(Team = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "D", "D"),
Date = c("2016-05-10","2016-05-10", "2016-05-10", "2016-05-10",
"2016-05-12", "2016-05-12", "2016-05-12",
"2016-05-15","2016-05-15",
"2016-05-30", "2016-05-30"),
Points = c(1,4,3,2,1,5,6,1,2,3,9)
)
#This works:
df %>% left_join(x = ., y = df %>%
distinct(Team, Date) %>%
mutate(Date_Lagged = lag(Date)))
#> Joining, by = c("Team", "Date")
#> Team Date Points Date_Lagged
#> 1 A 2016-05-10 1 <NA>
#> 2 A 2016-05-10 …
When I group my data by certain attributes, I want to add a "grand total" row to provide a baseline for comparison. Let's group mtcars by cylinders and carburetors, for example:
by_cyl_carb <- mtcars %>%
group_by(cyl, carb) %>%
summarize(median_mpg = median(mpg),
avg_mpg = mean(mpg),
count = n())
...which produces these results:
> by_cyl_carb
# A tibble: 9 x 5
# Groups: cyl [?]
cyl carb median_mpg avg_mpg count
<dbl> <dbl> <dbl> <dbl> <int>
1 4 1 27.3 27.6 5
2 4 2 25.2 25.9 6
3 6 1 19.8 19.8 2
4 6 4 20.1 19.8 4
5 6 6 19.7 19.7 1
6 8 2 17.1 17.2 4
7 8 …
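For what it's worth, one possible way to get such a baseline row is to summarize the ungrouped data with the same expressions and append it with bind_rows. This is only a sketch under a few assumptions: grand_total and with_total are names invented here, the .groups argument needs dplyr 1.0 or later, and leaving cyl and carb as NA in the total row is just one choice of labelling.

```r
library(dplyr)

# Per-group summary, as in the question (.groups = "drop" requires dplyr >= 1.0).
by_cyl_carb <- mtcars %>%
  group_by(cyl, carb) %>%
  summarize(median_mpg = median(mpg),
            avg_mpg = mean(mpg),
            count = n(),
            .groups = "drop")

# Same statistics over the whole data set; `grand_total` is a hypothetical name.
grand_total <- mtcars %>%
  summarize(median_mpg = median(mpg),
            avg_mpg = mean(mpg),
            count = n())

# bind_rows fills the missing cyl/carb columns of the total row with NA.
with_total <- bind_rows(by_cyl_carb, grand_total)
with_total
```

Because the total is computed from mtcars directly rather than from the group summaries, the median in the last row is the true overall median, not a median of group medians.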