Printing intermediate results in the tidyverse without breaking the pipeline

Rai*_*lde 7 r dplyr tidyverse

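Is there a command that can be added to a tidyverse pipeline that does not break the flow but produces a side effect, such as printing something? The use case I have in mind is this. Given a pipeline like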

data %>%
  mutate(new_var = <some time consuming operation>) %>%
  mutate(new_var2 = <some other time consuming operation>) %>%
  ...

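I would like to add a command to the pipeline that does not modify the final result but prints some progress or status information. Maybe something like this: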

data %>%
  mutate(new_var = <some time consuming operation>) %>%
  command_x(print("first operation done")) %>%
  mutate(new_var2 = <some other time consuming operation>) %>%
  ...

Does such a command_x already exist?

MrF*_*ick 9

You can easily write your own function:

pass_through <- function(data, fun) {fun(data); data}

and use it like this:

mtcars %>% pass_through(. %>% ncol %>% print) %>% nrow

Here we use the . %>% syntax to create an anonymous function. You can also write the function out explicitly:

mtcars %>% pass_through(function(x) print(ncol(x))) %>% nrow
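
As a rough sketch of how this could look for the progress-reporting case in the question: the columns kpl and wt_kg, their conversion factors, and the message text are made up for illustration, and base R's transform() stands in for the slow mutate() steps.

```r
library(magrittr)  # provides %>%

# Same helper as above: call `fun` for its side effect, then return `data` unchanged.
pass_through <- function(data, fun) {
  fun(data)
  data
}

result <- mtcars %>%
  transform(kpl = mpg * 0.425) %>%    # placeholder for a slow step
  pass_through(function(x) message("first operation done: ", ncol(x), " columns")) %>%
  transform(wt_kg = wt * 453.6) %>%   # placeholder for another slow step
  pass_through(function(x) message("second operation done: ", ncol(x), " columns"))
```

Because pass_through() discards whatever fun returns, result should be the same data frame you would get with the two pass_through() lines removed.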


Jon*_*løv 5

For the specific case of printing an intermediate step in the pipeline, just use %>% print() %>%. E.g.,

mtcars %>%
  filter(cyl == 4) %>%
  print() %>%
  summarise(mpg = mean(mpg))
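
The reason print() can sit in the middle of a pipeline is that, for data frames at least, it returns its argument invisibly, so the next step receives the same object. A small sketch to check this, using base R's subset() and with() in place of the dplyr verbs so that only magrittr is needed (the variable names are illustrative):

```r
library(magrittr)  # provides %>%

four_cyl <- subset(mtcars, cyl == 4)

# print() shows the data frame and hands the same object back invisibly.
returned <- print(four_cyl)
identical(returned, four_cyl)

# So the pipeline continues as if print() were not there.
avg_mpg <- four_cyl %>%
  print() %>%
  with(mean(mpg))
```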

For a simple status message, you'd do:

pipe_message = function(.data, status) {message(status); .data}
mtcars %>%
  filter(cyl == 4) %>%
  pipe_message("first operation done") %>%
  select(cyl)

See the answer by @MrFlick for a more general solution for non-print functions.


Git*_*er0 5

You can do this on the fly with an anonymous function:

mtcars %>% ( function(x){print(x); return(x)} ) %>% nrow()
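
A related option that none of the answers above mention is the tee operator %T>% exported by magrittr: it evaluates the right-hand side for its side effect and passes the left-hand side on unchanged, so the anonymous function no longer needs to return x. A minimal sketch, assuming magrittr is attached explicitly (library(dplyr) alone re-exports %>% but not %T>%):

```r
library(magrittr)  # provides %>% and %T>%

n <- mtcars %T>%
  print() %>%                                             # side effect only
  nrow()

# The same idea with an anonymous function used only for a status message.
n2 <- mtcars %T>%
  (function(x) message("rows so far: ", nrow(x))) %>%
  nrow()
```

Note that %T>% applies only to the single step immediately to its right; the following %>% continues with the original value.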