Writing a table in a database with dplyr

jen*_*irf 8 postgresql r dplyr

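Is there a way to have dplyr, hooked up to a database, pipe data into a new table within that same database without ever downloading the data locally?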

I'd like to do something along the lines of:

tbl(con, "mytable") %>%
   group_by(dt) %>%
   tally() %>%
   write_to(name = "mytable_2", schema = "transformed")

Jth*_*rpe 7

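While I wholeheartedly agree with the suggestion to learn SQL, you can take advantage of the fact that dplyr doesn't pull data until it absolutely has to: build the query using dplyr, add the TO TABLE clause, and then run the SQL statement using dplyr::do(), as in: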

# CREATE A DATABASE WITH A 'FLIGHTS' TABLE
library(RSQLite)
library(dplyr)
library(nycflights13)
my_db <- src_sqlite("~/my_db.sqlite3", create = T)
flights_sqlite <- copy_to(my_db, flights, temporary = FALSE, indexes = list(
  c("year", "month", "day"), "carrier", "tailnum"))

# BUILD A QUERY
QUERY = filter(flights_sqlite, year == 2013, month == 1, day == 1) %>%
    select( year, month, day, carrier, dep_delay, air_time, distance) %>%
    mutate( speed = distance / air_time * 60) %>%
    arrange( year, month, day, carrier)

# ADD THE "TO TABLE" CLAUSE AND EXECUTE THE QUERY 
do(paste(unclass(QUERY$query$sql), "TO TABLE foo"))

You could even write a little function to do this:

to_table  <- function(qry,tbl)
    dplyr::do(paste(unclass(qry$query$sql), "TO TABLE",tbl))

and pass the query to that function, like so:

filter(flights_sqlite, year == 2013, month == 1, day == 1) %>%
    select( year, month, day, carrier, dep_delay, air_time, distance) %>%
    mutate( speed = distance / air_time * 60) %>%
    arrange( year, month, day, carrier) %>%
    to_table('foo')
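The snippets above reach into dplyr internals (`qry$query$sql`) from an older release. As a rough sketch of the same idea with current packages, the SQL for a lazy query can be rendered with `dbplyr::sql_render()` and executed on the server with `DBI::dbExecute()`. Note that this sketch swaps the "TO TABLE" clause for a `CREATE TABLE ... AS` statement, and that the in-memory SQLite connection and the toy `mytable` data are assumptions made only so the example runs on its own.

```r
library(DBI)
library(dplyr)
library(dbplyr)

# Assumption: an in-memory SQLite database with a toy 'mytable' for illustration
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "mytable", data.frame(dt = c("a", "a", "b"), x = 1:3))

# Build the query lazily; nothing is pulled into R at this point
qry <- tbl(con, "mytable") %>%
  group_by(dt) %>%
  tally()

# Render the SQL that dplyr generated and wrap it in CREATE TABLE ... AS
sql_text <- as.character(dbplyr::sql_render(qry))
DBI::dbExecute(con, paste("CREATE TABLE mytable_2 AS", sql_text))

# Read the new table back only to check the result
result <- DBI::dbReadTable(con, "mytable_2")
DBI::dbDisconnect(con)
```

The query itself runs entirely inside the database; only the final check reads rows back into R.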

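  • This is basically what `compute()` does (10 upvotes)
  • This no longer works: `do(paste(unclass(QUERY$query$sql), "TO TABLE foo"))` fails with `Error in UseMethod("do_"): no applicable method for 'do_' applied to an object of class "character"` (2 upvotes)
  • @chrowe I was going to say read the documentation, but I see that I forgot to document those functions; I've just added them to the dev version: https://dbplyr.tidyverse.org/dev/reference/collapse.tbl_sql.html (2 upvotes)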
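Following up on the `compute()` comment above, here is a minimal sketch of how the pipeline from the question might look with it. It assumes a recent dbplyr, and uses an in-memory SQLite connection with a toy `mytable` purely for illustration; `temporary = FALSE` asks for a permanent table rather than a temporary one.

```r
library(DBI)
library(dplyr)
library(dbplyr)

# Assumption: an in-memory SQLite database with a toy 'mytable' for illustration
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
DBI::dbWriteTable(con, "mytable", data.frame(dt = c("a", "a", "b"), x = 1:3))

# compute() runs the query in the database and stores the result as a table there
summary_tbl <- tbl(con, "mytable") %>%
  group_by(dt) %>%
  tally() %>%
  compute(name = "mytable_2", temporary = FALSE)

# The new table now lives in the database alongside the original
tables_after <- DBI::dbListTables(con)
DBI::dbDisconnect(con)
```

On a backend with schemas such as PostgreSQL, passing something like `dbplyr::in_schema("transformed", "mytable_2")` as `name` may place the table in the `transformed` schema from the question, though that is an assumption worth checking against the installed dbplyr version.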