Posts by lar*_*y77

R backports and invalid signature

I run Debian stable (bullseye) and use the official R backports. See

https://cloud.r-project.org/bin/linux/debian/

I added one line to my repositories

$ cat /etc/apt/sources.list | grep r-project
deb http://cloud.r-project.org/bin/linux/debian bullseye-cran40/

Everything worked fine until today. Now, when I run sudo apt update, I get an error about the signature of the R repository:

W: An error occurred during the signature verification. The repository is not updated and the previous index files will be used. GPG error: http://cloud.r-project.org/bin/linux/debian bullseye-cran40/ InRelease: The following signatures were invalid: EXPKEYSIG FCAE2A0E115C3D8A Johannes Ranke (Wissenschaftlicher Berater) <johannes.ranke@jrwb.de>
W: Failed to fetch http://cloud.r-project.org/bin/linux/debian/bullseye-cran40/InRelease  The following signatures were invalid: EXPKEYSIG FCAE2A0E115C3D8A Johannes Ranke (Wissenschaftlicher …

linux debian r signature

lar*_*y77

lucky-day

11
votes

1
answer

1502
views

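Divide selected columns by a vector in dplyr

This must be simple in base R, but it is driving me crazy with dplyr (which, overall, has made my life better!). Suppose you have the following tibble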

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union



df1 <- tibble(x=seq(5)*19, a1=seq(5)*1, a2=seq(5)*2, a3=seq(5)*4)

df1
#> # A tibble: 5 x 4
#>       x    a1    a2    a3
#>   <dbl> <dbl> <dbl> <dbl>
#> 1    19     1     2     4
#> 2    38     2     4     8
#> 3    57     3     6    12
#> …

r dplyr

lar*_*y77

2020 06-11

8
votes

1
answer

262
views

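dplyr across + mutate + condition to select columns

I am sure the solution is a one-liner, but I am banging my head against the wall.
Please see the very short reprex at the end of the post; how do I tell dplyr that I only want to double the columns that contain no NA?

Many thanks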

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union


df <- tibble(x=1:10, y=101:110,
             w=c(6,NA,4,NA, 5,0,NA,4,8,17 ),
             z=c(2,3,4,NA, 5,10,22,34,58,7 ),
             k=rep("A",10))


df
#> # A tibble: 10 x 5
#>        x     y     w     z k    
#>    <int> <int> <dbl> <dbl> <chr>
#>  1     1   101     6     2 A    
#>  2     2   102    NA     3 A    
#>  3     3   103 …
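One possible way to express "double only the columns without NA" is sketched below. It assumes dplyr >= 1.0.0, that "double" means multiplying by 2, and that only numeric columns should be touched (so the character column k is left alone). The name `doubled` is introduced here for illustration only.

```r
library(dplyr)

df <- tibble(x = 1:10, y = 101:110,
             w = c(6, NA, 4, NA, 5, 0, NA, 4, 8, 17),
             z = c(2, 3, 4, NA, 5, 10, 22, 34, 58, 7),
             k = rep("A", 10))

# where() keeps the columns for which the predicate returns TRUE:
# here, numeric columns that contain no missing value.
doubled <- df %>%
  mutate(across(where(~ is.numeric(.x) && !anyNA(.x)), ~ .x * 2))

doubled
```

With this data only x and y satisfy the predicate, so w, z and k come back unchanged.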

r dplyr across

lar*_*y77

lucky-day

5
votes

2
answers

2928
views

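R plotly: customizing hover (info and text)

I want to customize what I see in a plotly plot when I hover over a bar.

Please see the reprex at the end of the post. I had a look at

How to set different text and hoverinfo text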

https://community.rstudio.com/t/changing-hovertext-in-plotly/71736

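but I must be making some mistake. When I hover over a bar, I want to see only the variables "macro_sector" and "amount", with no static text on the bars at all. How can I achieve this? Many thanks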

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(plotly)
#> Loading required package: ggplot2
#> 
#> Attaching package: 'plotly'
#> The following object is masked from 'package:ggplot2':
#> 
#>     last_plot
#> The following object is masked from 'package:stats':
#> 
#>     filter
#> The following object is masked from 'package:graphics':
#> …

r plotly r-plotly

lar*_*y77

lucky-day

5
votes

1
answer

3597
views

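Unwanted horizontal lines in map projections in R

A few lines of code expose my problem. Whenever I work with a world map and introduce a projection, I always end up with some odd-looking horizontal lines. Please have a look at https://www.rdocumentation.org/packages/ggplot2/versions/1.0.0/topics/coord_map

I use New Zealand as an example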

library(ggplot2)

nz <- map_data("nz")
# Prepare a map of NZ
nzmap <- ggplot(nz, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "white", colour = "black")

# Plot it in cartesian coordinates
nzmap
# With correct mercator projection
nzmap + coord_map()

This works fine. Now let's do the same for the world

world <- map_data("world")
# Prepare a map of the world
worldmap <- ggplot(world, aes(x = long, y = lat, group = group)) +
  geom_polygon(fill = "white", colour = "black")

# Plot it …

r ggplot2

lar*_*y77

2018 06-08

4
votes

1
answer

1889
views

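dplyr, non-standard evaluation, the walrus operator and curly-curly

A genuine question. Whenever I need to write a dplyr function, I play it by ear. I know that the curly-curly operator simplifies many tasks.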

https://www.tidyverse.org/blog/2019/06/rlang-0-4-0/

and

https://www.tidyverse.org/blog/2020/02/glue-strings-and-tidy-eval/

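It is not clear to me when to use a plain "=" and when to use the walrus operator ":=". For instance, consider the snippet at the end of the post. The functions mean_by and mean_by2 differ only in that the former relies on "=" and the latter on ":=", yet the result is the same. However, if I try to write a function that relies on mutate to add a new column, I get an error message if I use "=" instead of ":=" when creating the new column. Can someone clarify for me why there is a difference? Does this mean that it is safer to use the walrus operator rather than "="?

Thanks!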

library(tidyverse)


mean_by <- function(data, by, var) {
  data %>%
    group_by({{ by }}) %>%
    summarise(avg = mean({{ var }}, na.rm = TRUE))
}



mean_by2 <- function(data, by, var) {
  data %>%
    group_by({{ by }}) %>%
    summarise(avg := mean({{ var }}, na.rm = TRUE))
}



add_new_col <- function(data, old_col, new_col){

    data %>%
        mutate({{new_col}}:={{old_col}})


}


iris %>% mean_by(Species, Sepal.Width)
#> # A tibble: 3 x 2
#> …
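A note on why the two operators behave differently: R's parser only accepts a literal name or string on the left of `=` inside a function call, so `mutate({{ new_col }} = ...)` is rejected as a syntax error before dplyr ever sees it. `:=` is parsed as an ordinary operator, which lets rlang process `{{ }}` or a glue-style string on its left-hand side. With a fixed name such as `avg`, both forms behave the same. The sketch below assumes rlang >= 0.4.3 for the glue-style name; `add_prefixed_col`, `res1` and `res2` are hypothetical names that do not appear in the original post.

```r
library(dplyr)

# The new column name is supplied by the caller, so := is required.
add_new_col <- function(data, old_col, new_col) {
  data %>% mutate({{ new_col }} := {{ old_col }})
}

# Hypothetical variant: build the new name from the old one with a glue string.
add_prefixed_col <- function(data, old_col) {
  data %>% mutate("{{ old_col }}_copy" := {{ old_col }})
}

res1 <- add_new_col(iris, Sepal.Width, width_copy)
res2 <- add_prefixed_col(iris, Sepal.Width)

names(res1)
names(res2)
```

So `:=` is not so much "safer" as necessary whenever the left-hand side is computed; for a hard-coded name, a plain `=` is fine.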

r dplyr nse rlang

lar*_*y77

lucky-day

4
votes

1
answer

603
views

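Saving a flat file as an SQL database in R without loading it 100% into RAM

I hope that what I am about to write makes sense.
If you look at

How to deal with a 50GB large csv file in r language?

it explains how to query, à la SQL, a csv file from R.
In my case, I have a large amount of data stored as large (or larger than my RAM) flat files.

For instance, I would like to store one of them as an SQLite database without loading it fully into memory.
Imagine that you could automatically read a limited chunk of the file that fits into your RAM, store it into SQL, free up some memory, process the next chunk, and so on until the whole file is in the database.
Is this feasible in R? If the table could be stored as a tibble, even better, but it is not essential.
Any suggestion is appreciated.
Thanks!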
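The chunk-by-chunk idea described above could be sketched roughly as follows, assuming the readr, DBI and RSQLite packages are installed and the flat file is a comma-separated file with a header row. `csv_to_sqlite` and its arguments are hypothetical names introduced for illustration; column types are guessed by readr from the start of the file, which may not suit every data set.

```r
library(readr)
library(DBI)

# Hypothetical helper: stream a CSV file into an SQLite table, one chunk at a time.
csv_to_sqlite <- function(csv_path, db_path, table, chunk_size = 10000) {
  con <- dbConnect(RSQLite::SQLite(), db_path)
  on.exit(dbDisconnect(con))

  # Called once per chunk: append the rows to the table, then let the chunk go.
  append_chunk <- function(chunk, pos) {
    dbWriteTable(con, table, as.data.frame(chunk), append = TRUE)
  }

  read_csv_chunked(csv_path,
                   callback = SideEffectChunkCallback$new(append_chunk),
                   chunk_size = chunk_size)
  invisible(db_path)
}
```

Only one chunk of `chunk_size` rows is held in memory at a time. Once the data is in SQLite, something like `dplyr::tbl(con, "table_name")` (with dbplyr installed) gives a lazy, tibble-like view without pulling the whole table into RAM.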

sqlite r sqldf dbplyr

lar*_*y77

lucky-day

4
votes

1
answer

1026
views

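R: strange result when looking at the unique elements of two simple strings

I am very puzzled by what I see.
I read an Excel file, and when I look at the unique values in a column of strings, I do not understand the result.

I can reproduce this in a minimal reprex (see below): why does dd have two unique elements, whereas dd2 has only one?

Any suggestion is appreciated.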

dd <- c("\xef\xbb\xbfGrant", "Grant")


dd2 <- c("Grant", "Grant")

unique(dd)
#> [1] "\xef\xbb\xbfGrant" "Grant"
length(unique(dd))
#> [1] 2

unique(dd2)
#> [1] "Grant"
length(unique(dd2))
#> [1] 1

sessionInfo()
#> R version 4.1.1 (2021-08-10)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Debian GNU/Linux 11 (bullseye)
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
#>  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
#>  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       
#> 
#> attached base packages:
#> [1] …
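For context, the bytes EF BB BF in front of the first "Grant" are the UTF-8 encoding of the byte order mark (U+FEFF), which is invisible when printed, so the two strings in dd really are different. A minimal sketch of one way to strip it, assuming the mark only ever appears at the start of a string; `dd_clean` is a name introduced here for illustration:

```r
# Same data as in the reprex, with the byte order mark written as a Unicode escape.
dd <- c("\ufeffGrant", "Grant")

# Remove a leading byte order mark (U+FEFF) from each element, if present.
dd_clean <- sub("^\ufeff", "", dd)

unique(dd_clean)
length(unique(dd_clean))
```

When the strings come from a text file, it may be simpler to handle the mark at read time (for example with an encoding such as "UTF-8-BOM" in base R file readers) rather than cleaning up afterwards.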

string r vector unique

lar*_*y77

lucky-day

4
votes

1
answer

35
views

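R + dplyr: partial de-duplication of rows in a tibble

A very common question is how to remove all duplicated rows from a data frame in R, which can be done with several tools (I like dplyr + distinct).

But what if your data set contains several duplicated rows and you do not want to remove all of them, but only the duplicates of certain combinations of variables?

I do not know how to achieve this, so any suggestion is welcome.

Please see the reprex at the end of the post.

Thanks!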

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union


df <- tibble(x=rep(seq(5), 3), y=rep(LETTERS[1:5],3),
             z=c(rep(c("h","j","k","t","u"), 2), LETTERS[1:5])
             )
df
#> # A tibble: 15 × 3
#>        x y     z    
#>    <int> <chr> <chr>
#>  1     1 A     h    
#>  2     2 B     j    
#>  3     3 C     k    
#>  4     4 …

r duplicates dplyr tidyverse

lar*_*y77

lucky-day

3
votes

1
answer

41
views

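Hashing each row

I am using the freshly released dplyr 1.0.0 and the digest package to generate a hash of each row in a tibble.

I am aware of

Adding hash to each row using dplyr and digest in R

but I would like to use the improvements to rowwise() in dplyr 1.0.0.

See the example below. Does anyone know why it fails? I should be allowed to digest a row whose entries are of different types.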

library(dplyr)
library(digest)

df <- tibble(
    student_id = letters[1:4],
    student_id2 = letters[9:12],
    test1 = 10:13, 
    test2 = 20:23, 
    test3 = 30:33, 
    test4 = 40:43
)

df
#> # A tibble: 4 x 6
#>   student_id student_id2 test1 test2 test3 test4
#>   <chr>      <chr>       <int> <int> <int> <int>
#> 1 a          i              10    20    30    40
#> 2 b          j              11    21    31    41
#> 3 …

r dplyr rowwise tibble

lar*_*y77

2020 06-04

1
vote

1
answer

156
views

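R + Shiny: reading a file and using its content

Please find below an example script that I found online (probably from RStudio) for creating a simple app that reads various flat files and outputs a table. I added a few lines that create the file "test_input_file.csv", which the app can read.

I am lost on a very simple task: after reading the csv file, I have a tibble and I render it as a table. How can I access this tibble directly to do other things with it? For instance plotting it with plotly, running some statistics, etc.? Many thanks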

library(shiny)
library(tidyverse)


tt <- tibble(AA=seq(10), BB=seq(10)*2, CC=seq(10)*3 )

write_csv(tt, "test_input_file.csv")

rm(tt)

# Define UI for data upload app ----
ui <- fluidPage(

  # App title ----
  titlePanel("Uploading Files"),

  # Sidebar layout with input and output definitions ----
  sidebarLayout(

    # Sidebar panel for inputs ----
    sidebarPanel(

      # Input: Select a file ----
      fileInput("file1", "Choose CSV File",
                multiple = FALSE,
                accept = c("text/csv",
                         "text/comma-separated-values,text/plain",
                         ".csv", "space")),

      # Horizontal line ----
      tags$hr(),

      # Input: …

r shiny shinyapps

lar*_*y77

lucky-day

0
votes

1
answer

2477
views