How do I store frequently used data or parameters in an R package?

Question

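I am writing an R package and have several numeric vectors that users will often pass as arguments to various package functions. What is the best way to store these vectors in the package so that users can access them easily?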

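One idea is to save each vector as a data file in inst/data. Users could then use the name of the data file in place of the vector whenever they need it (at least, I can do this during development). I like this idea, but I am not sure whether it would violate CRAN rules or norms, or cause any problems.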

# To create one such vector as a data file
octants <- c(90, 135, 180, 225, 270, 315, 360, 45)
devtools::use_data(octants)
# To access this vector in usage
my_function(data, octants)

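My other idea is to create a separate function that returns the desired vector. Users would then call the appropriate function when they need it. This might be better than data for some reason, but I worry about users forgetting the () after the function name.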

# To create the vector within a function
octants <- function() c(90, 135, 180, 225, 270, 315, 360, 45)
# To access this vector in usage
my_function(data, octants()) # works
my_function(data, octants) # doesn't work

Does anyone know which solution is preferable, or whether there is a better alternative?

Answer 1

Ced*_*ric 5

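Honestly, I spent a long time perusing the manual asking myself the same question. Go for it: it is a good idea, it is useful, and there are tools to help you. The Writing R Extensions manual describes the formats in which you can save data and how to follow R standards.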

To make data available within a package, I suggest using:

devtools::use_data(...,internal=FALSE,overwrite=TRUE)

where ... stands for the unquoted names of the datasets to save.

https://www.rdocumentation.org/packages/devtools/versions/1.13.3/topics/use_data
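
As a rough sketch of what this call produces for the vector in the question: with internal = FALSE, use_data() writes each object to its own .rda file in the package's data/ directory (in current devtools releases the helper lives in usethis, as usethis::use_data()). The base R code below imitates that result with save(). The temporary directory standing in for the package root and the name mypkg are assumptions for illustration, and this is not a copy of how devtools does it internally.

```r
# Vector from the question
octants <- c(90, 135, 180, 225, 270, 315, 360, 45)

# Temporary directory standing in for the package root (assumption for this sketch)
pkg_root <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg_root, "data"), recursive = TRUE, showWarnings = FALSE)

# Save the object under its own name, one .rda file per object
rda_file <- file.path(pkg_root, "data", "octants.rda")
save(octants, file = rda_file)

# Loading the file restores an object with the original name
restored <- new.env()
loaded_names <- load(rda_file, envir = restored)
identical(restored$octants, octants)
```

Because the object is stored under its own name, users refer to it as octants, which matches the first usage pattern in the question.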

To create the dataset, you can simply keep a script in the inst subdirectory of your package. My own example is https://github.com/cran/stacomir/blob/master/inst/config/generate_data.R

For example, I use it to create the r_mig dataset:

#################################
# generates dataset for report_mig
# from the vertical slot fishway located at the estuary of the Vilaine (Brittany)
# Taxa Liza Ramada (Thinlip grey mullet) in 2015
##################################

#{ here some stuff necessary to generate this dataset from my package
# and database}
setwd("C:/workspace/stacomir/pkg/stacomir")
devtools::use_data(r_mig,internal=FALSE,overwrite=TRUE)

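This saves your dataset in the appropriate format. Using internal = FALSE makes it accessible to all users via data(). I suggest you read the data() help file. You can use data() to access your files, including when you are not in a package (provided they are in a data subdirectory).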

"If lib.loc and package are both NULL (the default), the data sets are searched for in all the currently loaded packages then in the 'data' directory (if any) of the current working directory."
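
The working-directory lookup described in that passage can be tried outside any package. The sketch below is a minimal illustration: the temporary project directory and its name are assumptions, and the vector is the one from the question.

```r
octants <- c(90, 135, 180, 225, 270, 315, 360, 45)

# Temporary project directory with a data/ subdirectory (hypothetical layout)
project_dir <- file.path(tempdir(), "octants_demo")
dir.create(file.path(project_dir, "data"), recursive = TRUE, showWarnings = FALSE)
save(octants, file = file.path(project_dir, "data", "octants.rda"))

# Remove the in-memory copy so that data() has to read it from disk
rm(octants)

# With package = NULL and lib.loc = NULL (the defaults), data() also looks
# in the data/ directory of the current working directory
old_wd <- setwd(project_dir)
loaded_env <- new.env()
data("octants", envir = loaded_env)
setwd(old_wd)

loaded_env$octants
```

Loading into a separate environment keeps the example from overwriting anything in the global workspace. Calling data("octants") without envir would place the object in the calling environment instead.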

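If you use roxygen, create an R file named data.R in which you store the descriptions of all your datasets. Below is an example of the roxygen documentation for one of the datasets in the stacomiR package.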

#' Video counting of thin lipped mullet (Liza ramada) in 2015 in the Vilaine (France)
#'
#' This dataset corresponds to the data collected at the vertical slot fishway
#' in 2015, video recording of the thin lipped mullet Liza ramada migration
#'
#' @format An object of class report_mig with 8 slots:
#' \describe{
#'   \item{dc}{the \code{ref_dc} object with 4 slots filled with data corresponding to the iav postgres schema}
#'   \item{taxa}{the \code{ref_taxa} the taxa selected}
#'   \item{stage}{the \code{ref_stage} the stage selected}
#'   \item{timestep}{the \code{ref_timestep_daily} calculated for all 2015}
#'   \item{data}{ A dataframe with 10304 rows and 11 variables
#'          \describe{
#'              \item{ope_identifiant}{operation id}
#'              \item{lot_identifiant}{sample id}
#'              \item{lot_identifiant}{sample id}
#'              \item{ope_dic_identifiant}{dc id}
#'              \item{lot_tax_code}{species id}
#'              \item{lot_std_code}{stage id}
#'              \item{value}{the value}
#'              \item{type_de_quantite}{either effectif (number) or poids (weights)}
#'              \item{lot_dev_code}{destination of the fishes}
#'              \item{lot_methode_obtention}{method of data collection, measured, calculated...}
#'              }
#'   }
#'   \item{coef_conversion}{A data frame with 0 observations : no quantity are reported for video recording of mullets, only numbers}
#'   \item{time.sequence}{A time sequence generated for the report, used internally}
#' }
#' @keywords data
"r_mig"

The complete file is here:

https://github.com/cran/stacomiR/blob/master/R/data.R

For another example, read: http://r-pkgs.had.co.nz/data.html#documenting-data
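
Applied to the vector from the question, a data.R entry could be as short as the sketch below. The title, description and @format wording are illustrative assumptions, not taken from any existing package.

```r
#' Octant angles in degrees
#'
#' Eight angles in 45-degree steps, in the order given in the question,
#' intended to be passed as an argument to package functions.
#' (Illustrative wording for this sketch.)
#'
#' @format A numeric vector of length 8.
"octants"
```

The quoted name on the last line tells roxygen which dataset the comment block documents, in the same way as "r_mig" in the stacomiR example.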

You can then use this data in tests like the one below by calling data("r_mig"):

test_that("Summary method works",
    {
     ... #some other code

      data("r_mig")
      r_mig<-calcule(r_mig,silent=TRUE)
      summary(r_mig,silent=TRUE)
      rm(list=ls(envir=envir_stacomi),envir=envir_stacomi)
    })

Best of all, you can use the data in the manual to show how to use the functions in your package.
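
For instance, a documented function can refer to the packaged vector in its @examples section. The function below, nearest_octant(), is hypothetical and exists only for this sketch. It assumes the octants vector from the question is shipped as package data.

```r
#' Assign angles to the nearest octant (hypothetical function for illustration)
#'
#' @param angles Numeric vector of angles in degrees.
#' @param octants Numeric vector of reference angles, e.g. the packaged
#'   `octants` dataset.
#' @return For each angle, the closest value in `octants`.
#' @examples
#' nearest_octant(c(10, 100, 200), octants)
nearest_octant <- function(angles, octants) {
  # Absolute difference between every angle and every reference angle
  diffs <- abs(outer(angles %% 360, octants %% 360, "-"))
  # Turn it into a circular distance (never more than 180 degrees)
  diffs <- pmin(diffs, 360 - diffs)
  # Pick, for each angle (row), the reference angle with the smallest distance
  octants[apply(diffs, 1, which.min)]
}
```

Since the example passes octants without parentheses, it also shows users the intended calling convention, which addresses the concern in the question about forgetting the ().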
