Why does the function "load" not work in "lapply", but work in a "for" loop?

Question


I am trying to load a series of files into a list in R. Below are an example and the code I am using.

## data
val <- c(1:5)
save(val, file='test1.rda')
val <- c(6:10)
save(val, file='test2.rda')

## file names
files = paste0('test',c(1:2), '.rda')
# "test1.rda" "test2.rda"

## use apply to load data into a list 
res <- lapply(files, function(x) load(x))
res
# [[1]]
# [1] "val" # ??? supposed to be 1,2,3,4,5
# 
# [[2]]
# [1] "val" # ??? supposed to be 6,7,8,9,10


## use for loops to load data
for (i in c(1:2)){
  load(files[i])
}
# data sets are loaded as expected


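I do not understand why the `lapply` + `load` combination does not return the expected list. I would appreciate it if someone could point me in the right direction.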

Answer 1

r2e*_*ans 6

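Up front: `load` loads the data into the calling environment, which differs between a `for` loop and `lapply`. You can override this to force which environment the data is loaded into.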

If you read `?load`, you will see the `envir=` argument:

Usage:

     load(file, envir = parent.frame(), verbose = FALSE)

Arguments:

    file: a (readable binary-mode) connection or a character string
          giving the name of the file to load (when tilde expansion is
          done).

   envir: the environment where the data should be loaded.

 verbose: should item names be printed during loading?

Since the default is `parent.frame()`, the data is loaded into an environment created within the `lapply` call, not into the global environment.

Demonstration:

for (i in 1:2) { print(environment()); }
# <environment: R_GlobalEnv>
# <environment: R_GlobalEnv>
ign <- lapply(1:2, function(ign) print(environment()))
# [[1]]
# <environment: 0x000000006f54b838>                # not R_GlobalEnv, aka .GlobalEnv
# [[2]]
# <environment: 0x000000006f54de58>

Also, `?load` documents the return value as:

Value:

     A character vector of the names of objects created, invisibly.

This means that `res <- lapply(files, load)` will always return only `character` vectors, never the values themselves.

While I agree with Samet Sökel's premise that `readRDS` provides a more functional interface (meaning: it returns something, rather than operating solely through side effects), the workaround is not too difficult:

Load into the global environment (pass `envir = .GlobalEnv` to `load`):

This returns the names of all loaded variables into `res`, and all of the data appears in the global environment.
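The global-environment variant described above can be sketched as follows. This is an illustration, not the answer's original code: it assumes the two files from the question (recreated here in a temporary directory so the block runs on its own) and relies on `lapply` passing extra arguments on to `load`.

```r
# Illustrative setup: recreate the question's two files in a temp directory.
tmp <- tempfile("load-demo-")
dir.create(tmp)
files <- file.path(tmp, paste0("test", 1:2, ".rda"))
val <- 1:5
save(val, file = files[1])
val <- 6:10
save(val, file = files[2])
rm(val)

# Extra arguments given to lapply() are passed on to load(), so each file
# is loaded into the global environment rather than a temporary frame.
res <- lapply(files, load, envir = .GlobalEnv)

res   # a list of object names, not the values
val   # the data itself, now visible in the global environment
```

Note that both files store an object named `val`, so the second `load` overwrites the first; only the contents of the last file remain under that name.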


Load into a user-defined environment (pass your own environment as `envir`):
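`res` will again contain only the names, but this is closer to a functional interface, since the data goes into a very specific place that you define.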
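A minimal sketch of the user-defined-environment variant, again assuming the question's two files (recreated in a temporary directory). The name `e` is chosen here purely for illustration; `new.env()` and the `envir` argument of `load` are base R.

```r
# Illustrative setup: recreate the question's two files in a temp directory.
tmp <- tempfile("load-demo-")
dir.create(tmp)
files <- file.path(tmp, paste0("test", 1:2, ".rda"))
val <- 1:5
save(val, file = files[1])
val <- 6:10
save(val, file = files[2])
rm(val)

# A dedicated environment keeps the loaded objects out of the global workspace.
e <- new.env(parent = emptyenv())
res <- lapply(files, load, envir = e)

res     # still only the object names
ls(e)   # the objects that now live in `e`
e$val   # the data, retrieved explicitly from `e`
```

As with the global environment, objects that share a name across files overwrite one another inside `e`.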
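Do not be quick to dismiss this: if you ever move code that loads all of your `.rda` files into "production", loading the data into `.GlobalEnv` is risky. For one, loading inside a function and placing the data in the global environment is really bad practice, and it may not always work smoothly for your function. Okay, that is only "one": side effects in production-style functions/packages are a bad thing (IMO). They often break reproducibility, and they can really confuse users who happen to have same-named variables in their environment; overwriting those variables is an irreversible operation that can quickly lead to anger and lost productivity. Side effects are also hard to troubleshoot when something goes wrong.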
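One way to avoid such side effects while still getting the values back, sketched here as an assumption rather than as the answer's own code: load each file into its own throwaway environment and return the contents as a list. The helper name `load_as_list` is hypothetical and not part of base R.

```r
# Hypothetical helper: load one .rda file into a fresh environment and
# return its objects as a named list, leaving the caller's workspace untouched.
load_as_list <- function(path) {
  e <- new.env(parent = emptyenv())
  nms <- load(path, envir = e)   # names of the objects stored in the file
  mget(nms, envir = e)           # fetch those objects as a named list
}

# Illustrative setup: recreate the question's two files in a temp directory.
tmp <- tempfile("load-demo-")
dir.create(tmp)
files <- file.path(tmp, paste0("test", 1:2, ".rda"))
val <- 1:5
save(val, file = files[1])
val <- 6:10
save(val, file = files[2])
rm(val)

# One list element per file, each holding that file's objects by name.
res <- lapply(files, load_as_list)
names(res) <- basename(files)

res[["test1.rda"]]$val
res[["test2.rda"]]$val
```

Because every file gets its own environment, same-named objects from different files no longer overwrite each other, which is what the question's expected output required.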

Only 16 minutes after the question was asked. I am a huge fan of yours. (2 upvotes)

Archived:	4 years, 6 months ago
Views:	467
Last recorded:	4 years, 6 months ago