paz*_*zof 14 r dataset apriori arules
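I'm trying to load a dataset into R using the data() function. It works fine when I use the dataset name (e.g. data(Titanic) or data("Titanic")). What doesn't work for me is loading a dataset using a variable instead of its name. For example: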
# This works fine:
> data(Titanic)
# This works fine as well:
> data("Titanic")
# This doesn't work:
> myvar <- Titanic
> data(myvar)
Warning message:
In data(myvar) : data set ‘myvar’ not found
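Why does R look for a dataset named "myvar", given that it isn't quoted? And since this is the default behavior, is there a way to load a dataset stored in a variable?
For the record, what I'm trying to do is create a function that uses the "arules" package and mines association rules using Apriori. So I need to pass the dataset as a parameter to that function.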
myfun <- function(mydataset) {
data(mydataset) # doesn't work (data set 'mydataset' not found)
rules <- apriori(mydataset)
}
Edit - output of sessionInfo():
> sessionInfo()
R version 3.0.0 (2013-04-03)
Platform: i386-w64-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] arules_1.0-14 Matrix_1.0-12 lattice_0.20-15 RPostgreSQL_0.4 DBI_0.2-7
loaded via a namespace (and not attached):
[1] grid_3.0.0 tools_3.0.0
The actual error I'm getting (e.g. using a sample dataset "xyz"):
xyz <- data.frame(c(1,2,3))
data(list=xyz)
Warning messages:
1: In grep(name, files, fixed = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
2: In grep(name, files, fixed = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
3: In if (name %in% names(rds)) { :
the condition has length > 1 and only the first element will be used
4: In grep(name, files, fixed = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
5: In if (name %in% names(rds)) { :
the condition has length > 1 and only the first element will be used
6: In grep(name, files, fixed = TRUE) :
argument 'pattern' has length > 1 and only the first element will be used
...
...
32: In data(list = xyz) :
c("data set ‘1’ not found", "data set ‘2’ not found", "data set ‘3’ not found")
Aar*_*ica 15
Use the list argument. See ?data.
data(list=myvar)
You also need myvar to be a character string.
myvar <- "Titanic"
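Note that myvar <- Titanic only works (I think) because of the lazy loading of the Titanic dataset. Most datasets in packages are loaded this way, but for other kinds of datasets you'd still need the data command.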
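To see why data(myvar) looks for a data set literally called "myvar", it may help to mimic how data() gathers the names it was given. The sketch below is only an illustration: requested_names is a hypothetical helper, not part of R. It assumes that data() captures its unnamed arguments unevaluated with substitute(), and takes the list argument by value.

```r
# Hypothetical stand-in for the way data() collects the names it was asked for.
requested_names <- function(..., list = character()) {
  # substitute() captures the arguments as written, without evaluating them
  unquoted <- as.character(substitute(list(...)))[-1L]
  c(unquoted, list)
}

myvar <- "Titanic"
requested_names(Titanic)       # the symbol itself becomes the name
requested_names(myvar)         # the symbol myvar is captured, not its value
requested_names(list = myvar)  # here the value of myvar is used
```

Under that assumption, only the list argument ever sees what the variable contains, which is why it has to hold a character string.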
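Use the variable as a character string. Otherwise you will be processing the contents of "Titanic" rather than its name. You may also need to use get to convert a character value to an object name.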
myvar <- 'Titanic'
myfun <- function(mydataset) {
data(list=mydataset)
str(get(mydataset))
}
myfun(myvar)
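For the use case in the question (a function that receives the dataset as a parameter), one possible variation is to load the dataset into a private environment and return the object, so nothing is left behind in the caller's workspace. This is a minimal sketch: load_dataset is a hypothetical helper name, and it assumes the data set creates an object with the same name as the data set, which is not true of every data set.

```r
# Hypothetical helper: load a data set by name and return the object itself.
load_dataset <- function(name) {
  stopifnot(is.character(name), length(name) == 1L)
  env <- new.env()
  # data() warns (it does not fail) when the data set cannot be found
  data(list = name, envir = env)
  if (!exists(name, envir = env, inherits = FALSE)) {
    stop("no object named '", name, "' was loaded")
  }
  get(name, envir = env, inherits = FALSE)
}

titanic <- load_dataset("Titanic")
str(titanic)
```

Inside myfun, the returned object could then be handed to apriori(), provided it is in a form that apriori() accepts.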
paz*_*zof -3
I'm answering my own question, but I finally found the solution. Quoting the R help:
"the data sets are searched for in all the currently loaded packages then in the ‘data’ directory (if any) of the current working directory."
So all we have to do is write the dataset to a file and put that file in a directory named "data" inside the working directory.
# Write the dataset to a .csv file named after the dataset. data() reads .csv files with header = TRUE and sep = ";", so use ";" as the separator. quote = TRUE quotes character and factor entries, and row.names = FALSE prevents an extra column of row names (the default behavior of write.table()).
> write.table(mydataset, file = "mydataset.csv", sep = ";", quote = TRUE, row.names = FALSE)
# Now place the file into a "data" directory (either via R or via the operating system, it makes no difference):
> dir.create("data") # create the directory
> file.rename(from = "mydataset.csv", to = "data/mydataset.csv") # move the file
# Now we can finally load the dataset:
> data("mydataset") # data(mydataset) works as well, but quoting is preferable: less risk of conflict with another object that happens to be named "mydataset"
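The "data" directory route can be checked end to end with a small round trip. The sketch below makes two assumptions: data() reads .csv files with header = TRUE and sep = ";", as described in ?data, and the file has to be named after the data set. roundtrip_via_data_dir and mydataset_demo are hypothetical names, and a temporary directory stands in for the working directory.

```r
# Hypothetical helper: write df to <root>/data/<name>.csv, then load it via data().
roundtrip_via_data_dir <- function(df, name, root = tempfile("wd_")) {
  dir.create(file.path(root, "data"), recursive = TRUE)
  # ";" as the separator, to match how data() is assumed to read .csv files
  write.table(df, file = file.path(root, "data", paste0(name, ".csv")),
              sep = ";", row.names = FALSE)
  old <- setwd(root)               # data() looks in ./data of the working directory
  on.exit(setwd(old), add = TRUE)  # restore the previous working directory
  env <- new.env()
  data(list = name, envir = env)
  get(name, envir = env, inherits = FALSE)
}

xyz <- data.frame(id = c(1, 2, 3), item = c("a", "b", "c"))
back <- roundtrip_via_data_dir(xyz, "mydataset_demo")
str(back)
```

If the data only lives in a file anyway, reading it with read.csv() or read.table() and passing the resulting object to the function is usually simpler than going through data().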