Read multiple JSON files in a directory into one data frame

Rak*_*van 5 r

library(rjson)
# list.files() takes a regular expression, not a glob, so match the extension explicitly
filenames <- list.files(pattern = "\\.json$") # character vector with one entry per file name

Now I want to import all of the JSON files into R as a single data frame. How do I do that?

My first attempt was:

myJSON <- lapply(filenames, function(x) fromJSON(file=x)) # should return a list in which each element is one of the JSON files

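However, the code above takes a while to finish because I have 15,000 files, and I know it does not return a single data frame. Is there a faster way?

Sample JSON file: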

 {"Reviews": [{"Ratings": {"Service": "4", "Cleanliness": "5"}, "AuthorLocation": "Boston", "Title": "\u201cExcellent Hotel & Location\u201d", "Author": "gowharr32", "ReviewID": "UR126946257", "Content": "We enjoyed the Best Western Pioneer Square....", "Date": "March 29, 2012"}, {"Ratings": {"Overall": "5"},"AuthorLocation": "Chicago",....},{...},....}]}

Mon*_*uiz 9

For anyone looking for a purrr / tidyverse solution here:

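library(tidyverse) # attaches purrr and the %>% pipe
library(jsonlite)

path <- "./your_path"
# pattern is a regular expression, so anchor on the .json extension
files <- dir(path, pattern = "\\.json$")

# read each file and row-bind the results into one data frame
data <- files %>%
       map_df(~fromJSON(file.path(path, .), flatten = TRUE))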
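In the sample file from the question, the records sit under a top-level "Reviews" key, so it may be necessary to pull that element out of each parsed file before binding rows. The following is a minimal sketch under that assumption. The demo files, the `read_reviews` helper, and the `source_file` column are hypothetical and only there to make the example runnable.

```r
library(jsonlite)
library(dplyr)

# Hypothetical demo data: two small files shaped like the sample in the question
path <- file.path(tempdir(), "json_demo")
dir.create(path, showWarnings = FALSE)
writeLines(
  '{"Reviews": [{"Ratings": {"Service": "4", "Cleanliness": "5"}, "Author": "a1", "ReviewID": "R1"}, {"Ratings": {"Overall": "5"}, "Author": "a2", "ReviewID": "R2"}]}',
  file.path(path, "hotel1.json")
)
writeLines(
  '{"Reviews": [{"Ratings": {"Overall": "3"}, "Author": "a3", "ReviewID": "R3"}]}',
  file.path(path, "hotel2.json")
)

files <- list.files(path, pattern = "\\.json$", full.names = TRUE)

# Hypothetical helper: parse one file and keep only its "Reviews" table
read_reviews <- function(f) {
  # flatten = TRUE turns the nested "Ratings" object into Ratings.* columns
  reviews <- fromJSON(f, flatten = TRUE)$Reviews
  reviews$source_file <- basename(f) # remember which file each row came from
  reviews
}

# bind_rows() fills columns that are missing in some files with NA
reviews_df <- bind_rows(lapply(files, read_reviews))
```

Because reviews do not all carry the same rating fields, the combined data frame will contain `NA` wherever a review lacks a given rating.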


amo*_*onk 3

Going parallel via:

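library(parallel)
cl <- makeCluster(detectCores() - 1) # leave one core free
# pattern is a regular expression; full.names = TRUE returns paths the workers can open
json_files <- list.files(path = "your/json/path", pattern = "\\.json$", full.names = TRUE)
# parse the files across the workers; the result is a list with one element per file
json_list <- parLapply(cl, json_files, function(x) rjson::fromJSON(file = x, method = "R"))
stopCluster(cl)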
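The parallel approach yields a list, not the single data frame the question asks for. One possible way to close that gap is sketched below. Note that it swaps in `jsonlite::fromJSON` (instead of `rjson`) inside the workers, because its `flatten = TRUE` option returns a flat table per file, and then combines the tables with `dplyr::bind_rows`. The demo files and the worker count of 2 are assumptions made only so the example runs on its own.

```r
library(parallel)
library(dplyr)

# Hypothetical demo data: four tiny files with a top-level "Reviews" array
path <- file.path(tempdir(), "json_parallel_demo")
dir.create(path, showWarnings = FALSE)
for (i in 1:4) {
  writeLines(
    sprintf('{"Reviews": [{"Ratings": {"Overall": "%d"}, "ReviewID": "R%d"}]}', i, i),
    file.path(path, sprintf("hotel%d.json", i))
  )
}

json_files <- list.files(path, pattern = "\\.json$", full.names = TRUE)

cl <- makeCluster(2) # demo value; the answer above uses detectCores() - 1
review_list <- tryCatch(
  # the worker function is namespace-qualified so nothing has to be exported
  parLapply(cl, json_files, function(f) {
    jsonlite::fromJSON(f, flatten = TRUE)$Reviews
  }),
  finally = stopCluster(cl) # always shut the workers down, even on error
)

# one data frame with one row per review across all files
reviews_df <- bind_rows(review_list)
```

Whether this is faster than a plain `lapply` depends on file size and core count, since starting workers and shipping results back has its own cost.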