How do I import JSON into R and convert it to a table?

Aid*_*dis 5 json dictionary loops r rjson

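I want to play with data that is currently saved in JSON format. I am very new to R, though, and have almost no idea how to work with the data. Below you can see what I have managed so far. But first, my code: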

library(rjson)
json_file <- "C:\\Users\\Saonkfas\\Desktop\\WOWPAPI\\wowpfinaljson.json"
json_data <- fromJSON(paste(readLines(json_file), collapse=""))

I am able to get at the data:

for (x in json_data){print (x)}

The output looks pretty raw, though:

[[1]]
[[1]]$wins
[1] "118"

[[1]]$losses
[1] "40"
# And so on

Note that the JSON is somewhat nested. I could create the table with Python, but R seems much more complicated.

Edit:

My JSON:

{
"play1": [
    {
        "wins": "118",
        "losses": "40",
        "max_killed": "7",
        "battles": "158",
        "plane_id": "4401",
        "max_ground_object_destroyed": "3"
    },
    {
        "wins": "100",
        "losses": "58",
        "max_killed": "7",
        "battles": "158",
        "plane_id": "2401",
        "max_ground_object_destroyed": "3"
    },
    {
        "wins": "120",
        "losses": "38",
        "max_killed": "7",
        "battles": "158",
        "plane_id": "2403",
        "max_ground_object_destroyed": "3"
    }
],

"play2": [
    {
        "wins": "12",
        "losses": "450",
        "max_killed": "7",
        "battles": "158",
        "plane_id": "4401",
        "max_ground_object_destroyed": "3"
    },
    {
        "wins": "150",
        "losses": "8",
        "max_killed": "7",
        "battles": "158",
        "plane_id": "2401",
        "max_ground_object_destroyed": "3"
    },
    {
        "wins": "120",
        "losses": "328",
        "max_killed": "7",
        "battles": "158",
        "plane_id": "2403",
        "max_ground_object_destroyed": "3"
    }
],

nic*_*ico 12

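fromJSON returns a list, and you can use the *apply functions to go through each element. It is fairly straightforward (once you know what to do!) to convert it to a "table" (data frame is the correct R term).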

library(rjson)

# You can pass the filename directly
my.JSON <- fromJSON(file="test.json")

df <- lapply(my.JSON, function(play) # Loop through each "play"
  {
  # Convert each group to a data frame.
  # This assumes you have 6 elements each time
  data.frame(matrix(unlist(play), ncol=6, byrow=TRUE))
  })

# Now you have a list of data frames, connect them together in
# one single dataframe
df <- do.call(rbind, df)

# Make column names nicer, remove row names
colnames(df) <- names(my.JSON[[1]][[1]])
rownames(df) <- NULL

df
  wins losses max_killed battles plane_id max_ground_object_destroyed
1  118     40          7     158     4401                           3
2  100     58          7     158     2401                           3
3  120     38          7     158     2403                           3
4   12    450          7     158     4401                           3
5  150      8          7     158     2401                           3
6  120    328          7     158     2403                           3
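The matrix approach above hard-codes six fields per record and drops the play names. As a rough sketch of an alternative in base R, assuming every record carries the same fields, the field names can be read from the records themselves and the play name kept as a column. The small `my.JSON` list here is a hand-written stand-in for what rjson's fromJSON returns, trimmed to three fields; it is not the real file.

```r
# Hypothetical stand-in for the nested list returned by rjson::fromJSON()
my.JSON <- list(
  play1 = list(
    list(wins = "118", losses = "40", plane_id = "4401"),
    list(wins = "100", losses = "58", plane_id = "2401")
  ),
  play2 = list(
    list(wins = "12", losses = "450", plane_id = "4401")
  )
)

# Build one data frame per play; each record becomes one row
per.play <- lapply(names(my.JSON), function(play.name) {
  rows <- lapply(my.JSON[[play.name]], function(rec) {
    as.data.frame(rec, stringsAsFactors = FALSE)
  })
  # Prepend the play name so it survives the final rbind
  data.frame(play = play.name, do.call(rbind, rows),
             stringsAsFactors = FALSE)
})

# Stack the per-play data frames and reset the row names
df <- do.call(rbind, per.play)
rownames(df) <- NULL
```

Because the column names come from each record, nothing needs to be renamed afterwards. All values are still character strings at this point, since the numbers are quoted in the JSON.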


Eri*_*ric 9

I find jsonlite a little more user-friendly for this task. There is a comparison of three JSON parsing packages (biased in favor of jsonlite).

library(jsonlite)
data <- fromJSON('path/to/file.json')

data
# $play1
#   wins losses max_killed battles plane_id max_ground_object_destroyed
# 1  118     40          7     158     4401                           3
# 2  100     58          7     158     2401                           3
# 3  120     38          7     158     2403                           3
# 
# $play2
#   wins losses max_killed battles plane_id max_ground_object_destroyed
# 1   12    450          7     158     4401                           3
# 2  150      8          7     158     2401                           3
# 3  120    328          7     158     2403                           3

If you want to collapse those list names into a new column, I recommend dplyr::bind_rows rather than do.call(rbind, data):

library(dplyr)
data <- bind_rows(data, .id = 'play')

# Source: local data frame [6 x 7]

#    play  wins losses max_killed battles plane_id max_ground_object_destroyed
#   (chr) (chr)  (chr)      (chr)   (chr)    (chr)                       (chr)
# 1 play1   118     40          7     158     4401                           3
# 2 play1   100     58          7     158     2401                           3
# 3 play1   120     38          7     158     2403                           3
# 4 play2    12    450          7     158     4401                           3
# 5 play2   150      8          7     158     2401                           3
# 6 play2   120    328          7     158     2403                           3

Beware that the columns may not have the types you expect: here they are all character, because every number is quoted in the JSON data provided.

Edit, November 2017: one approach to type conversion is to use mutate_if to guess the intended type of each character column.

data <- mutate_if(data, is.character, type.convert, as.is = TRUE)
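For anyone not using dplyr, roughly the same conversion can be sketched in base R by applying type.convert to each column. The data frame below is a hypothetical three-row sample with only a few of the columns, used purely for illustration.

```r
# Hypothetical sample: all columns are character, as after parsing the JSON
data <- data.frame(
  play   = c("play1", "play1", "play2"),
  wins   = c("118", "100", "12"),
  losses = c("40", "58", "450"),
  stringsAsFactors = FALSE
)

# Let type.convert guess a type for every column;
# as.is = TRUE keeps non-numeric columns as character instead of factor
data[] <- lapply(data, type.convert, as.is = TRUE)

str(data)
```

Assigning into `data[]` keeps the object a data frame while replacing its columns. Note that identifier-like columns such as plane_id would also become integers under this approach, which may or may not be what you want.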