Merge the data frames in a list of data frames

ano*_*ous 5 r dataframe

I have a list of data frames that looks like this:

ls[[1]]
[[1]]

 month year   oracle
    1 2004 356.0000
    2 2004 390.0000
    3 2004 394.4286
    4 2004 391.8571 
 ls[[2]]
 [[2]]
 month year microsoft
    1 2004  339.0000
    2 2004  357.7143
    3 2004  347.1429
    4 2004  333.2857

How can I create a single data frame that looks like this:

 month year   oracle   microsoft
    1 2004 356.0000    339.0000
    2 2004 390.0000    357.7143
    3 2004 394.4286    347.1429
    4 2004 391.8571    333.2857

akr*_*run 7

We can also use Reduce:

Reduce(function(...) merge(..., by = c('month', 'year')), ls)
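
To make it clearer what Reduce does here, a minimal sketch follows. It assumes the two data frames from the question, rebuilt with data.frame(); the names via_reduce and by_hand are only illustrative. Reduce folds the list from left to right, calling merge on two data frames at a time.

```r
# Data from the question, rebuilt with data.frame() (values copied from above).
# Note: the name `ls` is kept from the question; it also names a base R function.
ls <- list(
  data.frame(month = 1:4, year = 2004L,
             oracle = c(356, 390, 394.4286, 391.8571)),
  data.frame(month = 1:4, year = 2004L,
             microsoft = c(339, 357.7143, 347.1429, 333.2857))
)

# Reduce() walks the list left to right, merging two data frames at a time
via_reduce <- Reduce(function(x, y) merge(x, y, by = c("month", "year")), ls)

# With only two elements this is the same as one direct merge() call
by_hand <- merge(ls[[1]], ls[[2]], by = c("month", "year"))

print(via_reduce)
print(identical(via_reduce, by_hand))
```

With three or more elements the same fold corresponds to merge(merge(ls[[1]], ls[[2]]), ls[[3]]) and so on.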

Using @Jaap's example, if the values are not the same, use the all=TRUE option of merge.

Reduce(function(...) merge(..., by = c('month', 'year'), all=TRUE), ls)
#     month year   oracle microsoft   google
#1     1 2004 356.0000        NA       NA
#2     2 2004 390.0000  339.0000       NA
#3     3 2004 394.4286  357.7143 390.0000
#4     4 2004 391.8571  347.1429 391.8571
#5     5 2004       NA  333.2857 357.7143
#6     6 2004       NA        NA 333.2857
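
This goes slightly beyond the answer above: merge also accepts all.x = TRUE, which keeps every row of the left-hand (already accumulated) data frame but does not add rows that appear only on the right. A minimal sketch, assuming the three data frames from @Jaap's example rebuilt with data.frame(); the name left is only illustrative.

```r
# @Jaap's example data, rebuilt with data.frame() (values copied from his answer)
ls <- list(
  data.frame(month = 1:4, year = 2004L,
             oracle = c(356, 390, 394.4286, 391.8571)),
  data.frame(month = 2:5, year = 2004L,
             microsoft = c(339, 357.7143, 347.1429, 333.2857)),
  data.frame(month = 3:6, year = 2004L,
             google = c(390, 391.8571, 357.7143, 333.2857))
)

# all.x = TRUE: keep all rows of the left data frame, fill gaps with NA
left <- Reduce(function(x, y) merge(x, y, by = c("month", "year"), all.x = TRUE), ls)

# Only the months of the first data frame (1 to 4) remain
print(left)
```

Whether this or all=TRUE is the better fit depends on whether the first data frame should define the set of rows.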


Jaa*_*aap 5

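Using the Reduce/merge code from @akrun's answer works great if the values of the month and year columns are the same in each data frame. However, when they are not the same (example data at the end of this answer)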

Reduce(function(...) merge(..., by = c('month', 'year')), ls)

will return only the rows that are common to every data frame:

  month year   oracle microsoft   google
1     3 2004 394.4286  357.7143 390.0000
2     4 2004 391.8571  347.1429 391.8571

In that case, when you want to include all rows/observations, you can either use all=TRUE (as shown by @akrun) or use full_join from the dplyr package as an alternative:

library(dplyr)
Reduce(function(...) full_join(..., by = c('month', 'year')), ls) 
# or just:
Reduce(full_join, ls)

This results in:

  month year   oracle microsoft   google
1     1 2004 356.0000        NA       NA
2     2 2004 390.0000  339.0000       NA
3     3 2004 394.4286  357.7143 390.0000
4     4 2004 391.8571  347.1429 391.8571
5     5 2004       NA  333.2857 357.7143
6     6 2004       NA        NA 333.2857

Used data:

ls <- list(structure(list(month = 1:4, year = c(2004L, 2004L, 2004L, 2004L), oracle = c(356, 390, 394.4286, 391.8571)), .Names = c("month", "year", "oracle"), class = "data.frame", row.names = c(NA, -4L)), 
           structure(list(month = 2:5, year = c(2004L, 2004L, 2004L, 2004L), microsoft = c(339, 357.7143, 347.1429, 333.2857)), .Names = c("month", "year", "microsoft"), class = "data.frame", row.names = c(NA,-4L)),
           structure(list(month = 3:6, year = c(2004L, 2004L, 2004L, 2004L), google = c(390, 391.8571, 357.7143, 333.2857)), .Names = c("month", "year", "google"), class = "data.frame", row.names = c(NA,-4L)))
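
The structure() calls above look like dput() output. For readability, the same values can also be written with data.frame(); this is a sketch and not part of the original answer, and the names inner and outer are only illustrative. It then repeats the two merges discussed above to confirm the row counts.

```r
# Same values as the structure() version above, written with data.frame().
# year = 2004L is recycled to the length of the month column.
ls <- list(
  data.frame(month = 1:4, year = 2004L,
             oracle = c(356, 390, 394.4286, 391.8571)),
  data.frame(month = 2:5, year = 2004L,
             microsoft = c(339, 357.7143, 347.1429, 333.2857)),
  data.frame(month = 3:6, year = 2004L,
             google = c(390, 391.8571, 357.7143, 333.2857))
)

# Inner merge: only the month/year combinations present in all three
inner <- Reduce(function(x, y) merge(x, y, by = c("month", "year")), ls)

# Outer merge: every month/year combination, with NA where a value is missing
outer <- Reduce(function(x, y) merge(x, y, by = c("month", "year"), all = TRUE), ls)

print(nrow(inner))
print(nrow(outer))
```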