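I have two data frames. One (df1) contains all the columns and rows of interest, but includes missing observations. The other (df2) contains the values to use in place of those missing observations, and includes only the columns and rows for which df1 has at least one NA. I would like to merge the two data sets somehow to obtain desired.result.
This seems like a very simple problem to solve, but I am drawing a blank. I cannot get merge to work. Maybe I could write nested for-loops, but I have not done so yet. I have also tried aggregate a few times. I am a little afraid to post this question, for fear my R card might be revoked. Sorry if this is a duplicate. I searched fairly intensively here and with Google. Thank you for any advice. A solution in base R is preferred.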
df1 = read.table(text = "
county year1 year2 year3
aa 10 20 30
bb 1 NA 3
cc 5 10 NA
dd 100 NA 200
", sep = "", header = TRUE)
df2 = read.table(text = "
county year2 year3
bb 2 NA
cc NA 15
dd 150 NA
", sep = "", header = TRUE)
desired.result = read.table(text = "
county year1 year2 year3
aa 10 20 30
bb 1 2 3
cc 5 10 15
dd 100 150 200
", sep = "", header = TRUE)
aggregate can do this:
aggregate(. ~ county,
data=merge(df1, df2, all=TRUE), # Merged data, including NAs
na.action=na.pass, # Aggregate rows with missing values...
FUN=sum, na.rm=TRUE) # ...but instruct "sum" to ignore them.
## county year2 year3 year1
## 1 aa 20 30 10
## 2 bb 2 3 1
## 3 cc 10 15 5
## 4 dd 150 200 100
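One caveat with summing: sum(..., na.rm=TRUE) returns 0 when a value is missing in both data frames, and it adds the two values together if both happen to be present. If that matters, a base R alternative is to look up each county of df1 in df2 with match and overwrite only the NA cells. The following is a minimal sketch under that assumption; fill_from is a hypothetical helper name, not a base R function, and it assumes county identifies each row uniquely.

```r
# Example data, copied from the question.
df1 <- read.table(text = "
county year1 year2 year3
aa 10 20 30
bb 1 NA 3
cc 5 10 NA
dd 100 NA 200
", header = TRUE)

df2 <- read.table(text = "
county year2 year3
bb 2 NA
cc NA 15
dd 150 NA
", header = TRUE)

# fill_from() is a hypothetical helper, not part of base R.
# It assumes `key` identifies each row uniquely in both data frames.
fill_from <- function(target, source, key = "county") {
  # For each row of `target`, the matching row number in `source` (NA if none).
  idx <- match(target[[key]], source[[key]])
  # Only columns present in both data frames can be filled.
  for (col in setdiff(intersect(names(target), names(source)), key)) {
    # Cells that are NA in `target` and have a matching row in `source`.
    to_fill <- is.na(target[[col]]) & !is.na(idx)
    target[[col]][to_fill] <- source[[col]][idx[to_fill]]
  }
  target
}

result <- fill_from(df1, df2)
print(result)
```

Unlike the aggregate approach, this keeps the column order of df1, and a cell that is NA in both data frames stays NA instead of becoming 0.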