Jyo*_*rya 204
rbind.fill from the plyr package might be what you are looking for.
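A minimal sketch of how this might look, assuming plyr is installed. The two sample data frames are illustrative assumptions, not taken from the question. rbind.fill stacks the rows and fills any column that is missing from one of the inputs with NA.

```r
library(plyr)

# Illustrative sample data (assumed): df2 has an extra column c
df1 <- data.frame(a = 1:5, b = 6:10)
df2 <- data.frame(a = 11:15, b = 16:20, c = LETTERS[1:5])

# Rows that come from df1 get NA in column c
combined <- rbind.fill(df1, df2)
combined
```

The inputs themselves are left unchanged; only the returned data frame has the full set of columns.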
nei*_*fws 47
You can use smartbind from the gtools package.
Example:
library(gtools)
df1 <- data.frame(a = c(1:5), b = c(6:10))
df2 <- data.frame(a = c(11:15), b = c(16:20), c = LETTERS[1:5])
smartbind(df1, df2)
# result
     a  b    c
1.1  1  6 <NA>
1.2  2  7 <NA>
1.3  3  8 <NA>
1.4  4  9 <NA>
1.5  5 10 <NA>
2.1 11 16    A
2.2 12 17    B
2.3 13 18    C
2.4 14 19    D
2.5 15 20    E
Aar*_*ham 40
If the columns in df1 are a subset of those in df2 (matched by column name):
df3 <- rbind(df1, df2[, names(df1)])
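A small sketch of what this does in practice, using illustrative sample data that is an assumption rather than part of the answer. The stopifnot check makes the precondition explicit, and the result shows that columns found only in df2 are dropped rather than kept.

```r
# Illustrative sample data (assumed): df2 has an extra column c
df1 <- data.frame(a = 1:5, b = 6:10)
df2 <- data.frame(a = 11:15, b = 16:20, c = LETTERS[1:5])

# The approach only applies when every column of df1 also exists in df2
stopifnot(all(names(df1) %in% names(df2)))

# Select df1's columns from df2 (by name), then bind the rows
df3 <- rbind(df1, df2[, names(df1)])
names(df3)   # column c from df2 is not carried over
```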
kda*_*ria 33
An alternative with data.table:
library(data.table)
df1 = data.frame(a = c(1:5), b = c(6:10))
df2 = data.frame(a = c(11:15), b = c(16:20), c = LETTERS[1:5])
rbindlist(list(df1, df2), fill = TRUE)
rbind also works in data.table as long as the objects are converted to data.table objects:
rbind(setDT(df1), setDT(df2), fill=TRUE)
This approach also works in this situation. It may be the better option when you have several data.tables and do not want to construct a list.
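Note that setDT converts its argument to a data.table by reference, so the rbind call with setDT changes df1 and df2 themselves. If that side effect is unwanted, one possible variation (a sketch, assuming data.table is installed and reusing the same sample data) is to bind converted copies made with as.data.table:

```r
library(data.table)

df1 <- data.frame(a = 1:5, b = 6:10)
df2 <- data.frame(a = 11:15, b = 16:20, c = LETTERS[1:5])

# as.data.table() returns converted copies, so df1 and df2 stay data.frames
combined <- rbind(as.data.table(df1), as.data.table(df2), fill = TRUE)

class(df1)   # still a plain data.frame
combined
```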
lmo*_*lmo 27
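Most of the base R answers address the situation where only one data.frame has additional columns, or where the resulting data.frame would have the intersection of the columns. Since the OP writes "I am hoping to retain the columns that do not match after the bind", an answer that uses base R methods to address this issue is probably worth posting.
Below, I present two base R methods: one that alters the original data.frames and one that does not. Additionally, I offer a method that generalizes the non-destructive method to more than two data.frames.
First, let's get some sample data.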
# sample data, variable d is in df1, variable c is in df2
df1 = data.frame(a=1:5, b=6:10, d=month.name[1:5])
df2 = data.frame(a=6:10, b=16:20, c = letters[8:12])
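Two data.frames, alter originals
In order to retain all columns from both data.frames in an rbind (and allow the function to work without resulting in an error), you add NA columns to each data.frame, with the appropriate missing names filled in using setdiff.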
# fill in non-overlapping columns with NAs
df1[setdiff(names(df2), names(df1))] <- NA
df2[setdiff(names(df1), names(df2))] <- NA
Now, rbind them:
rbind(df1, df2)
    a  b        d    c
1   1  6  January <NA>
2   2  7 February <NA>
3   3  8    March <NA>
4   4  9    April <NA>
5   5 10      May <NA>
6   6 16     <NA>    h
7   7 17     <NA>    i
8   8 18     <NA>    j
9   9 19     <NA>    k
10 10 20     <NA>    l
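Note that the first two lines alter the original data.frames, df1 and df2, adding the full set of columns to both.
Two data.frames, do not alter originals
To leave the original data.frames intact, first loop through the names that differ and return a named vector of NAs, which is concatenated with the data.frame into a list using c. Then data.frame converts the result into an appropriate data.frame for the rbind.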
rbind(
  data.frame(c(df1, sapply(setdiff(names(df2), names(df1)), function(x) NA))),
  data.frame(c(df2, sapply(setdiff(names(df1), names(df2)), function(x) NA)))
)
Many data.frames, do not alter originals
In the instance that you have more than two data.frames, you could do the following.
# put data.frames into list (dfs named df1, df2, df3, etc)
mydflist <- mget(ls(pattern="df\\d+"))
# get all variable names
allNms <- unique(unlist(lapply(mydflist, names)))
# put em all together
do.call(rbind,
        lapply(mydflist,
               function(x) data.frame(c(x, sapply(setdiff(allNms, names(x)),
                                                  function(y) NA)))))
Maybe it is a bit nicer not to see the row names of the original data.frames? Then do this.
do.call(rbind,
        c(lapply(mydflist,
                 function(x) data.frame(c(x, sapply(setdiff(allNms, names(x)),
                                                    function(y) NA)))),
          make.row.names=FALSE))
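Collecting the data.frames with mget(ls(pattern=...)) depends on how objects happen to be named in the workspace. As a hedged alternative, the same idea can be wrapped in a small function that takes an explicit list. The function name rbind_fill_base and the third data frame df3 are hypothetical, introduced only for this sketch.

```r
# Hypothetical helper: row-bind any number of data.frames, filling
# columns that a given data.frame lacks with NA
rbind_fill_base <- function(dfs) {
  all_names <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(x) {
    # x is a local copy, so the caller's data.frames are not altered
    for (nm in setdiff(all_names, names(x))) {
      x[[nm]] <- NA
    }
    x[all_names]  # same column order in every piece
  })
  do.call(rbind, c(padded, make.row.names = FALSE))
}

df1 <- data.frame(a = 1:5, b = 6:10, d = month.name[1:5])
df2 <- data.frame(a = 6:10, b = 16:20, c = letters[8:12])
df3 <- data.frame(a = 21:22, e = c(TRUE, FALSE))  # assumed extra input

res <- rbind_fill_base(list(df1, df2, df3))
dim(res)
names(res)
```

Passing the list explicitly also makes the order of the rows in the result easy to control.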
Jon*_*ang 18
You could also just pull out the common column names.
> cols <- intersect(colnames(df1), colnames(df2))
> rbind(df1[,cols], df2[,cols])
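A short sketch of the effect, using sample data assumed for illustration: only the shared columns survive, so this fits when the non-matching columns can be discarded, not when they should be retained. The drop = FALSE argument is an addition here so that the result stays a data.frame even if only one column is shared.

```r
# Illustrative sample data (assumed): d exists only in df1, c only in df2
df1 <- data.frame(a = 1:5, b = 6:10, d = month.name[1:5])
df2 <- data.frame(a = 6:10, b = 16:20, c = letters[8:12])

cols <- intersect(colnames(df1), colnames(df2))
cols

# drop = FALSE keeps a data.frame even when only one column is shared
common <- rbind(df1[, cols, drop = FALSE], df2[, cols, drop = FALSE])
common
```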
小智 6
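I wrote a function to do this because I like my code to tell me if something is wrong. This function will explicitly tell you which column names don't match and whether you have a type mismatch. Then it will do its best to combine the data.frames anyway. The limitation is that you can only combine two data.frames at a time.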
### combines data frames (like rbind) but by matching column names
# columns without matches in the other data frame are still combined
# but with NA in the rows corresponding to the data frame without
# the variable
# A warning is issued if there is a type mismatch between columns of
# the same name and an attempt is made to combine the columns
combineByName <- function(A, B) {
  a.names <- names(A)
  b.names <- names(B)
  all.names <- union(a.names, b.names)
  print(paste("Number of columns:", length(all.names)))
  # storage type of every column, in column order
  a.type <- unname(vapply(A, typeof, character(1)))
  b.type <- unname(vapply(B, typeof, character(1)))
  a_b.names <- names(A)[!names(A) %in% names(B)]
  b_a.names <- names(B)[!names(B) %in% names(A)]
  if (length(a_b.names) > 0 || length(b_a.names) > 0) {
    print("Columns in data frame A but not in data frame B:")
    print(a_b.names)
    print("Columns in data frame B but not in data frame A:")
    print(b_a.names)
  } else if (identical(a.names, b.names) && identical(a.type, b.type)) {
    # same names, same order, same types: a plain rbind is enough
    # (identical() avoids a vector-valued condition inside if())
    return(rbind(A, B))
  }
  C <- list()
  for (i in seq_along(all.names)) {
    l.a <- all.names[i] %in% a.names
    pos.a <- match(all.names[i], a.names)
    typ.a <- a.type[pos.a]
    l.b <- all.names[i] %in% b.names
    pos.b <- match(all.names[i], b.names)
    typ.b <- b.type[pos.b]
    if (l.a && l.b) {
      if (typ.a == typ.b) {
        vec <- c(A[, pos.a], B[, pos.b])
      } else {
        warning("Type mismatch in variable named: ", all.names[i])
        vec <- try(c(A[, pos.a], B[, pos.b]))
      }
    } else if (l.a) {
      # column only in A: pad the rows that come from B with NA
      vec <- c(A[, pos.a], rep(NA, nrow(B)))
    } else {
      # column only in B: pad the rows that come from A with NA
      vec <- c(rep(NA, nrow(A)), B[, pos.b])
    }
    C[[i]] <- vec
  }
  names(C) <- all.names
  C <- as.data.frame(C)
  return(C)
}