I have a data frame (df) with 7 rows and 4 columns (named c1, c2, c3, c4):
c1 c2 c3 c4
Yes No Yes No
Yes Yes No No
No Yes No No
Yes No No No
Yes No Yes No
Yes No No No
No No Yes No
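I want to add a fifth column named Expected Result to the data frame, listing the columns (1 to 4) whose value equals "Yes". For example, row 1 has "Yes" in column 1 and column 3. To fill the Expected Result column, I would concatenate the names of column 1 and column 3 and add them to the result.
Here is the full expected result: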
c1, c3
c1, c2
c2
c1
c1, c3
c1
c3
I have the following line of code, but something is not quite right:
df$Expected_Result <- colnames(df)[apply(df,1,which(LETTERS="Unfit"))]
One option uses data.table:
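library(data.table)
# add a row id so each original row can be regrouped after melting
setDT(df)[, rownum := 1:.N]
# melt to long format, then per row collapse the names of the columns equal to "Yes"
df$Expected_Result <- melt(df, "rownum")[,
    toString(variable[value == "Yes"]), rownum]$V1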
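We can loop (`apply`) through the rows (`MARGIN=1`) of the logical matrix (`df=='Yes'`), convert each row to a numeric index (`which`), get the `names`, and `paste` them together using `toString`, which is a wrapper for `paste(., collapse=', ')`. We may also need an `if/else` condition to check whether there are `any` 'Yes' values in a row. If not, it should return NA.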
df$Expected_Result <- apply(df == 'Yes', 1, function(x) {
    if (any(x)) {
        toString(names(which(x)))
    } else NA
})
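To make the individual steps easier to follow, here is a minimal sketch that walks through a single row by hand. The names `df_demo`, `is_yes`, `row1`, `idx`, `labels` and `result` are illustrative and not part of the original answer; `df_demo` holds only the first three rows of the question's data.

```r
# Small example data matching the question's layout (first three rows only).
df_demo <- data.frame(
  c1 = c("Yes", "Yes", "No"),
  c2 = c("No", "Yes", "Yes"),
  c3 = c("Yes", "No", "No"),
  c4 = c("No", "No", "No"),
  stringsAsFactors = FALSE
)

is_yes <- df_demo == "Yes"   # logical matrix with the same shape as df_demo
row1   <- is_yes[1, ]        # named logical vector for row 1
idx    <- which(row1)        # named integer positions of the TRUE entries
labels <- names(idx)         # column names where row 1 is "Yes"
result <- toString(labels)   # same as paste(labels, collapse = ", ")
print(result)                # "c1, c3"
```

`apply(..., 1, ...)` simply repeats these steps for every row of the logical matrix.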
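Or, another option is to get the row/column index with `which` by specifying `arr.ind=TRUE`. Grouping by the row index of 'indx' (`indx[,1]`), we `paste` the column names of 'df' ('val'). If some rows are missing, i.e. have no 'Yes' element, use `ifelse` to create NA for the missing rows.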
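indx <- which(df=='Yes', arr.ind=TRUE)
val <- tapply(names(df)[indx[,2]], indx[,1], FUN=toString)
# look up 'val' by row number so the values stay aligned when a row has no 'Yes'
rows <- seq_len(nrow(df))
df$Expected_Result <- ifelse(rows %in% names(val), val[match(rows, names(val))], NA)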
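As a rough illustration of the missing-row case, the sketch below uses a small made-up data frame (`df_demo`, not from the question) whose second row has no 'Yes'. The helper names `rows` and `aligned` are assumptions for this example; `match` is used here as one way to line up the grouped values with the row numbers.

```r
# Hypothetical data: row 2 contains no "Yes" at all.
df_demo <- data.frame(
  c1 = c("Yes", "No", "No"),
  c2 = c("Yes", "No", "Yes"),
  stringsAsFactors = FALSE
)

# Row/column positions of every "Yes" (one matrix row per hit).
indx <- which(df_demo == "Yes", arr.ind = TRUE)

# Collapse the matching column names per row; only rows 1 and 3 get an entry.
val <- tapply(names(df_demo)[indx[, 2]], indx[, 1], FUN = toString)

# Look each row number up among the names of val; unmatched rows become NA.
rows <- seq_len(nrow(df_demo))
aligned <- as.vector(val[match(rows, names(val))])
print(aligned)   # "c1, c2" NA "c2"
```

Because `val` is shorter than the data frame whenever a row has no match, indexing it by row number keeps each value next to the row it came from.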
df <- structure(list(c1 = c("Yes", "Yes", "No", "Yes", "Yes", "Yes",
"No"), c2 = c("No", "Yes", "Yes", "No", "No", "No", "No"), c3 = c("Yes",
"No", "No", "No", "Yes", "No", "Yes"), c4 = c("No", "No", "No",
"No", "No", "No", "No")), .Names = c("c1", "c2", "c3", "c4"),
class = "data.frame", row.names = c(NA, -7L))