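I want to get (in a new column of a data.table) the name of the column that holds the maximum value, looking at only a few of the columns in a data.frame.
Here is an example data.frame: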
# creating the vectors then the data frame ------
id = c("a", "b", "c", "d")
ignore = c(1000,1000, 1000, 1000)
s1 = c(0,0,0,100)
s2 = c(100,0,0,0)
s3 = c(0,0,50,0)
s4 = c(50,0,50,0)
df1 <- data.frame(id,ignore,s1,s2,s3,s4)
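(1) I now want to find, for each row, the name of the column with the largest number among columns s1-s4 (i.e. ignoring the column named "ignore").
(2) If the maximum is tied, I would like the last column name (e.g. s4) to be returned.
(3) As a bonus: if all the values are 0, I would like NA to be returned.
Here is my best attempt so far: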
df2 <- cbind(df1,do.call(rbind,apply(df1,1,function(x) {data.frame(max.col.name=names(df1)[which.max(x)],stringsAsFactors=FALSE)})))
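This returns "ignore" in every case. If I remove that column and reorder the s1-s4 columns as s4-s1, it works (except for row b).
How would you approach this?
Many thanks indeed.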
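One possible approach, offered as a minimal sketch rather than a definitive answer: restrict the search to the s1-s4 columns and use base R's max.col() with ties.method = "last", then blank out rows whose maximum is 0. The names cols, m, idx, row_max and max_col_name are made up for illustration, and the row_max == 0 check assumes the values are never negative.

```r
# Rebuild the example data.frame from the question
id     <- c("a", "b", "c", "d")
ignore <- c(1000, 1000, 1000, 1000)
s1     <- c(0, 0, 0, 100)
s2     <- c(100, 0, 0, 0)
s3     <- c(0, 0, 50, 0)
s4     <- c(50, 0, 50, 0)
df1 <- data.frame(id, ignore, s1, s2, s3, s4)

# Only these columns take part in the comparison ("ignore" is left out)
cols <- c("s1", "s2", "s3", "s4")
m <- as.matrix(df1[, cols])

# Column index of each row's maximum; ties resolve to the last column
idx <- max.col(m, ties.method = "last")

# The maximum value itself, used to detect all-zero rows
row_max <- m[cbind(seq_len(nrow(m)), idx)]

# Map the index to a column name, then return NA where the row is all zeros
max_col_name <- cols[idx]
max_col_name[row_max == 0] <- NA_character_
df1$max_col_name <- max_col_name

print(df1)
```

Note that max.col() treats entries within a small relative tolerance of each other as ties, which makes no difference for the values in this example.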
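I want to calculate the rainfall over the past three days for each grid square and add it as a new column in my data.table. To be clear, I want to sum the rainfall for the current day and the two preceding days, for each meteorological grid square.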
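The grouped three-day rolling sum described above could be sketched with data.table's frollsum(), which by default aligns the window to the right and fills incomplete windows with NA. This is only a sketch under assumptions: the table name weather and the column name rain_3d are illustrative, and the data simply repeats the example vectors used in this question.

```r
library(data.table)

# Example data: rainfall per day and the grid square it was measured in
rain   <- c(NA, NA, NA, 0, 0, 5, 1, 0, 3, 10)
square <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 2)
desired_result <- c(NA, NA, NA, NA, NA, 5, 6, 6, 4, NA)

weather <- data.table(rain, square)

# Sum of the current row and the two preceding rows, restarted per square.
# Windows that are incomplete or contain an NA yield NA.
weather[, rain_3d := frollsum(rain, n = 3), by = square]

print(weather)
```

Grouping with by = square keeps the window from reaching back into the previous grid square, which is why the single row of square 2 comes out as NA.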
library(zoo)
library(data.table)
# making the data.table
rain <- c(NA, NA, NA, 0, 0, 5, 1, 0, 3, 10) # rainfall values to work with
square <- c(1,1,1,1,1,1,1,1,1,2) # the geographic grid square for the rainfall measurement
desired_result <- c(NA, NA, NA, NA, NA, 5, 6, 6, 4, NA ) # this is the result I'm looking for (the last NA as we are now on to the first day of the second grid square)
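weather …
I hope you can think of a more elegant way to work out the number of events that occurred on previous days. My code (below) works, but it is not very nice and it is not scalable. I am trying to get to the table at the bottom (desired_table). Any ideas?
I would like to calculate the total number of events on previous days in a more elegant way than this.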
require(data.table)
# simulating an example data.table
date <- c("2000-01-01", "2000-01-04", "2000-01-05", "2000-01-06", "2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06" , "2000-01-01", "2000-01-04", "2000-01-05", "2000-01-06", "2000-01-01", "2000-01-02", "2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06")
cohort <- c("a", "b", "c")
zz <- data.table(DATE = date, COHORT = cohort)
zz$DATE <- as.Date(zz$DATE) # making sure the date is in the correct format
# adding on some other date fields so we can summarise by these days as well
zz$d1 <- zz$DATE +1 # will become …