我有一个类似这里描述的问题,但没有一个我尝试过的解决方案.
给出这样的表:
Date Exercise Category Weight Reps EstMax RepxWeight Note
4/2/16 Deadlift Legs 135 7 166.4685 7x135 easy
4/2/16 Deadlift Legs 135 7 166.4685 7x135 kinda easy
4/2/16 Deadlift Legs 135 7 166.4685 7x135 tired
4/2/16 Bench Press Chest 95 5 110.8175 5x95 hard
4/2/16 Bench Press Chest 135 2 143.991 2x135 not hard
4/9/16 Bench Press Chest 135 2 143.991 2x135 a little hard
4/9/16 Bench Press Chest 135 2 143.991 2x135 super tired
4/18/16 Deadlift Legs 155 8 196.292 8x155 …
4/18/16 Deadlift Legs 155 5 180.8075 5x155 bad day
5/8/16 Deadlift Legs 185 3 203.4815 3x185 good day
5/8/16 Deadlift Legs 185 3 203.4815 3x185 felt easy
5/8/16 Bench Press Chest 115 4 130.318 4x115 easy
5/8/16 Bench Press Chest 115 4 130.318 4x115 hard
Run Code Online (Sandbox Code Playgroud)
我想基于多个其他列(例如和)获取具有某个列(例如)aggregate的max值的行,但是还要保留行中的所有其他列.并且在具有相同最大值的多个条目的情况下,取第一个条目.EstMaxDateExercise
预期的输出看起来像这样:
Date Exercise Category Weight Reps EstMax RepxWeight Note
4/2/16 Deadlift Legs 135 7 166.4685 7x135 easy
4/2/16 Bench Press Chest 135 2 143.991 2x135 not hard
4/9/16 Bench Press Chest 135 2 143.991 2x135 a little hard
4/18/16 Deadlift Legs 155 8 196.292 8x155 …
5/8/16 Deadlift Legs 185 3 203.4815 3x185 good day
5/8/16 Bench Press Chest 115 4 130.318 4x115 hard
Run Code Online (Sandbox Code Playgroud)
我试过的一些方法的例子; 在每种情况下,'额外列'最终都被用作聚合的因素,这不是我想要的.
data <- structure(list(Date = structure(c(2L, 2L, 2L, 2L, 2L, 3L, 3L,
1L, 1L, 4L, 4L, 4L, 4L), .Label = c("4/18/16", "4/2/16", "4/9/16",
"5/8/16"), class = "factor"), Exercise = structure(c(2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L), .Label = c("Bench Press",
"Deadlift"), class = "factor"), Category = structure(c(2L, 2L,
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L), .Label = c("Chest",
"Legs"), class = "factor"), Weight = c(135L, 135L, 135L, 95L,
135L, 135L, 135L, 155L, 155L, 185L, 185L, 115L, 115L), Reps = c(7L,
7L, 7L, 5L, 2L, 2L, 2L, 8L, 5L, 3L, 3L, 4L, 4L), EstMax = c(166.4685,
166.4685, 166.4685, 110.8175, 143.991, 143.991, 143.991, 196.292,
180.8075, 203.4815, 203.4815, 130.318, 130.318), RepxWeight = structure(c(6L,
6L, 6L, 5L, 1L, 1L, 1L, 7L, 4L, 2L, 2L, 3L, 3L), .Label = c("2x135",
"3x185", "4x115", "5x155", "5x95", "7x135", "8x155"), class = "factor"),
Note = structure(c(4L, 8L, 11L, 7L, 9L, 2L, 10L, 1L, 3L,
6L, 5L, 4L, 7L), .Label = c("…", "a little hard", "bad day",
"easy", "felt easy", "good day", "hard", "kinda easy", "not hard",
"super tired", "tired"), class = "factor")), .Names = c("Date",
"Exercise", "Category", "Weight", "Reps", "EstMax", "RepxWeight",
"Note"), class = "data.frame", row.names = c(NA, -13L))
# base R
aggregate(EstMax ~ Date + Exercise, data = data, FUN = max)
# Date Exercise EstMax
# 1 4/2/16 Bench Press 143.9910
# 2 4/9/16 Bench Press 143.9910
# 3 5/8/16 Bench Press 130.3180
# 4 4/18/16 Deadlift 196.2920
# 5 4/2/16 Deadlift 166.4685
# 6 5/8/16 Deadlift 203.4815
aggregate(EstMax ~ Date + Exercise + RepxWeight + Note, data = data, FUN = max)
# Date Exercise RepxWeight Note EstMax
# 1 4/18/16 Deadlift 8x155 … 196.2920
# 2 4/9/16 Bench Press 2x135 a little hard 143.9910
# 3 4/18/16 Deadlift 5x155 bad day 180.8075
# 4 5/8/16 Bench Press 4x115 easy 130.3180
# 5 4/2/16 Deadlift 7x135 easy 166.4685
# 6 5/8/16 Deadlift 3x185 felt easy 203.4815
# 7 5/8/16 Deadlift 3x185 good day 203.4815
# 8 5/8/16 Bench Press 4x115 hard 130.3180
# 9 4/2/16 Bench Press 5x95 hard 110.8175
# 10 4/2/16 Deadlift 7x135 kinda easy 166.4685
# 11 4/2/16 Bench Press 2x135 not hard 143.9910
# 12 4/9/16 Bench Press 2x135 super tired 143.9910
# 13 4/2/16 Deadlift 7x135 tired 166.4685
# data table
library("data.table")
data_dt <- data.table(data)
data_dt[ , max(EstMax), by = c("Date", "Exercise")]
# Date Exercise V1
# 1: 4/2/16 Deadlift 166.4685
# 2: 4/2/16 Bench Press 143.9910
# 3: 4/9/16 Bench Press 143.9910
# 4: 4/18/16 Deadlift 196.2920
# 5: 5/8/16 Deadlift 203.4815
# 6: 5/8/16 Bench Press 130.3180
data_dt[, max(EstMax), .(Date, Exercise, Weight, Reps, RepxWeight, Note)]
# Date Exercise Weight Reps RepxWeight Note V1
# 1: 4/2/16 Deadlift 135 7 7x135 easy 166.4685
# 2: 4/2/16 Deadlift 135 7 7x135 kinda easy 166.4685
# 3: 4/2/16 Deadlift 135 7 7x135 tired 166.4685
# 4: 4/2/16 Bench Press 95 5 5x95 hard 110.8175
# 5: 4/2/16 Bench Press 135 2 2x135 not hard 143.9910
# 6: 4/9/16 Bench Press 135 2 2x135 a little hard 143.9910
# 7: 4/9/16 Bench Press 135 2 2x135 super tired 143.9910
# 8: 4/18/16 Deadlift 155 8 8x155 … 196.2920
# 9: 4/18/16 Deadlift 155 5 5x155 bad day 180.8075
# 10: 5/8/16 Deadlift 185 3 3x185 good day 203.4815
# 11: 5/8/16 Deadlift 185 3 3x185 felt easy 203.4815
# 12: 5/8/16 Bench Press 115 4 4x115 easy 130.3180
# 13: 5/8/16 Bench Press 115 4 4x115 hard 130.3180
Run Code Online (Sandbox Code Playgroud)
特别喜欢碱R溶液.还看到了which.max()可能有用的功能,但无法弄清楚如何将其应用于此.
我看过的其他相关问题却没有解决这个问题:
我知道您寻求基础R解决方案,但与此同时,这是dplyr一个:
library(dplyr)
data %>%
group_by(Date, Exercise) %>%
slice(which.max(EstMax))
# # A tibble: 6 x 8
# # Groups: Date, Exercise [6]
# Date Exercise Category Weight Reps EstMax RepxWeight Note
# <fctr> <fctr> <fctr> <int> <int> <dbl> <fctr> <fctr>
# 1 4/18/16 Deadlift Legs 155 8 196.2920 8x155 …
# 2 4/2/16 Bench Press Chest 135 2 143.9910 2x135 not hard
# 3 4/2/16 Deadlift Legs 135 7 166.4685 7x135 easy
# 4 4/9/16 Bench Press Chest 135 2 143.9910 2x135 a little hard
# 5 5/8/16 Bench Press Chest 115 4 130.3180 4x115 easy
# 6 5/8/16 Deadlift Legs 185 3 203.4815 3x185 good day
Run Code Online (Sandbox Code Playgroud)
编辑
data.table不是我的强项,但为了完整起见,这是我的尝试:
library(data.table)
setDT(data)[, .SD[which.max(EstMax)], by = .(Date, Exercise)]
# Date Exercise Category Weight Reps EstMax RepxWeight Note
# 1: 4/2/16 Deadlift Legs 135 7 166.4685 7x135 easy
# 2: 4/2/16 Bench Press Chest 135 2 143.9910 2x135 not hard
# 3: 4/9/16 Bench Press Chest 135 2 143.9910 2x135 a little hard
# 4: 4/18/16 Deadlift Legs 155 8 196.2920 8x155 …
# 5: 5/8/16 Deadlift Legs 185 3 203.4815 3x185 good day
# 6: 5/8/16 Bench Press Chest 115 4 130.3180 4x115 easy
Run Code Online (Sandbox Code Playgroud)