Ian*_*.T 5 r dplyr tidyr tidyverse data-wrangling
我正在努力寻找一个简单快速的解决方案,通过将它们之间的所有“组”列相乘来创建一个新的数据框。
数据为例
a1 <- rnorm(n = 10)
b1 <- rnorm(n = 10)
c1 <- rnorm(n = 10)
a2 <- rnorm(n = 10)
b2 <- rnorm(n = 10)
c2 <- rnorm(n = 10)
Run Code Online (Sandbox Code Playgroud)
例如在我的初始数据表中
Original <- data.frame(
date = seq(today()-9, today(), by = 1),
a1 = a1,
b1 = b1,
c1 = c1,
a2 = a2,
b2 = b2,
c2 = c2)
Run Code Online (Sandbox Code Playgroud)
这个数据表是我想要实现的(即,以 1 结尾的列之间的所有可能组合的列和以 2 结尾的列之间的所有可能组合的列)
Objective <- data.frame(
date = seq(today()-9, today(), by = 1),
b1a1 = b1*a1,
c1a1 = c1*a1,
c1b1 = c1*b1,
b2c2 = b2*c2,
b2a2 = b2*a2,
c2a2 = c2*a2)
Run Code Online (Sandbox Code Playgroud)
我尝试使用循环,但这不是一个非常优雅和有效的解决方案;或者至少我的不是。非常欢迎使用 tidyverse 的解决方案
提前致谢
它
非常好的问题。一种tidyverse方法。这种方法将每组的列数奇数组合。解释 -
pivot_longergsub。str_replace你也可以用。dplyr::group_splittidyr::pivot_wider使用inside purrr::mapnow将所有项目中的数据重新整形回其原始形式combn和Reduce。您也可以purrr::reduce在这里使用combn其次使用相同和生成的新列的名称Reducepurrr::reduce与dplyr::left_join列表结合使用将转换回预期形状set.seed(123)
a1 <- rnorm(n = 10)
b1 <- rnorm(n = 10)
c1 <- rnorm(n = 10)
a2 <- rnorm(n = 10)
b2 <- rnorm(n = 10)
c2 <- rnorm(n = 10)
Original <- data.frame(
date = seq(Sys.Date()-9, Sys.Date(), by = 1),
a1 = a1,
b1 = b1,
c1 = c1,
a2 = a2,
b2 = b2,
c2 = c2)
library(tidyverse)
Original %>% pivot_longer(!date) %>%
mutate(grp = gsub('^\\D*(\\d)+$', '\\1', name)) %>%
group_split(grp, .keep = F) %>%
map(~ .x %>% pivot_wider(names_from = name, values_from = value)) %>%
map(~ combn(.x[-1], 2, FUN = Reduce, f = `*`) %>% as.data.frame() %>%
setNames(combn(names(.x[-1]), 2, FUN = Reduce, f = paste0)) %>% cbind(.x[1], .)) %>%
reduce(~left_join(.x, .y, by = 'date'))
date a1b1 a1c1 b1c1 a2b2 a2c2 b2c2
1 2021-05-28 -0.68606804 0.59848918 -1.30710356 -0.29626767 0.108031283 -0.175982140
2 2021-05-29 -0.08282104 0.05017292 -0.07843039 0.06135046 0.008423333 0.005935364
3 2021-05-30 0.62468579 -1.59924166 -0.41119329 -1.13268875 -0.038374446 0.054248120
4 2021-05-31 0.00780406 -0.05139295 -0.08067566 1.90463287 1.201815497 2.968438088
5 2021-06-01 -0.07186344 -0.08080991 0.34742254 0.99243873 -0.185489171 -0.272722771
6 2021-06-02 3.06467216 -2.89278864 -3.01397443 -0.77341778 1.044302702 -1.703161152
7 2021-06-03 0.22946735 0.38614963 0.41709268 -0.22316502 -0.857881519 0.623969018
8 2021-06-04 2.48789113 -0.19402639 -0.30162620 0.02889143 -0.036194437 -0.272813136
9 2021-06-05 -0.48172830 0.78173260 -0.79823906 -0.23864021 -0.037894774 0.096601990
10 2021-06-06 0.21070515 -0.55877763 -0.59279292 0.03171951 -0.082159505 -0.018002847
Run Code Online (Sandbox Code Playgroud)
检查此扩展数据集
set.seed(123)
a1 <- rnorm(n = 10)
b1 <- rnorm(n = 10)
c1 <- rnorm(n = 10)
a2 <- rnorm(n = 10)
b2 <- rnorm(n = 10)
c2 <- rnorm(n = 10)
d2 <- rnorm(n = 10)
Original <- data.frame(
date = seq(Sys.Date()-9, Sys.Date(), by = 1),
a1 = a1,
b1 = b1,
c1 = c1,
a2 = a2,
b2 = b2,
c2 = c2,
d2 = d2)
library(tidyverse)
Original %>% pivot_longer(!date) %>%
mutate(grp = gsub('^\\D*(\\d)+$', '\\1', name)) %>%
group_split(grp, .keep = F) %>%
map(~ .x %>% pivot_wider(names_from = name, values_from = value)) %>%
map(~ combn(.x[-1], 2, FUN = Reduce, f = `*`) %>% as.data.frame() %>%
setNames(combn(names(.x[-1]), 2, FUN = Reduce, f = paste0)) %>% cbind(.x[1], .)) %>%
reduce(~left_join(.x, .y, by = 'date'))
date a1b1 a1c1 b1c1 a2b2 a2c2 a2d2 b2c2 b2d2 c2d2
1 2021-05-28 -0.68606804 0.59848918 -1.30710356 -0.29626767 0.108031283 0.161902656 -0.175982140 -0.26373820 0.09616971
2 2021-05-29 -0.08282104 0.05017292 -0.07843039 0.06135046 0.008423333 0.148221326 0.005935364 0.10444173 0.01433970
3 2021-05-30 0.62468579 -1.59924166 -0.41119329 -1.13268875 -0.038374446 -0.298262480 0.054248120 0.42163941 0.01428475
4 2021-05-31 0.00780406 -0.05139295 -0.08067566 1.90463287 1.201815497 -0.894445153 2.968438088 -2.20924515 -1.39402460
5 2021-06-01 -0.07186344 -0.08080991 0.34742254 0.99243873 -0.185489171 -0.880563395 -0.272722771 -1.29468307 0.24197936
6 2021-06-02 3.06467216 -2.89278864 -3.01397443 -0.77341778 1.044302702 0.209022041 -1.703161152 -0.34089562 0.46029226
7 2021-06-03 0.22946735 0.38614963 0.41709268 -0.22316502 -0.857881519 0.248271309 0.623969018 -0.18057692 -0.69416615
8 2021-06-04 2.48789113 -0.19402639 -0.30162620 0.02889143 -0.036194437 -0.003281582 -0.272813136 -0.02473471 0.03098700
9 2021-06-05 -0.48172830 0.78173260 -0.79823906 -0.23864021 -0.037894774 -0.282179411 0.096601990 0.71933645 0.11422674
10 2021-06-06 0.21070515 -0.55877763 -0.59279292 0.03171951 -0.082159505 -0.779997773 -0.018002847 -0.17091365 0.44269850
Run Code Online (Sandbox Code Playgroud)
由reprex 包(v2.0.0)创建于 2021-06-06
| 归档时间: |
|
| 查看次数: |
114 次 |
| 最近记录: |