How to sum values based on the categories of other variables in R?

san*_*j00 4 r dataframe

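I have a dataset that shows the religion of Party A and Party B in country X, along with the percentage of adherents of each religion in each country.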

df <- data.frame(
  PartyA = c("Christian","Muslim","Muslim","Jewish","Sikh"),
  PartyB = c("Jewish","Muslim","Christian","Muslim","Buddhist"),
  ChristianPop = c(12,1,74,14,17),
  MuslimPop = c(71,93,5,86,13),
  JewishPop = c(9,2,12,0,4),
  SikhPop = c(0,0,1,0,10),
  BuddhistPop = c(1,0,2,0,45)
)
#      PartyA    PartyB ChristianPop MuslimPop JewishPop SikhPop BuddhistPop
# 1 Christian    Jewish           12        71         9       0           1
# 2    Muslim    Muslim            1        93         2       0           0
# 3    Muslim Christian           74         5        12       1           2
# 4    Jewish    Muslim           14        86         0       0           0
# 5      Sikh  Buddhist           17        13         4      10          45

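From this, I want to add together the population shares of the religions that are "involved". So the first row would get a variable equal to 12 + 9, the second row just 93 (nothing is added, because Party A and Party B are the same), and so on.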

#      PartyA    PartyB ChristianPop MuslimPop JewishPop SikhPop BuddhistPop PartyRel
# 1 Christian    Jewish           12        71         9       0           1       21
# 2    Muslim    Muslim            1        93         2       0           0       93
# 3    Muslim Christian           74         5        12       1           2       79
# 4    Jewish    Muslim           14        86         0       0           0       86
# 5      Sikh  Buddhist           17        13         4      10          45       55

I'm struggling to even find where to start, so any help would be much appreciated.

ben*_*n23 7

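We can iterate over the rows with sapply and paste the string "Pop" onto each Party value to get the name of the column to index. If both parties share a religion, that column is counted once; otherwise the two columns are summed.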

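df$PartyRel <- sapply(
  seq_len(nrow(df)),  # avoids the 1:0 pitfall of 1:nrow(df) on an empty data frame
  \(x) {
    a <- df[x, paste0(df$PartyA[x], "Pop")]  # population share of PartyA's religion
    b <- df[x, paste0(df$PartyB[x], "Pop")]  # population share of PartyB's religion
    if (df$PartyA[x] == df$PartyB[x]) a else a + b  # count a shared religion once
  }
)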
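If the data frame is large, the row-by-row loop above can be avoided. The following is only a sketch of one vectorized alternative in base R, not part of the original answer: it assumes every value in PartyA and PartyB has a matching "<value>Pop" column, and the helper names pop, rows, a and b are made up for illustration. It uses matrix indexing with a two-column (row, column) index.

```r
# Rebuild the example data from the question.
df <- data.frame(
  PartyA = c("Christian", "Muslim", "Muslim", "Jewish", "Sikh"),
  PartyB = c("Jewish", "Muslim", "Christian", "Muslim", "Buddhist"),
  ChristianPop = c(12, 1, 74, 14, 17),
  MuslimPop = c(71, 93, 5, 86, 13),
  JewishPop = c(9, 2, 12, 0, 4),
  SikhPop = c(0, 0, 1, 0, 10),
  BuddhistPop = c(1, 0, 2, 0, 45),
  stringsAsFactors = FALSE  # keep the Party columns as character on older R
)

# Numeric matrix holding only the "...Pop" columns.
pop <- as.matrix(df[grep("Pop$", names(df))])
rows <- seq_len(nrow(df))

# A two-column index matrix picks one (row, column) cell per row.
a <- pop[cbind(rows, match(paste0(df$PartyA, "Pop"), colnames(pop)))]
b <- pop[cbind(rows, match(paste0(df$PartyB, "Pop"), colnames(pop)))]

# Count a shared religion once, otherwise add both shares.
df$PartyRel <- ifelse(df$PartyA == df$PartyB, a, a + b)
df$PartyRel
# [1] 21 93 79 86 55
```

Here match() returns NA for a party whose "Pop" column does not exist, so such rows would end up as NA rather than raising an error.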

The same idea as the base R solution above, but using map2 from the purrr package in tidyverse style. rowwise() is needed so that get() returns the current row's value rather than the whole column.

library(tidyverse)

df %>% 
  rowwise() %>% 
  mutate(PartyRel = map2_dbl(PartyA, PartyB,  # map2_dbl because the Pop columns are doubles
                             ~ifelse(.x == .y, 
                                     get(paste0(.x, "Pop")), 
                                     get(paste0(.x, "Pop")) + get(paste0(.y, "Pop"))))) %>% 
  ungroup()
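If you would rather not look up columns by name with get(), another option is to reshape to long format first. This is only a sketch under the same assumption that each Party value has a matching "<value>Pop" column; the column names row, religion and pop and the object totals are made up for illustration and are not from the original answer.

```r
library(dplyr)
library(tidyr)

# Rebuild the example data from the question.
df <- data.frame(
  PartyA = c("Christian", "Muslim", "Muslim", "Jewish", "Sikh"),
  PartyB = c("Jewish", "Muslim", "Christian", "Muslim", "Buddhist"),
  ChristianPop = c(12, 1, 74, 14, 17),
  MuslimPop = c(71, 93, 5, 86, 13),
  JewishPop = c(9, 2, 12, 0, 4),
  SikhPop = c(0, 0, 1, 0, 10),
  BuddhistPop = c(1, 0, 2, 0, 45),
  stringsAsFactors = FALSE
)

totals <- df %>%
  mutate(row = row_number()) %>%                  # remember the original row
  pivot_longer(ends_with("Pop"),
               names_to = "religion", values_to = "pop") %>%
  group_by(row) %>%
  # Keep only the religions of the two parties; %in% counts a shared one once.
  summarise(PartyRel = sum(pop[religion %in% paste0(c(PartyA, PartyB), "Pop")]))

df$PartyRel <- totals$PartyRel
df$PartyRel
# [1] 21 93 79 86 55
```

Because the filter uses %in%, a row where PartyA and PartyB are the same needs no special case.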

Output

Both of the above give the same result:

df
     PartyA    PartyB ChristianPop MuslimPop JewishPop SikhPop BuddhistPop PartyRel
1 Christian    Jewish           12        71         9       0           1       21
2    Muslim    Muslim            1        93         2       0           0       93
3    Muslim Christian           74         5        12       1           2       79
4    Jewish    Muslim           14        86         0       0           0       86
5      Sikh  Buddhist           17        13         4      10          45       55