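I have a dataset of paired data (members of the same household).
idno is the individual identifier, and householdid is the household identifier that both partners share.
For each idno, I need to add an extra column containing the occupation of his or her partner.
My data looks like this: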
dta = rbind( c(1013661,101366, 'Never worked'),
c(1013662, 101366, 'Intermediate occs'),
c(1037552, 103755, 'Managerial & professional occs'),
c(1037551, 103755, 'Intermediate occs')
)
colnames(dta) = c('idno', 'householdid', 'occup')
dta
idno householdid occup
"1013661" "101366" "Never worked"
"1013662" "101366" "Intermediate occs"
"1037552" "103755" "Managerial & professional occs"
"1037551" "103755" "Intermediate occs"
What I need should look like this:
idno householdid occup occupPartner
"1013661" "101366" "Never worked" "Intermediate occs"
"1013662" "101366" "Intermediate occs" "Never worked"
"1037552" "103755" "Managerial & professional occs" "Intermediate occs"
"1037551" "103755" "Intermediate occs" "Managerial & professional occs"
I guess there is a solution with mutate, but I am not sure what the group_by should be.
Any ideas?
Try:
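library(dplyr)
# Within each household, reverse the occupations so each person gets the partner's value
# (this relies on every household having exactly two rows)
dta1 <- as.data.frame(dta) %>%
  group_by(householdid) %>%
  mutate(occupPartner = rev(occup))
as.data.frame(dta1)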
# idno householdid occup
#1 1013661 101366 Never worked
#2 1013662 101366 Intermediate occs
#3 1037552 103755 Managerial & professional occs
#4 1037551 103755 Intermediate occs
# occupPartner
#1 Intermediate occs
#2 Never worked
#3 Intermediate occs
#4 Managerial & professional occs
If the data are already ordered so that partners sit on adjacent rows:
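The rev() approach above depends on each household having exactly two rows. As a rough sketch of a guarded variant, the following swaps in base R's ave() instead of dplyr and returns NA for any household that does not have exactly two members. The fifth row (idno 1040001, household 104000) is a hypothetical single-person household added only to illustrate the guard; it is not part of the original data.

```r
# Example data from the question, plus one hypothetical single-person household
dta <- data.frame(
  idno = c(1013661, 1013662, 1037552, 1037551, 1040001),
  householdid = c(101366, 101366, 103755, 103755, 104000),
  occup = c("Never worked", "Intermediate occs",
            "Managerial & professional occs", "Intermediate occs",
            "Never worked"),
  stringsAsFactors = FALSE
)

# Reverse occupations within a household only when it has exactly two members;
# otherwise fill with NA rather than silently producing a misleading value
swap_pair <- function(x) {
  if (length(x) == 2) rev(x) else rep(NA_character_, length(x))
}

dta$occupPartner <- ave(dta$occup, dta$householdid, FUN = swap_pair)
dta
```

The same guard could be written inside mutate() with n() == 2 if staying in dplyr is preferred.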
# Swap each adjacent pair of row indices: 2, 1, 4, 3, ...
indx <- c(rbind(seq(2, nrow(dta), by = 2), seq(1, nrow(dta), by = 2)))
cbind(dta, occupPartner = dta[, 3][indx])
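The index-swap approach only works when partners occupy adjacent rows. If that is not guaranteed, one possible sketch is to sort by householdid first and check the pairing assumption before swapping. The row order below is deliberately shuffled for illustration, and the names sorted and res are placeholders rather than anything from the original answer.

```r
# Same records as in the question, but in a shuffled row order
dta <- rbind(
  c(1037552, 103755, "Managerial & professional occs"),
  c(1013661, 101366, "Never worked"),
  c(1037551, 103755, "Intermediate occs"),
  c(1013662, 101366, "Intermediate occs")
)
colnames(dta) <- c("idno", "householdid", "occup")

# Sort so that the two members of each household end up on adjacent rows
sorted <- dta[order(dta[, "householdid"]), , drop = FALSE]

# The swap below assumes every household has exactly two rows; stop otherwise
stopifnot(all(table(sorted[, "householdid"]) == 2))

# Swap each adjacent pair of row indices: 2, 1, 4, 3, ...
indx <- c(rbind(seq(2, nrow(sorted), by = 2), seq(1, nrow(sorted), by = 2)))
res <- cbind(sorted, occupPartner = sorted[indx, "occup"])
res
```

Note that the result is in sorted order, not the original row order.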