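I have a dataset with three columns. Each user has two action columns, action1 and action2. action2 contains information only when action1 has A data. I want to combine the P values in action1 with action2. For example, if action2 has ac and a following row has P in action1, I want that P to become Pac, and this should continue (every P becomes Pac) until action2 changes. Note that this should be done separately for each user.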
df<-read.table(text="
user action1 action2
1 A a
1 B NA
1 P NA
1 P NA
1 A ac
1 P NA
2 B NA
2 P NA
2 A aa
2 P NA
2 AB aa",header=T)
Desired result (I marked the affected rows with <-):
user action1 action2
1 A a
1 B NA
1 Pa NA <-
1 Pa NA <-
1 A ac
1 Pac NA <-
2 B NA
2 P NA
2 A aa
2 Paa NA <-
2 AB NA
Thanks.
library('data.table')
library('zoo')
# Using zoo::na.locf(), fill each NA in action2 with the previous non-NA value, grouped by user.
# na.rm = FALSE keeps leading NAs, so the result has the same length as the group.
setDT(df)[, V3 := na.locf(action2, na.rm = FALSE), by = .(user)]
# Append V3 to action1 where action1 is exactly 'P' and V3 is not NA.
df[action1 == 'P' & !(is.na(V3)), action1 := paste0(action1, V3)]
df[, V3 := NULL] # remove the temporary V3 column
df
# user action1 action2
# 1: 1 A a
# 2: 1 B NA
# 3: 1 Pa NA
# 4: 1 Pa NA
# 5: 1 A ac
# 6: 1 Pac NA
# 7: 2 B NA
# 8: 2 P NA
# 9: 2 A aa
# 10: 2 Paa NA
# 11: 2 AB aa
Data:
df<-read.table(text="
user action1 action2
1 A a
1 B NA
1 P NA
1 P NA
1 A ac
1 P NA
2 B NA
2 P NA
2 A aa
2 P NA
2 AB aa",header=T, stringsAsFactors = FALSE)
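If you would rather not load data.table and zoo, the same idea (fill action2 forward within each user, then paste it onto the P rows) can be sketched in base R. This is only an illustration under assumptions: fill_down is a hypothetical helper written for this example, not a function from any package, and the data frame is rebuilt from the data above so the snippet runs on its own.

```r
# Rebuild the example data as a plain data.frame with character columns.
df <- read.table(text = "
user action1 action2
1 A a
1 B NA
1 P NA
1 P NA
1 A ac
1 P NA
2 B NA
2 P NA
2 A aa
2 P NA
2 AB aa", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: carry the last non-NA value forward.
# Leading NAs stay NA, like na.locf(x, na.rm = FALSE).
fill_down <- function(x) {
  idx <- cumsum(!is.na(x))       # how many non-NA values seen so far (0 if none)
  c(NA, x[!is.na(x)])[idx + 1]   # slot 1 is NA, so idx == 0 maps to NA
}

# Fill action2 within each user; ave() keeps the original row order.
filled <- ave(df$action2, df$user, FUN = fill_down)

# Append the filled value only where action1 is exactly "P" and a value exists.
is_p <- df$action1 == "P" & !is.na(filled)
df$action1[is_p] <- paste0(df$action1[is_p], filled[is_p])
df
```

Unlike the data.table version, this does not modify df by reference and needs no temporary column, since the filled values live in a separate vector.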