pan*_*tts 8 merge r match no-match dataframe
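I have a larger existing data frame. For this smaller example, I want to replace some values based on the "first" column (replacing state in df1 with newstate from df2). My problem is that values come back as NA, because only some of the names have a match in the new data frame (df2).
Existing data frame: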
state = c("CA","WA","OR","AZ")
first = c("Jim","Mick","Paul","Ron")
df1 <- data.frame(first, state)
first state
1 Jim CA
2 Mick WA
3 Paul OR
4 Ron AZ
New data frame to match against the existing one:
state = c("CA","WA")
newstate = c("TX", "LA")
first =c("Jim","Mick")
df2 <- data.frame(first, state, newstate)
first state newstate
1 Jim CA TX
2 Mick WA LA
I tried using match, but it returns NA for "state" wherever the "first" value in the original data frame has no match in df2.
df1$state <- df2$newstate[match(df1$first, df2$first)]
first state
1 Jim TX
2 Mick LA
3 Paul <NA>
4 Ron <NA>
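The NA values come from how match and subsetting interact. The following is a minimal sketch on plain vectors rather than the data frames; the names first_existing and first_new are made up for illustration and do not appear in the question.

```r
# Hypothetical stand-ins for df1$first, df2$first and df2$newstate.
first_existing <- c("Jim", "Mick", "Paul", "Ron")
first_new <- c("Jim", "Mick")
newstate <- c("TX", "LA")

# match() gives the position of each existing name in first_new,
# or NA when the name is absent.
idx <- match(first_existing, first_new)
print(idx)

# Subsetting with an NA index yields NA, and that NA is what ends up
# overwriting the old state for Paul and Ron.
print(newstate[idx])
```

So the assignment replaces all four rows, including the two for which there is nothing to replace them with.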
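Is there a way to ignore the non-matches, or to have a non-match return the existing value as is? This is an example of the desired result: the states for Jim and Mick are updated, while the states for Paul and Ron do not change.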
first state
1 Jim TX
2 Mick LA
3 Paul OR
4 Ron AZ
小智 9
Is this what you want? Unless you really want to use factors, use stringsAsFactors = FALSE in your data.frame calls. Note the use of nomatch = 0 in the match call.
> state = c("CA","WA","OR","AZ")
> first = c("Jim","Mick","Paul","Ron")
> df1 <- data.frame(first, state, stringsAsFactors = FALSE)
> state = c("CA","WA")
> newstate = c("TX", "LA")
> first =c("Jim","Mick")
> df2 <- data.frame(first, state, newstate, stringsAsFactors = FALSE)
> df1
first state
1 Jim CA
2 Mick WA
3 Paul OR
4 Ron AZ
> df2
first state newstate
1 Jim CA TX
2 Mick WA LA
>
> # create an index for the matches
> indx <- match(df1$first, df2$first, nomatch = 0)
> df1$state[indx != 0] <- df2$newstate[indx]
> df1
first state
1 Jim TX
2 Mick LA
3 Paul OR
4 Ron AZ
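Why this works may not be obvious: an index of 0 selects nothing in R, so df2$newstate[indx] contains only the replacements for matched rows, and indx != 0 picks exactly those rows on the left-hand side. The sketch below shows that, followed by an equivalent form that keeps the default NA result and masks it explicitly. The names hit and updated are illustrative assumptions, not part of the answer above.

```r
df1 <- data.frame(first = c("Jim", "Mick", "Paul", "Ron"),
                  state = c("CA", "WA", "OR", "AZ"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(first = c("Jim", "Mick"),
                  state = c("CA", "WA"),
                  newstate = c("TX", "LA"),
                  stringsAsFactors = FALSE)

# With nomatch = 0, unmatched names map to index 0, and a 0 index
# selects nothing, so only the matched replacements come back.
indx <- match(df1$first, df2$first, nomatch = 0)
print(df2$newstate[indx])

# Equivalent form: keep the default NA for non-matches and
# update only the rows where a match was found.
idx <- match(df1$first, df2$first)
hit <- !is.na(idx)
updated <- df1$state
updated[hit] <- df2$newstate[idx[hit]]
print(updated)
```

Both forms leave unmatched rows untouched. Which one reads better is a matter of taste.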