I have this data frame:
d              f
"first tweet"  A
"second tweet" B
"thrid tweet"  C
I would like to get this:
d              A B C
"first tweet"  1 0 0
"second tweet" 0 1 0
"thrid tweet"  0 0 1
Thanks!
Here are a few options to consider:
model.matrix
cbind(mydf, model.matrix(~ 0 + f, data = mydf))
#              d f fA fB fC
# 1  first tweet A  1  0  0
# 2 second tweet B  0  1  0
# 3  thrid tweet C  0  0  1
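Note that `model.matrix` prefixes each indicator column with the variable name (`fA`, `fB`, `fC`), while the desired output uses plain `A`, `B`, `C`. A minimal sketch of one way to strip the prefix, assuming `mydf` is built as below (the question does not show how it was created, so the construction and the name `dummies` are assumptions for illustration):

```r
# Assumed reconstruction of the data frame from the question
mydf <- data.frame(
  d = c("first tweet", "second tweet", "thrid tweet"),
  f = c("A", "B", "C"),
  stringsAsFactors = TRUE
)

# Build the indicator matrix without an intercept column
dummies <- model.matrix(~ 0 + f, data = mydf)

# Drop the leading "f" that model.matrix prepends to each level name
colnames(dummies) <- sub("^f", "", colnames(dummies))

# Keep only "d" alongside the indicator columns
result <- cbind(mydf["d"], dummies)
result
```

The pattern `"^f"` is tied to the column being named `f`. For a differently named column, the prefix to remove would change accordingly.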
table
cbind(mydf, as.data.frame.matrix(table(sequence(nrow(mydf)), mydf$f)))
#              d f A B C
# 1  first tweet A 1 0 0
# 2 second tweet B 0 1 0
# 3  thrid tweet C 0 0 1
dcast from "reshape2"
library(reshape2)
dcast(mydf, d ~ f, value.var="f", fun.aggregate=length)
#              d A B C
# 1  first tweet 1 0 0
# 2 second tweet 0 1 0
# 3  thrid tweet 0 0 1
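Note that the first two options behave differently from the third. If column "d" happens to contain duplicate values, the third option will collapse (and tabulate) them into a single row per distinct value, while the first two options keep one output row per input row.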
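To make that difference concrete, here is a small sketch with made-up data in which "first tweet" appears twice. The data frame `dup_df` and the names `by_row` and `collapsed` are hypothetical and not part of the original question. The `dcast` step runs only if "reshape2" is installed.

```r
# Hypothetical data: "first tweet" occurs twice, with different labels
dup_df <- data.frame(
  d = c("first tweet", "second tweet", "first tweet"),
  f = c("A", "B", "C"),
  stringsAsFactors = TRUE
)

# table approach: one output row per input row
by_row <- cbind(
  dup_df,
  as.data.frame.matrix(table(sequence(nrow(dup_df)), dup_df$f))
)
nrow(by_row)  # 3 rows, the duplicate "d" is kept as two separate rows

# dcast approach: one output row per distinct value of "d"
if (requireNamespace("reshape2", quietly = TRUE)) {
  collapsed <- reshape2::dcast(dup_df, d ~ f,
                               value.var = "f", fun.aggregate = length)
  print(collapsed)  # "first tweet" gets a count in both A and C
}
```

Which behaviour is preferable depends on whether "d" is meant to identify rows uniquely. If it is not, the row-wise options avoid silently merging records.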