Mar*_*lov 2 csv parsing split r
I have some data from a poll that looks like this:
Freetime_activities
1 Travelling, On the PC, Clubbing
2 Sports, On the PC, Clubbing
3 Clubbing
4 On the PC
5 Travelling, On the PC, Clubbing
6 On the PC
7 Watching TV, Travelling
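I want to get the count of each value (for example, how many times "On the PC" occurs), but I am having trouble splitting the values. Is there a function in R that can do something like: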
split("A,B,C") ->
1 A
2 B
3 C
Or is there a direct way to count the values straight from the column?
We can split the column with strsplit on the delimiter (", "), unlist the resulting list, and then use table to get the frequencies.
tbl <- table(unlist(strsplit(as.character(df1$Freetime_activities),
", ")))
as.data.frame(tbl)
#         Var1 Freq
#1    Clubbing    4
#2   On the PC    5
#3      Sports    1
#4  Travelling    3
#5 Watching TV    1
Note: as.character is used here in case the column is a factor, because strsplit only works on character vectors.
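To make the note concrete, here is a minimal sketch showing that strsplit rejects a factor but accepts the same values after as.character. It also covers the split("A,B,C") case from the question. The names f, res, parts and counts, and the sample values, are made up for illustration and are not part of the poll data.

```r
# Hypothetical example values, not taken from the poll data
f <- factor(c("A,B,C", "B,C"))

# strsplit() on a factor raises an error ("non-character argument");
# tryCatch() turns that into a marker value so the script keeps running
res <- tryCatch(strsplit(f, ","), error = function(e) "error")

# After conversion, strsplit() returns one character vector per element
parts <- strsplit(as.character(f), ",")
parts[[1]]

# Flatten the list and count each value
counts <- table(unlist(parts))
counts
```

Because strsplit returns a list with one entry per input string, unlist is what turns it into a single vector that table can count.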
Another option is to use scan to extract the elements and then table to get the frequencies.
table(trimws(scan(text = as.character(df1$Freetime_activities),
what = "", sep = ",")))
Or use read.table together with unlist and table.
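# na.strings = "" turns the blank cells created by fill = TRUE into NA,
# which table() drops by default
table(unlist(read.table(text = as.character(df1$Freetime_activities),
    sep = ",", fill = TRUE, strip.white = TRUE, na.strings = "")))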
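As a quick sanity check, the three approaches can be run side by side. The sketch below rebuilds the poll data so that it runs on its own, confirms that the counts agree, and sorts them by frequency. The names x, t1, t2, t3, same and sorted are illustrative, and na.strings = "" is an assumption added here so that the blank cells created by fill = TRUE are not counted.

```r
# Same poll data as in the question, rebuilt so the sketch is self-contained
df1 <- data.frame(
  Freetime_activities = c(
    "Travelling, On the PC, Clubbing", "Sports, On the PC, Clubbing",
    "Clubbing", "On the PC", "Travelling, On the PC, Clubbing",
    "On the PC", "Watching TV, Travelling"
  ),
  stringsAsFactors = FALSE
)
x <- as.character(df1$Freetime_activities)

# 1. strsplit + unlist
t1 <- table(unlist(strsplit(x, ", ")))

# 2. scan, trimming the space left after each comma
t2 <- table(trimws(scan(text = x, what = "", sep = ",", quiet = TRUE)))

# 3. read.table; blank padding cells become NA and are dropped by table()
t3 <- table(unlist(read.table(text = x, sep = ",", fill = TRUE,
                              strip.white = TRUE, na.strings = "",
                              stringsAsFactors = FALSE)))

# All three should agree on labels and counts
same <- identical(names(t1), names(t2)) && identical(names(t1), names(t3)) &&
  all(t1 == t2) && all(t1 == t3)

# Most frequent activity first
sorted <- sort(t1, decreasing = TRUE)
sorted
```

Which variant to prefer is mostly a matter of taste. strsplit is the most direct when the delimiter is consistent, while scan and read.table cope with irregular spacing around the commas.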
Edit: based on @David Arenburg's comment.
df1 <- structure(list(Freetime_activities = c(
    "Travelling, On the PC, Clubbing",
    "Sports, On the PC, Clubbing", "Clubbing", "On the PC",
    "Travelling, On the PC, Clubbing",
    "On the PC", "Watching TV, Travelling")),
    .Names = "Freetime_activities",
    class = "data.frame",
    row.names = c("1", "2", "3", "4", "5", "6", "7"))