mar*_*ary 3 split r dataframe tm
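For a text mining project, I have to examine how a word list develops over time. To do that, I need to split the row names so that one column holds the company name and another holds the year. Here is an excerpt from my data frame: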
abs access allow analysis application approach base big business challenge company
Adidas_2010.txt 13 25 26 11 41 132 1 266 13 115 1
Adidas_2011.txt 1 3 1 0 0 8 0 11 2 10 0
Adidas_2012.txt 29 35 37 22 110 181 7 384 31 136 3
Adidas_2013.txt 28 47 38 32 180 184 4 451 30 129 3
Adidas_2014.txt 12 42 38 27 159 207 6 921 32 128 6
Adidas_2016.txt 30 47 50 47 162 251 9 1061 32 171 13
Nike_2009.txt 16 15 17 12 33 177 9 346 93 196 1
Nike_2011.txt 10 30 0 3 0 0 0 81 7 31 0
Nike_2012.txt 21 22 12 57 199 300 7 214 11 107 3
Nike_2013.txt 20 32 30 11 123 321 4 331 90 239 3
Nike_2014.txt 33 43 30 33 119 137 6 441 67 318 6
Nike_2015.txt 51 42 41 27 102 151 9 1061 32 221 13
This is my code:
dtm <- DocumentTermMatrix(corpus, control=list(dictionary = word_list))
df1 <- data.frame(as.matrix(dtm), row.names = filenames_annualreports)
I tried this:
names_plus_year <- rownames(df1)
names_plus_year_split <- strsplit(names_plus_year, "_")
rownames(df1) <- sapply(names_plus_year_split, "[", 1)
But I get the following error:
Error in `.rowNamesDF<-`(x, value = value) :
double 'row.names' not allowed
Is there another way to split the row names? Thanks a lot! :)
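A likely cause of the error: data frame row names must be unique, and once the year is dropped several rows would share the same name (Adidas or Nike). The following is a minimal sketch that reproduces the situation; `df_demo` is a hypothetical two-row stand-in, not the real data frame from the question.

```r
# Hypothetical stand-in: two file names and their "abs" counts from the excerpt
df_demo <- data.frame(
  abs = c(13, 1),
  row.names = c("Adidas_2010.txt", "Adidas_2011.txt")
)

# Keep only the company part of each row name
company <- sapply(strsplit(rownames(df_demo), "_", fixed = TRUE), "[", 1)

# Both rows would now be called "Adidas"
has_duplicates <- any(duplicated(company))

# Assigning non-unique row names raises an error; capture its message
result <- suppressWarnings(tryCatch(
  {
    rownames(df_demo) <- company
    "ok"
  },
  error = function(e) conditionMessage(e)
))
```

Because the assignment fails, `df_demo` keeps its original row names. Storing the company and the year in ordinary columns avoids the problem, since columns may contain repeated values.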
Hi Mary, using @Sotos's data:
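library(tidyverse)

new_df <- df %>%
  # Move the row names into a regular column
  rownames_to_column(var = "row_name") %>%
  # Split "Adidas_2010.txt" into "Adidas" and "2010.txt"
  separate(row_name, sep = "_", into = c("name", "year")) %>%
  # Remove the file extension; the dot is escaped so it matches literally
  mutate(year = str_remove(year, "\\.txt$"))

new_df %>% as_tibble()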
# A tibble: 12 x 13
name year abs access allow analysis application approach base big business challenge company
<chr> <chr> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1 Adidas 2010 13 25 26 11 41 132 1 266 13 115 1
2 Adidas 2011 1 3 1 0 0 8 0 11 2 10 0
3 Adidas 2012 29 35 37 22 110 181 7 384 31 136 3
4 Adidas 2013 28 47 38 32 180 184 4 451 30 129 3
5 Adidas 2014 12 42 38 27 159 207 6 921 32 128 6
6 Adidas 2016 30 47 50 47 162 251 9 1061 32 171 13
7 Nike 2009 16 15 17 12 33 177 9 346 93 196 1
8 Nike 2011 10 30 0 3 0 0 0 81 7 31 0
9 Nike 2012 21 22 12 57 199 300 7 214 11 107 3
10 Nike 2013 20 32 30 11 123 321 4 331 90 239 3
11 Nike 2014 33 43 30 33 119 137 6 441 67 318 6
12 Nike 2015 51 42 41 27 102 151 9 1061 32 221 13
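If loading the tidyverse is not desirable, the same split can be sketched in base R. This is only an illustration under stated assumptions: the small `df1` below is a hypothetical stand-in containing the first two columns and four rows of the excerpt, and the file names are assumed to always follow the pattern `Company_Year.txt`.

```r
# Hypothetical stand-in for the question's df1 (subset of the excerpt)
df1 <- data.frame(
  abs    = c(13, 1, 16, 10),
  access = c(25, 3, 15, 30),
  row.names = c("Adidas_2010.txt", "Adidas_2011.txt",
                "Nike_2009.txt", "Nike_2011.txt")
)

# Drop the ".txt" extension, then split "Company_Year" at the underscore
stem  <- sub("\\.txt$", "", rownames(df1))
parts <- strsplit(stem, "_", fixed = TRUE)

# Put the pieces into ordinary columns in front of the word counts
new_df <- cbind(
  data.frame(
    name = sapply(parts, "[", 1),
    year = as.integer(sapply(parts, "[", 2)),
    stringsAsFactors = FALSE
  ),
  df1
)

# Replace the file-name row names with the default 1..n numbering
rownames(new_df) <- NULL
new_df
```

Unlike the tidyverse version above, this sketch converts `year` to an integer, which can be convenient for sorting or plotting over time; drop the `as.integer()` call to keep it as character.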