R error: inherits(x, c("DocumentTermMatrix", "TermDocumentMatrix")) is not TRUE

dja*_*216 0 nlp r text-analysis tm

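I am using the code below to create a document-term matrix. Creating the matrix works fine, but I get an error when I try to remove sparse terms or find frequent terms.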

text<- c("Since I love to travel, this is what I rely on every time.", 
         "I got this card for the no international transaction fee", 
         "I got this card mainly for the flight perks",
         "Very good card, easy application process",
         "The customer service is outstanding!") 

library(tm)
corpus<- Corpus(VectorSource(text))
corpus<- tm_map(corpus, content_transformer(tolower))
corpus<- tm_map(corpus, removePunctuation)
corpus<- tm_map(corpus, removeWords, stopwords("english"))
corpus<- tm_map(corpus, stripWhitespace)

dtm<- as.matrix(DocumentTermMatrix(corpus))

The result looks like this:

Docs    application card    customer    easy    every ... etc.
1       0           0       0           1       0
2       0           1       0           0       1
3       0           1       0           0       0
4       1           1       0           0       0
5       0           0       1           0       0

This is where I get the error, when calling removeSparseTerms or findFreqTerms:

sparse<- removeSparseTerms(dtm, 0.80)
freq<- findFreqTerms(dtm, 2)

Result:

Error: inherits(x, c("DocumentTermMatrix", "TermDocumentMatrix")) is not TRUE

Jas*_*nM1 5

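removeSparseTerms and findFreqTerms expect a DocumentTermMatrix or TermDocumentMatrix object, not a plain matrix. Wrapping the result in as.matrix() drops that class, so the inherits() check inside both functions fails.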

Create the DocumentTermMatrix without converting it to a matrix and you will not get the error.

dtm <- DocumentTermMatrix(corpus)
sparse <- removeSparseTerms(dtm, 0.80)
freq <- findFreqTerms(dtm, 2)
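
If a plain matrix is still wanted for viewing, one option is to keep the DocumentTermMatrix object for the tm functions and convert a separate copy for inspection. The following is a minimal sketch that reuses the sample text from the question; the name `dtm_mat` is introduced here for illustration only and is not part of the original code.

```r
library(tm)

# Sample text from the question
text <- c("Since I love to travel, this is what I rely on every time.",
          "I got this card for the no international transaction fee",
          "I got this card mainly for the flight perks",
          "Very good card, easy application process",
          "The customer service is outstanding!")

# Same preprocessing steps as in the question
corpus <- Corpus(VectorSource(text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stripWhitespace)

# Keep the DocumentTermMatrix object itself for tm functions
dtm <- DocumentTermMatrix(corpus)
inherits(dtm, c("DocumentTermMatrix", "TermDocumentMatrix"))      # TRUE

# Convert a separate copy (hypothetical name) only for viewing
dtm_mat <- as.matrix(dtm)
inherits(dtm_mat, c("DocumentTermMatrix", "TermDocumentMatrix"))  # FALSE

# These calls pass the class check because dtm is still a DocumentTermMatrix
sparse <- removeSparseTerms(dtm, 0.80)
freq <- findFreqTerms(dtm, 2)
```

The two inherits() calls mirror the check named in the error message, which makes it easy to confirm which object is safe to pass to removeSparseTerms and findFreqTerms.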