Car*_*rlo 2 spell-checking r aspell count
Row<-c(1,2,3,4,5)
Content<-c("I love cheese", "whre is the fish", "Final Countdow", "show me your s", "where is what")
Data<-cbind(Row, Content)
View(Data)
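I want to create a function that tells me how many misspelled words there are in each row.
An intermediate step would be to make the data look like this: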
Row<-c(1,2,3,4,5)
Content<-c("I love cheese", "whre is the fs", "Final Countdow", "show me your s", "where is     what")
MisspelledWords<-c(NA, "whre, fs", "Countdow","s",NA)
Data<-cbind(Row, Content,MisspelledWords)
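I know I have to use aspell, but I am having trouble running aspell on individual rows rather than always directly on a whole file. In the end I want to count how many misspelled words each row has; for that I would adapt the code from "Count the number of words in a string in R?".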
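To use aspell you have to go through a file. It is pretty straightforward to write a function that dumps the text to a file, runs aspell on it and returns the count (though this will not be very efficient if you have a large matrix or data frame, since it writes one temporary file per row).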
countMispelled <- function(words) {
  # clean up: strip punctuation, then collapse runs of spaces
  words <- gsub("  *", " ", gsub("[[:punct:]]", "", words))
  # aspell() reads from a file, so write the text to a temporary one
  temp_file <- tempfile()
  writeLines(words, temp_file)
  res <- aspell(temp_file)
  unlink(temp_file)
  # return the number of misspelled words
  length(res$Original)
}
Data <- cbind(Data, Errors=unlist(lapply(Data[,2], countMispelled)))
Data
##      Row Content             Errors
## [1,] "1" "I love cheese"     "0"   
## [2,] "2" "whre is thed fish" "2"   
## [3,] "3" "Final Countdow"    "1"   
## [4,] "4" "show me your s"    "0"   
## [5,] "5" "where is what"     "0"  
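If the one-file-per-row cost becomes a problem, a possible variation is to write every row to a single file, call aspell() once, and use the Line column of its result to map each hit back to its row. This is only a sketch: countMisspelledByRow is a hypothetical helper name, and it assumes a spell checker that aspell() can find (aspell, hunspell or ispell) is installed. It also builds the MisspelledWords column from the intermediate step in the question.

```r
# Sketch: spell-check all rows with one aspell() call.
# countMisspelledByRow is a hypothetical name, not part of any package.
countMisspelledByRow <- function(words) {
  # same cleanup as above; collapsing all whitespace also flattens
  # embedded newlines, so element i of `words` lands on line i of the file
  words <- gsub("[[:punct:]]", "", words)
  words <- gsub("[[:space:]]+", " ", words)

  temp_file <- tempfile()
  on.exit(unlink(temp_file))
  writeLines(words, temp_file)
  res <- aspell(temp_file)

  # res$Line is the line number of each misspelled word; using a factor
  # with one level per row keeps rows that have no hits
  rows <- factor(res$Line, levels = seq_along(words))
  found <- split(res$Original, rows)

  data.frame(
    Errors = as.vector(table(rows)),
    MisspelledWords = unname(vapply(
      found,
      function(w) if (length(w)) paste(w, collapse = ", ") else NA_character_,
      character(1)
    )),
    stringsAsFactors = FALSE
  )
}
```

Calling countMisspelledByRow(Content) should return one row per element of Content. The exact words flagged depend on the installed dictionary.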
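You would probably be better off using a data frame rather than a matrix (I just worked with what you provided), since that way Row and Errors can stay numeric.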
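To illustrate the matrix versus data frame point, here is a small sketch. The Errors values are made-up placeholders for illustration, not real spell-check results; in practice they would come from the counting function.

```r
Row <- c(1, 2, 3, 4, 5)
Content <- c("I love cheese", "whre is the fish", "Final Countdow",
             "show me your s", "where is what")

# cbind() builds a character matrix, so the row numbers become strings
as_matrix <- cbind(Row, Content)
typeof(as_matrix)

# a data frame keeps each column's own type
as_df <- data.frame(Row, Content, stringsAsFactors = FALSE)

# placeholder counts for illustration only (assumed, not computed)
as_df$Errors <- c(0, 1, 1, 0, 0)

str(as_df)
sum(as_df$Errors)  # arithmetic works without as.numeric()
```

With the matrix, the same sum would first need the Errors column converted back with as.numeric().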
Inspired by this post, here is an attempt using which_misspelled and check_spelling from library(qdap).
library(qdap)
# which_misspelled
n_misspelled <- sapply(Content, function(x){
  length(which_misspelled(x, suggest = FALSE))
})
data.frame(Content, n_misspelled, row.names = NULL)
#             Content n_misspelled
# 1     I love cheese            0
# 2    whre is the fs            2
# 3    Final Countdow            1
# 4    show me your s            0
# 5 where is     what            0
# check_spelling
df <- check_spelling(Content, n.suggest = 0)                        
n_misspelled <- as.vector(table(factor(df$row, levels = Row)))
data.frame(Content, n_misspelled)
#             Content n_misspelled
# 1     I love cheese            0
# 2    whre is the fs            2
# 3    Final Countdow            1
# 4    show me your s            0
# 5 where is     what            0