Error: Problem with filter() input ..1

Jay*_*e K 5 r n-gram shiny dplyr shinyapps

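I am writing a function to incorporate into a Shiny app that predicts the next word from a set of predefined files. When I created the function that predicts the next word using n-grams, I got this error: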


x object of type 'closure' is not subsettable
i Input ..1 is top_n_rank(1, n).

Run rlang::last_error() to see where the error occurred.

In addition: Warning message:
In is.na(x) : is.na() applied to non-(list or vector) of type 'closure'

Here is my R program. I have already created the bigrams, trigrams, and quadgrams in another R script and saved them as the rds files that I read in here.

library(tidyverse)
library(stringr)
library(dplyr)
library(ngram)
library(tidyr)

bi_words <- readRDS("./bi_words.rds")
tri_words <- readRDS("./tri_words.rds")
quad_words <- readRDS("./quad_words.rds")

bigram <- function(input_words){
        num <- length(input_words)
        dplyr::filter(bi_words, 
               word1==input_words[num]) %>% 
                top_n(1, n) %>%
                filter(row_number() == 1L) %>%
                select(num_range("word", 2)) %>%
                as.character() -> out
        ifelse(out =="character(0)", "?", return(out))
}

trigram <- function(input_words){
        num <- length(input_words)
        dplyr::filter(tri_words, 
               word1==input_words[num-1], 
               word2==input_words[num])  %>% 
                top_n(1, n) %>%
                filter(row_number() == 1L) %>%
                select(num_range("word", 3)) %>%
                as.character() -> out
        ifelse(out=="character(0)", bigram(input_words), return(out))
}

quadgram <- function(input_words){
        num <- length(input_words)
        dplyr::filter(quad_words, 
               word1==input_words[num-2], 
               word2==input_words[num-1], 
               word3==input_words[num])  %>% 
                top_n(1, n) %>%
                filter(row_number() == 1L) %>%
                select(num_range("word", 4)) %>%
                as.character() -> out
        ifelse(out=="character(0)", trigram(input_words), return(out))
}

ngrams <- function(input){
        # Create a dataframe
        input <- data.frame(text = input)
        # Clean the input
        replace_reg <- "[^[:alpha:][:space:]]*"
        input <- input %>%
                mutate(text = str_replace_all(text, replace_reg, ""))
        # Find word count, separate words, lower case
        input_count <- str_count(input, boundary("word"))
        input_words <- unlist(str_split(input, boundary("word")))
        input_words <- tolower(input_words)
        # Call the matching functions
        out <- ifelse(input_count == 1, bigram(input_words), 
                      ifelse (input_count == 2, trigram(input_words), quadgram(input_words)))
        # Output
        return(out)
}

input <- "In case of a"
ngrams(input)


Here is a snippet of quad_words.rds

And*_*ter 1

Perhaps the missing step here is counting which n-gram is most common in each case before selecting the top one. A simple solution is to use add_count in place of top_n:

filter(quad_words, 
       word1==input_words[num-2], 
       word2==input_words[num-1], 
       word3==input_words[num])  %>%
  add_count(word4, sort = TRUE) %>% 
  filter(row_number() == 1L) %>%
  select(num_range("word", 4)) %>%
  as.character() -> out
ifelse(out=="character(0)", trigram(input_words), return(out))

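...as the central part of the quadgram call. Calling add_count on word4 counts how often each fourth word occurs after filtering on words 1-3. The sort = TRUE argument puts the highest-frequency quadgram in row 1, which your next line then selects. I hope this is a useful step. If it solves this particular problem, please mark the question as resolved, or follow up with any questions or corrections.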
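To make the idea concrete, here is a minimal, self-contained sketch on toy data. It assumes quad_words has one row per observed four-word sequence and no precomputed n column. If that is the case, the n in top_n(1, n) likely resolves to the function dplyr::n() rather than a column, which would explain the "object of type 'closure'" message. The data frame and the helper name predict_fourth are hypothetical and only for illustration.

```r
library(dplyr)

# Hypothetical stand-in for quad_words: one row per observed quadgram,
# with no count column (an assumption about the real data).
quad_words <- data.frame(
  word1 = c("in", "in", "in", "in"),
  word2 = c("case", "case", "case", "case"),
  word3 = c("of", "of", "of", "of"),
  word4 = c("emergency", "fire", "emergency", "doubt"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the most frequent fourth word that follows
# the last three input words, or "?" when nothing matches.
predict_fourth <- function(input_words) {
  num <- length(input_words)
  best <- quad_words %>%
    filter(word1 == input_words[num - 2],
           word2 == input_words[num - 1],
           word3 == input_words[num]) %>%
    add_count(word4, sort = TRUE) %>%  # adds column n, most frequent first
    slice(1) %>%                       # keep only the top row
    pull(word4)                        # extract word4 as a character vector
  if (length(best) == 0) "?" else best
}

predict_fourth(c("in", "case", "of"))
```

Using pull() and a length check instead of as.character() on a one-column data frame avoids comparing against the string "character(0)". In the full app, the "?" branch could fall back to trigram(input_words) as in the original code.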