Extracting spelled-out numbers from strings in R

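I am trying to extract spelled-out numbers from strings, together with the word that follows each number. I managed to do this the laborious way, by writing my own pattern that lists every spelled-out number to search for (the example here uses stringr::sentences):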

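library(stringr)

# One pattern per number word (space-delimited), capturing the word that follows
numbers <- str_c(c(" one ", " two ", " three ", " four ", " five ", " six ", " seven ", " eight ", " nine ", " ten "), "([^ ]+)")
number_match <- str_c(numbers, collapse = "|")

# Keep only the sentences that contain a match, then extract the first match from each
has_number <- str_detect(sentences, number_match)
sent <- sentences[has_number]
str_extract(sent, number_match)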


These are the extracted strings:

 [1] " seven books"   " two met"       " two factors"   " three lists"   " seven is"      " two when"      " ten inches."   " one war"      
 [9] " one button"    " six minutes."  " ten years"     " two shares"    " two distinct"  " five cents"    " two pins"      " five robins." 
[17] " four kinds"    " three story"   " three inches"  " six comes"     " three batches" " two leaves."


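Since I cannot know in advance whether I have covered every possible number, I would like to know whether R offers a tool that can recognize spelled-out numbers. I found similar questions, for example one on converting spelled-out numbers to digits, but unfortunately it is not about R.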
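To make the coverage concern concrete, here is a minimal sketch of one way to generate the word list instead of typing it by hand. It assumes that only the numbers 1 to 99 matter and that compound numbers are hyphenated ("twenty-one"). The names `units`, `teens`, `tens`, `number_words` and `number_pattern` are hypothetical and do not come from any package. It does not recognize larger numbers ("one hundred") or unhyphenated compounds.

```r
library(stringr)

# Hypothetical word lists (an assumption of this sketch): building blocks for 1..99
units <- c("one", "two", "three", "four", "five", "six", "seven", "eight", "nine")
teens <- c("ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
           "sixteen", "seventeen", "eighteen", "nineteen")
tens  <- c("twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety")

# Compound forms such as "twenty-one"; outer() pairs every unit with every ten
compounds <- as.vector(outer(units, tens, function(unit, ten) str_c(ten, "-", unit)))
number_words <- c(units, teens, tens, compounds)

# Longest words first, so a longer word is tried before any shorter prefix of it
number_words <- number_words[order(nchar(number_words), decreasing = TRUE)]

# \\b word boundaries replace the literal spaces; (\\S+) captures the following word
number_pattern <- regex(
  str_c("\\b(", str_c(number_words, collapse = "|"), ")\\s+(\\S+)"),
  ignore_case = TRUE
)

# Column 1 is the full match, column 2 the number word, column 3 the next word
m <- str_match(c("He bought twenty-one red apples.", "The seventeen boys left."),
               number_pattern)
m[, 2:3]
```

The same pattern can be passed to `str_match(sentences, number_pattern)`. Unlike the space-delimited version above, the word boundary and `ignore_case = TRUE` also let it match a number at the start of a sentence, so the results on `stringr::sentences` may differ from the output shown earlier.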

Any help is appreciated.
