`dplyr::arrange()` sorts non-English text incorrectly: user error or function bug? `sort()` gets it right

Kat*_*Kat 5 r dplyr

When I use `dplyr::arrange()`, it does not sort the data correctly. Here is an example:

    library(dplyr)

    wds <- structure(list(
      Word = c("θέση", "ταχυδρομείο", "γραμματόσημο", "προσπάθεια", "τηλεφωνώ",
               "επικοινωνώ", "υπογραφή", "γλώσσα", "έξοδος", "μήνυμα",
               "πραγματικότητα", "μυστικό", "διαδίκτυο", "δίκτυο"),
      trans2 = c("the seat, position, job, post, station, status",
                 "the post office, postal service", "the postage stamp",
                 "the attempt (to do something), try (to accomplish something)",
                 "to phone, to telephone (the act of calling someone)",
                 "to talk, interact, contact, communicate", "the signature, (fig) agreement",
                 "the (anatomy, shoe, figuratively) tongue, language, (fish species) sole",
                 "the exit, outing, the act of going out, exodus",
                 "the (lit or fig) message, email, text, SMS",
                 "the reality, actuality", "the secret, mystery", "the internet",
                 "the computer network")), row.names = c(NA, -14L), class = "data.frame")

    wds %>% arrange(Word)
    #              Word                                                                  trans2
    # 1          έξοδος                          the exit, outing, the act of going out, exodus
    # 2          γλώσσα the (anatomy, shoe, figuratively) tongue, language, (fish species) sole
    # 3    γραμματόσημο                                                       the postage stamp
    # 4          δίκτυο                                                    the computer network
    # 5       διαδίκτυο                                                            the internet
    # 6      επικοινωνώ                                 to talk, interact, contact, communicate
    # 7            θέση                          the seat, position, job, post, station, status
    # 8          μήνυμα                              the (lit or fig) message, email, text, SMS
    # 9         μυστικό                                                     the secret, mystery
    # 10 πραγματικότητα                                                  the reality, actuality
    # 11     προσπάθεια            the attempt (to do something), try (to accomplish something)
    # 12    ταχυδρομείο                                         the post office, postal service
    # 13       τηλεφωνώ                     to phone, to telephone (the act of calling someone)
    # 14       υπογραφή                                          the signature, (fig) agreement

Here is the data in the `Word` column, sorted correctly:

    sort(wds$Word)

    #  [1] "γλώσσα"         "γραμματόσημο"   "διαδίκτυο"      "δίκτυο"         "έξοδος"
    #  [6] "επικοινωνώ"     "θέση"           "μήνυμα"         "μυστικό"        "πραγματικότητα"
    # [11] "προσπάθεια"     "ταχυδρομείο"    "τηλεφωνώ"       "υπογραφή"

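I think the problem is strictly related to the accents. I find it odd that `sort()` gets this right while `dplyr::arrange()` does not. Is there something I should be doing differently with `dplyr::arrange()`?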


ste*_*fan 3

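I think the cause is the locale used to sort character vectors. Since dplyr 1.1.0, `arrange()` by default

> uses the "C" locale unless the `dplyr.legacy_locale` global option escape hatch is active. (`?arrange`)

which

> has the benefit of being completely reproducible across all supported R versions and operating systems. (?`dplyr-locale`)

However, you can get the same result as with `sort()` or `order()` by switching to the legacy behavior, which uses the system locale (or by setting the locale explicitly via the `.locale` argument):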

    library(dplyr, warn=FALSE)

    withr::with_options(
      list(dplyr.legacy_locale = TRUE), wds %>% arrange(Word)
    )
    #>              Word
    #> 1          γλώσσα
    #> 2    γραμματόσημο
    #> 3       διαδίκτυο
    #> 4          δίκτυο
    #> 5          έξοδος
    #> 6      επικοινωνώ
    #> 7            θέση
    #> 8          μήνυμα
    #> 9         μυστικό
    #> 10 πραγματικότητα
    #> 11     προσπάθεια
    #> 12    ταχυδρομείο
    #> 13       τηλεφωνώ
    #> 14       υπογραφή
    #>                                                                     trans2
    #> 1  the (anatomy, shoe, figuratively) tongue, language, (fish species) sole
    #> 2                                                        the postage stamp
    #> 3                                                             the internet
    #> 4                                                     the computer network
    #> 5                           the exit, outing, the act of going out, exodus
    #> 6                                  to talk, interact, contact, communicate
    #> 7                           the seat, position, job, post, station, status
    #> 8                               the (lit or fig) message, email, text, SMS
    #> 9                                                      the secret, mystery
    #> 10                                                  the reality, actuality
    #> 11            the attempt (to do something), try (to accomplish something)
    #> 12                                         the post office, postal service
    #> 13                     to phone, to telephone (the act of calling someone)
    #> 14                                          the signature, (fig) agreement
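The answer only shows the legacy-option route, so here is a minimal sketch of the explicit-locale alternative. It assumes dplyr 1.1.0 or later with the stringi package installed, which `arrange()` needs for any `.locale` other than "C". The locale identifier `"el"` (Greek) and the five-word `words` data frame are my own illustrative choices, not part of the original answer. The first two comparisons also suggest why the "C" locale puts έξοδος and δίκτυο ahead of their neighbours: it compares strings by code point, and the accented letters have lower code points than the unaccented ones.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical stand-in for wds$Word: a subset of the words from the question.
words <- data.frame(Word = c("θέση", "έξοδος", "γλώσσα", "διαδίκτυο", "δίκτυο"))

# In the "C" locale, strings are compared by code point.
# έ is U+03AD and ί is U+03AF, both below α (U+03B1) and ι (U+03B9).
utf8ToInt("έ") < utf8ToInt("α")
utf8ToInt("ί") < utf8ToInt("ι")

# Default since dplyr 1.1.0: "C" locale (code-point order).
c_order <- words %>% arrange(Word) %>% pull(Word)

# Explicit Greek collation. "el" is an assumed identifier; the valid ones
# are listed by stringi::stri_locale_list(). Requires the stringi package.
el_order <- words %>% arrange(Word, .locale = "el") %>% pull(Word)

print(c_order)
print(el_order)
```

Applied to the full `wds` data frame, `arrange(Word, .locale = "el")` should give the same order as `sort(wds$Word)` in the question, provided the session's system locale collates Greek the same way. Unlike the `dplyr.legacy_locale` option, the result then does not depend on the machine's locale settings.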