Unicode character conversion in R

3 unicode r

I have this column, MTST, which prints as follows:

 [1] "<U+0391>G<U+03A1><U+0399><U+039D><U+0399><U+039F>                                 "
 [2] "<U+0391>G<U+03A7><U+0399><U+0391><U+039B><U+039F>S                                "
 [3] "<U+0391><U+0399>G<U+0399><U+039D><U+0391>                                  "
 [4] "<U+0391><U+0399>G<U+0399><U+039F>                                   "
 [5] "<U+0391><U+0399><U+0394><U+0397><U+03A8><U+039F>S                                 "
 [6] "<U+0391><U+039A><U+03A4><U+0399><U+039F>(<U+03A0><U+03A1><U+0395><U+0392><U+0395><U+0396><U+0391>)                          "
 [7] "<U+0391><U+039B><U+0395><U+039E><U+0391><U+039D><U+0394><U+03A1><U+039F><U+03A5><U+03A0><U+039F><U+039B><U+0397>                          "
 [8] "<U+0391><U+039B><U+0399><U+0391><U+03A1><U+03A4><U+039F>S                                "

I tried using the Unicode package and running MTST <- as.u_char(MTST), which gives:

[1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>

I also tried dump and dput, but nothing changed.

Note that MTST is originally of type character.

Thanks for your help.

Edit: dput(MTST) shows the following:

c("<U+0391>G<U+03A1><U+0399><U+039D><U+0399><U+039F>                                 ",
"<U+0391>G<U+03A7><U+0399><U+0391><U+039B><U+039F>S                                ",
"<U+0391><U+0399>G<U+0399><U+039D><U+0391>                                  ",
"<U+0391><U+0399>G<U+0399><U+039F>                                   ",
"<U+0391><U+0399><U+0394><U+0397><U+03A8><U+039F>S                                 ",
"<U+0391><U+039A><U+03A4><U+0399><U+039F>(<U+03A0><U+03A1><U+0395><U+0392><U+0395><U+0396><U+0391>)                          ",
"<U+0391><U+039B><U+0395><U+039E><U+0391><U+039D><U+0394><U+03A1><U+039F><U+03A5><U+03A0><U+039F><U+039B><U+0397>                          ",
"<U+0391><U+039B><U+0399><U+0391><U+03A1><U+03A4><U+039F>S                                ",
"<U+0391><U+039D><U+0391><U+0392><U+03A1><U+03A5><U+03A4><U+0391>                                ",
"<U+0391><U+039D><U+0394><U+03A1><U+0391><U+0392><U+0399><U+0394><U+0391>                               ",
"<U+0391><U+039D>OG<U+0395><U+0399><U+0391>                                 ",
"<U+0391><U+03A1><U+0391><U+039E><U+039F>S                                  ",
"<U+0391><U+03A1><U+0391><U+03A7>O<U+0392><U+0391>                                 ",
"<U+0391><U+03A1>G<U+039F>S(<U+03A0><U+03A5><U+03A1>G<U+0395><U+039B><U+0391>)                          ",
"<U+0391><U+03A1>G<U+039F>S<U+03A4><U+039F><U+039B><U+0399>                               ",
"<U+0391><U+03A1><U+03A4><U+0391> (<U+03A0><U+039F><U+039B><U+0397>)                             ",
"<U+0391><U+03A1><U+03A4><U+0391> (F<U+0399><U+039B><U+039F>T<U+0395><U+0397>)                          ",
"<U+0391>S<U+03A4><U+0395><U+03A1><U+039F>S<U+039A><U+039F><U+03A0><U+0395><U+0399><U+039F>                           ",
"<U+0391>S<U+03A4><U+03A1><U+039F>S                                  ",
"<U+0391>S<U+03A4><U+03A5><U+03A0><U+0391><U+039B><U+0391><U+0399><U+0391>                              ",
"<U+0392><U+0391><U+039C><U+039F>S                                   ",
"<U+0392><U+0395><U+039B><U+039F> (<U+039A><U+039F><U+03A1><U+0399><U+039D>T<U+0399><U+0391>S)                        ",
"<U+0392><U+039F><U+039B><U+039F>S                                   ",
"<U+0392><U+03A5><U+03A4><U+0399><U+039D><U+0391>                                  ",
"G<U+039F><U+03A1><U+03A4><U+03A5>S                                  ",
"G<U+03A5>T<U+0395><U+0399><U+039F>                                  ",
"<U+0394><U+0395>SF<U+0399><U+039D><U+0391>                                 ",
"<U+0394><U+0399><U+0391><U+0392><U+039F><U+039B><U+0399><U+03A4>S<U+0399>                              ",
"<U+0394><U+039F><U+039C><U+039F><U+039A><U+039F>S                                 ",
"<U+0394><U+03A1><U+0391><U+039C><U+0391>                                   ",
"<U+0395><U+0394><U+0395>SS<U+0391>                                  ",
"<U+0395><U+039B><U+0395><U+03A5>S<U+0399><U+039D><U+0391>                                ",
"<U+0395><U+039B><U+039B><U+0397><U+039D><U+0399><U+039A><U+039F> ae<U+03C1>                            ",
"<U+0396><U+0391><U+039A><U+03A5><U+039D>T<U+039F>S                                ",
"<U+0396><U+0391><U+039A><U+03A5><U+039D>T<U+039F>S_<U+03A0><U+039F><U+039B><U+0397>                           ",
"<U+0396><U+0391><U+03A1><U+039F>S                                   ",
"<U+0397><U+03A1><U+0391><U+039A><U+039B><U+0395><U+0399><U+039F>                                ",
"T<U+0391>S<U+039F>S                                   ", "T<U+0397><U+03A1><U+0391> (S<U+0391><U+039D><U+03A4><U+039F><U+03A1><U+0399><U+039D><U+0397>",
"<U+0399><U+0395><U+03A1><U+0391><U+03A0><U+0395><U+03A4><U+03A1><U+0391>                               ",
"<U+0399><U+039A><U+0391><U+03A1><U+0399><U+0391>_<U+0391>/<U+0394>                              ",
"<U+0399>O<U+0391><U+039D><U+039D><U+0399><U+039D><U+0391>                                ",
"<U+039A><U+0391><U+0392><U+0391><U+039B><U+0391> (<U+03A0><U+039F><U+039B><U+0397>)                           ",
"<U+039A><U+0391><U+0392><U+0391><U+039B><U+0391>(<U+0391><U+039C><U+03A5>G<U+0394><U+0391><U+039B><U+0395>O<U+039D><U+0391>S)                    ",
"<U+039A><U+0391><U+039B><U+0391><U+0392><U+03A1><U+03A5><U+03A4><U+0391>                               ",
"<U+039A><U+0391><U+039B><U+0391><U+039C><U+0391><U+03A4><U+0391>                                ",
"<U+039A><U+0391><U+039B><U+0391><U+039C><U+03A0><U+0391><U+039A><U+0391>                               ",
"<U+039A><U+0391><U+03A1><U+0394><U+0399><U+03A4>S<U+0391>                                ",
"<U+039A><U+0391><U+03A1><U+03A0><U+0391>T<U+039F>S_<U+0391>/<U+0394>                            ",
"<U+039A><U+0391><U+03A1><U+03A0><U+0391>T<U+039F>S_<U+03A0><U+039F><U+039B><U+0397>                           ",
"<U+039A><U+0391><U+03A1><U+03A0><U+0395><U+039D><U+0397>S<U+0399>                               ",
"<U+039A><U+0391><U+03A1><U+03A5>S<U+03A4><U+039F>S                                ",
"<U+039A><U+0391>S<U+039F>S                                   ",
"<U+039A><U+0391>S<U+03A4><U+0395><U+039B><U+039B><U+0399>                                ",
"<U+039A><U+0391>S<U+03A4><U+039F><U+03A1><U+0399><U+0391>                                ",
"<U+039A><U+0395><U+03A1><U+039A><U+03A5><U+03A1><U+0391>                                 ",
"<U+039A><U+039F><U+0396><U+0391><U+039D><U+0397>                                  ",
"<U+039A><U+039F><U+039C><U+039F><U+03A4><U+0397><U+039D><U+0397>                                ",
"<U+039A><U+039F><U+039D><U+0399><U+03A4>S<U+0391>                                 ",
"<U+039A><U+039F><U+03A1><U+0399><U+039D>T<U+039F>S                                ",
"<U+039A><U+03A5>T<U+0397><U+03A1><U+0391>_<U+0391>/<U+0394>                              ",
"<U+039A><U+03A5><U+039C><U+0397>                                    ",
"<U+039A>OS                                     ", "<U+039A>OS_<U+03A0><U+039F><U+039B><U+0397>                                ",
"<U+039B><U+0391><U+039C><U+0399><U+0391>                                   ",
"<U+039B><U+0391><U+03A1><U+0399>S<U+0391>                                  ",
"<U+039B><U+0395><U+03A1><U+039F>S                                   ",
"<U+039B><U+0395><U+03A5><U+039A><U+0391><U+0394><U+0391> (<U+039D><U+0397>S<U+0399>)                          ",
"<U+039B><U+0395>O<U+039D><U+0399><U+0394><U+0399><U+039F>                                ",
"<U+039B><U+0397><U+039C><U+039D><U+039F>S                                  ",
"<U+039B><U+0399><U+0394>O<U+03A1><U+0399><U+039A><U+0399>                                ",
"<U+039C><U+0391><U+039A><U+0395><U+0394><U+039F><U+039D><U+0399><U+0391>                               ",
"<U+039C><U+0391><U+03A1><U+0391>TO<U+039D><U+0391>S                               ",
"<U+039C><U+0395>TO<U+039D><U+0397>                                  ",
"<U+039C><U+0395>S<U+039F><U+039B><U+039F>GG<U+0399>                               ",
"<U+039C><U+0397><U+039B><U+039F>S_<U+0391><U+039C>S                               ",
"<U+039C><U+03A5><U+039A><U+039F><U+039D><U+039F>S                                 ",
"<U+039C><U+03A5><U+03A4><U+0399><U+039B><U+0397><U+039D><U+0397>                                ",
"<U+039D><U+0391><U+039E><U+039F>S                                   ",
"<U+039D><U+0391><U+03A5><U+03A0><U+0391><U+039A><U+03A4><U+039F>S                               ",
"<U+039D><U+0391><U+03A5><U+03A0><U+039B><U+0399><U+039F>                                 ",
"<U+039D><U+0395><U+0391> F<U+0399><U+039B><U+0391><U+0394><U+0395><U+039B>F<U+0395><U+0399><U+0391>                         ",
"<U+039E><U+0391><U+039D>T<U+0397>                                   ",
"<U+039F><U+03A1><U+0395>S<U+03A4><U+0399><U+0391><U+0394><U+0391>                               ",
"<U+03A0><U+0391><U+0399><U+0391><U+039D><U+0399><U+0391>                                 ",
"<U+03A0><U+0391><U+039B><U+0391><U+0399><U+039F><U+03A7>O<U+03A1><U+0391>                              ",
"<U+03A0><U+0391><U+03A1><U+039F>S_<U+0391>/<U+0394>                               ",
"<U+03A0><U+0391><U+03A4><U+03A1><U+0391>                                   ",
"<U+03A0><U+0395><U+0399><U+03A1><U+0391><U+0399><U+0391>S                                ",
"<U+03A0><U+039F><U+039B><U+03A5>G<U+03A5><U+03A1><U+039F>S                               ",
"<U+03A0><U+039F><U+03A4><U+0399><U+0394><U+0391><U+0399><U+0391>                                ",
"<U+03A0><U+03A4><U+039F><U+039B><U+0395><U+039C><U+0391><U+0399><U+0394><U+0391>                              ",
"<U+03A0><U+03A5><U+03A1>G<U+039F>S                                  ",
"<U+03A1><U+0391>F<U+0397><U+039D><U+0391>                                  ",
"<U+03A1><U+0395>T<U+03A5><U+039C><U+039D><U+039F>                                 ",
"<U+03A1><U+039F><U+0394><U+039F>S                                   ",
"S<U+0391><U+039C><U+039F>S                                   ",
"S<U+0395><U+0394><U+0395>S                                   ",
"S<U+0395><U+03A1><U+03A1><U+0395>S                                  ",
"S<U+0397><U+03A4><U+0395><U+0399><U+0391>                                  ",
"S<U+039A><U+0399><U+0391>T<U+039F>S                                 ",
"S<U+039A><U+039F><U+03A4><U+0399><U+039D><U+0391>                                 ",
"S<U+039A><U+03A5><U+03A1><U+039F>S                                  ",
"S<U+039F><U+03A5><U+0394><U+0391>                                   ",
"S<U+039F><U+03A5>F<U+039B><U+0399>                                  ",
"S<U+03A0><U+0391><U+03A1><U+03A4><U+0397>                                  ",
"S<U+03A0><U+0391><U+03A4><U+0391>(<U+0392><U+0395><U+039D><U+0399><U+0396><U+0395><U+039B><U+039F>S)                        ",
"S<U+03A0><U+0395><U+03A4>S<U+0395>S                                 ",
"S<U+03A4><U+0395>F<U+0391><U+039D><U+0399> (<U+039A><U+039F><U+03A1><U+0399><U+039D>T<U+0399><U+0391>S)                     ",
"S<U+03A5><U+039A><U+03A5>O<U+039D><U+0391>                                 ",
"S<U+03A5><U+03A1><U+039F>S_<U+0391>/<U+0394>                               ",
"<U+03A4><U+0391><U+039D><U+0391>G<U+03A1><U+0391>                                 ",
"<U+03A4><U+0391><U+03A4><U+039F><U+0399> (<U+0394><U+0395><U+039A><U+0395><U+039B><U+0395><U+0399><U+0391>)                        ",
"<U+03A4><U+0396><U+0395><U+03A1><U+039C><U+0399><U+0391><U+0394><U+0395>S                              ",
"<U+03A4><U+03A1><U+0399><U+039A><U+0391><U+039B><U+0391> <U+0397><U+039C><U+0391>T<U+0395><U+0399><U+0391>S                        ",
"<U+03A4><U+03A1><U+0399><U+039A><U+0391><U+039B><U+0391> T<U+0395>SS<U+0391><U+039B><U+0399><U+0391>S                       ",
"<U+03A4><U+03A1><U+0399><U+03A0><U+039F><U+039B><U+0397>                                 ",
"<U+03A4><U+03A5><U+039C><U+03A0><U+0391><U+039A><U+0399>                                 ",
"<U+03A4><U+03A5><U+03A1><U+0399><U+039D>T<U+0391>                                 ",
"F<U+0391><U+03A1>S<U+0391><U+039B><U+0391>                                 ",
"F<U+039B>O<U+03A1><U+0399><U+039D><U+0391>                                 ",
"F<U+039F><U+03A5><U+03A1><U+039D><U+0397>                                  ",
"F<U+03A5><U+03A7><U+03A4><U+0399><U+0391>                                  ",
"<U+03A7><U+0391><U+039B><U+039A><U+0399><U+0394><U+0391>                                 ",
"<U+03A7><U+0391><U+039D><U+0399><U+0391>                                   ",
"<U+03A7><U+0399><U+039F>S                                    ",
"<U+03A7><U+03A1><U+03A5>S<U+039F><U+03A5><U+03A0><U+039F><U+039B><U+0397>_<U+039A><U+0391><U+0392><U+0391><U+039B><U+0391>                       ",
"O<U+03A1><U+0395><U+039F><U+0399>                                   "
)

Spa*_*man 7

What you have there looks like plain 7-bit ASCII characters, with an attempt to encode Unicode code points by wrapping each of them like this: <U+abcd>.

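As far as I know, this is not a recognised encoding of Unicode, partly because there is no obvious way to put a real < in your text. I guess every < could be written as <U+jklm>, where jklm is the code for an angle bracket... but ick.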

So, first off, try to get a UTF-8 encoded string out of whatever produced this ASCII-encoded mess!
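As a rough sketch of what "getting UTF-8 from the source" could look like, the snippet below writes a Greek string to a temporary file and reads it back with the encoding declared explicitly. The file, the variable names and the sample string are all made up for illustration; how the real MTST data was loaded is not known from the question.

```r
# Hypothetical sample: a Greek word written with \u escapes, which R stores
# as a string marked UTF-8.
greek <- "\u0391\u0399\u0393\u0399\u039d\u0391"

# Write the raw UTF-8 bytes to a temporary file (a stand-in for the real source).
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "wb")
writeLines(greek, con, useBytes = TRUE)
close(con)

# Read it back, declaring the encoding rather than relying on the locale.
from_file <- readLines(tmp, encoding = "UTF-8")

Encoding(from_file)   # how R has marked the string
utf8ToInt(from_file)  # the code points it contains
```

If the strings come back correctly marked at this stage, no later unescaping step should be needed.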

However... after some serious hair-pulling...

stringi to the rescue! With MTST being your vector of strings, first convert the angle-bracket notation to backslash-u escapes, then use stri_unescape_unicode:

> require(stringi)
> greek2=gsub(">","", gsub("<U\\+","\\\\u",MTST))
> stri_unescape_unicode(greek2)
[1] "ΑGΡΙΝΙΟ                                 "
[2] "ΑGΧΙΑΛΟS                                "
[3] "ΑΙGΙΝΑ                                  "
[4] "ΑΙGΙΟ                                   "
[5] "ΑΙΔΗΨΟS                                 "
[6] "ΑΚΤΙΟ(ΠΡΕΒΕΖΑ)                          "

all the way down to

[123] "FΥΧΤΙΑ                                  "
[124] "ΧΑΛΚΙΔΑ                                 "
[125] "ΧΑΝΙΑ                                   "
[126] "ΧΙΟS                                    "
[127] "ΧΡΥSΟΥΠΟΛΗ_ΚΑΒΑΛΑ                       "
[128] "OΡΕΟΙ                                   "

That was after I had fixed the odd missing commas and quotes in your dput data (I edited your question for you).
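The gsub call in the stringi approach removes every > in the input, including any genuine one. If that matters, a base-R variant that only touches well-formed <U+XXXX> tokens might look like the sketch below. decode_u_plus and the stations vector are hypothetical names made up for this example, and the 4 to 6 hex digit pattern is an assumption about how the tokens are written.

```r
# Decode "<U+XXXX>" tokens using base R only.
# decode_u_plus is a hypothetical helper, not part of any package.
decode_u_plus <- function(x) {
  pattern <- "<U\\+[0-9A-Fa-f]{4,6}>"
  m <- gregexpr(pattern, x)
  regmatches(x, m) <- lapply(regmatches(x, m), function(tok) {
    # Strip the leading "<U+" and the trailing ">" to keep the hex digits.
    hex <- substr(tok, 4L, nchar(tok) - 1L)
    # Convert each hex code point to the corresponding character.
    vapply(strtoi(hex, base = 16L), intToUtf8, character(1))
  })
  x
}

# Made-up input: one padded value in the question's style, one with real brackets.
stations <- c("<U+0391>G<U+03A1><U+0399><U+039D><U+0399><U+039F>      ",
              "a < b > c")

# trimws() drops the fixed-width padding seen in the question's data.
decoded <- trimws(decode_u_plus(stations))
decoded
```

Strings without any token pass through unchanged, so the second element keeps its angle brackets.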