I have this column, MTST, which looks like this when printed:
[1] "<U+0391>G<U+03A1><U+0399><U+039D><U+0399><U+039F> "
[2] "<U+0391>G<U+03A7><U+0399><U+0391><U+039B><U+039F>S "
[3] "<U+0391><U+0399>G<U+0399><U+039D><U+0391> "
[4] "<U+0391><U+0399>G<U+0399><U+039F> "
[5] "<U+0391><U+0399><U+0394><U+0397><U+03A8><U+039F>S "
[6] "<U+0391><U+039A><U+03A4><U+0399><U+039F>(<U+03A0><U+03A1><U+0395><U+0392><U+0395><U+0396><U+0391>) "
[7] "<U+0391><U+039B><U+0395><U+039E><U+0391><U+039D><U+0394><U+03A1><U+039F><U+03A5><U+03A0><U+039F><U+039B><U+0397> "
[8] "<U+0391><U+039B><U+0399><U+0391><U+03A1><U+03A4><U+039F>S "
I tried using the Unicode library and running MTST <- as.u_char(MTST), which gives:
[1] <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
I also tried dump and dput, but nothing changed.
Note that MTST was originally of type character.
Thanks for your help.
Edit: dput(MTST) shows the following:
c("<U+0391>G<U+03A1><U+0399><U+039D><U+0399><U+039F> ",
"<U+0391>G<U+03A7><U+0399><U+0391><U+039B><U+039F>S ",
"<U+0391><U+0399>G<U+0399><U+039D><U+0391> ",
"<U+0391><U+0399>G<U+0399><U+039F> ",
"<U+0391><U+0399><U+0394><U+0397><U+03A8><U+039F>S ",
"<U+0391><U+039A><U+03A4><U+0399><U+039F>(<U+03A0><U+03A1><U+0395><U+0392><U+0395><U+0396><U+0391>) ",
"<U+0391><U+039B><U+0395><U+039E><U+0391><U+039D><U+0394><U+03A1><U+039F><U+03A5><U+03A0><U+039F><U+039B><U+0397> ",
"<U+0391><U+039B><U+0399><U+0391><U+03A1><U+03A4><U+039F>S ",
"<U+0391><U+039D><U+0391><U+0392><U+03A1><U+03A5><U+03A4><U+0391> ",
"<U+0391><U+039D><U+0394><U+03A1><U+0391><U+0392><U+0399><U+0394><U+0391> ",
"<U+0391><U+039D>OG<U+0395><U+0399><U+0391> ",
"<U+0391><U+03A1><U+0391><U+039E><U+039F>S ",
"<U+0391><U+03A1><U+0391><U+03A7>O<U+0392><U+0391> ",
"<U+0391><U+03A1>G<U+039F>S(<U+03A0><U+03A5><U+03A1>G<U+0395><U+039B><U+0391>) ",
"<U+0391><U+03A1>G<U+039F>S<U+03A4><U+039F><U+039B><U+0399> ",
"<U+0391><U+03A1><U+03A4><U+0391> (<U+03A0><U+039F><U+039B><U+0397>) ",
"<U+0391><U+03A1><U+03A4><U+0391> (F<U+0399><U+039B><U+039F>T<U+0395><U+0397>) ",
"<U+0391>S<U+03A4><U+0395><U+03A1><U+039F>S<U+039A><U+039F><U+03A0><U+0395><U+0399><U+039F> ",
"<U+0391>S<U+03A4><U+03A1><U+039F>S ",
"<U+0391>S<U+03A4><U+03A5><U+03A0><U+0391><U+039B><U+0391><U+0399><U+0391> ",
"<U+0392><U+0391><U+039C><U+039F>S ",
"<U+0392><U+0395><U+039B><U+039F> (<U+039A><U+039F><U+03A1><U+0399><U+039D>T<U+0399><U+0391>S) ",
"<U+0392><U+039F><U+039B><U+039F>S ",
"<U+0392><U+03A5><U+03A4><U+0399><U+039D><U+0391> ",
"G<U+039F><U+03A1><U+03A4><U+03A5>S ",
"G<U+03A5>T<U+0395><U+0399><U+039F> ",
"<U+0394><U+0395>SF<U+0399><U+039D><U+0391> ",
"<U+0394><U+0399><U+0391><U+0392><U+039F><U+039B><U+0399><U+03A4>S<U+0399> ",
"<U+0394><U+039F><U+039C><U+039F><U+039A><U+039F>S ",
"<U+0394><U+03A1><U+0391><U+039C><U+0391> ",
"<U+0395><U+0394><U+0395>SS<U+0391> ",
"<U+0395><U+039B><U+0395><U+03A5>S<U+0399><U+039D><U+0391> ",
"<U+0395><U+039B><U+039B><U+0397><U+039D><U+0399><U+039A><U+039F> ae<U+03C1> ",
"<U+0396><U+0391><U+039A><U+03A5><U+039D>T<U+039F>S ",
"<U+0396><U+0391><U+039A><U+03A5><U+039D>T<U+039F>S_<U+03A0><U+039F><U+039B><U+0397> ",
"<U+0396><U+0391><U+03A1><U+039F>S ",
"<U+0397><U+03A1><U+0391><U+039A><U+039B><U+0395><U+0399><U+039F> ",
"T<U+0391>S<U+039F>S ", "T<U+0397><U+03A1><U+0391> (S<U+0391><U+039D><U+03A4><U+039F><U+03A1><U+0399><U+039D><U+0397>",
"<U+0399><U+0395><U+03A1><U+0391><U+03A0><U+0395><U+03A4><U+03A1><U+0391> ",
"<U+0399><U+039A><U+0391><U+03A1><U+0399><U+0391>_<U+0391>/<U+0394> ",
"<U+0399>O<U+0391><U+039D><U+039D><U+0399><U+039D><U+0391> ",
"<U+039A><U+0391><U+0392><U+0391><U+039B><U+0391> (<U+03A0><U+039F><U+039B><U+0397>) ",
"<U+039A><U+0391><U+0392><U+0391><U+039B><U+0391>(<U+0391><U+039C><U+03A5>G<U+0394><U+0391><U+039B><U+0395>O<U+039D><U+0391>S) ",
"<U+039A><U+0391><U+039B><U+0391><U+0392><U+03A1><U+03A5><U+03A4><U+0391> ",
"<U+039A><U+0391><U+039B><U+0391><U+039C><U+0391><U+03A4><U+0391> ",
"<U+039A><U+0391><U+039B><U+0391><U+039C><U+03A0><U+0391><U+039A><U+0391> ",
"<U+039A><U+0391><U+03A1><U+0394><U+0399><U+03A4>S<U+0391> ",
"<U+039A><U+0391><U+03A1><U+03A0><U+0391>T<U+039F>S_<U+0391>/<U+0394> ",
"<U+039A><U+0391><U+03A1><U+03A0><U+0391>T<U+039F>S_<U+03A0><U+039F><U+039B><U+0397> ",
"<U+039A><U+0391><U+03A1><U+03A0><U+0395><U+039D><U+0397>S<U+0399> ",
"<U+039A><U+0391><U+03A1><U+03A5>S<U+03A4><U+039F>S ",
"<U+039A><U+0391>S<U+039F>S ",
"<U+039A><U+0391>S<U+03A4><U+0395><U+039B><U+039B><U+0399> ",
"<U+039A><U+0391>S<U+03A4><U+039F><U+03A1><U+0399><U+0391> ",
"<U+039A><U+0395><U+03A1><U+039A><U+03A5><U+03A1><U+0391> ",
"<U+039A><U+039F><U+0396><U+0391><U+039D><U+0397> ",
"<U+039A><U+039F><U+039C><U+039F><U+03A4><U+0397><U+039D><U+0397> ",
"<U+039A><U+039F><U+039D><U+0399><U+03A4>S<U+0391> ",
"<U+039A><U+039F><U+03A1><U+0399><U+039D>T<U+039F>S ",
"<U+039A><U+03A5>T<U+0397><U+03A1><U+0391>_<U+0391>/<U+0394> ",
"<U+039A><U+03A5><U+039C><U+0397> ",
"<U+039A>OS ", "<U+039A>OS_<U+03A0><U+039F><U+039B><U+0397> ",
"<U+039B><U+0391><U+039C><U+0399><U+0391> ",
"<U+039B><U+0391><U+03A1><U+0399>S<U+0391> ",
"<U+039B><U+0395><U+03A1><U+039F>S ",
"<U+039B><U+0395><U+03A5><U+039A><U+0391><U+0394><U+0391> (<U+039D><U+0397>S<U+0399>) ",
"<U+039B><U+0395>O<U+039D><U+0399><U+0394><U+0399><U+039F> ",
"<U+039B><U+0397><U+039C><U+039D><U+039F>S ",
"<U+039B><U+0399><U+0394>O<U+03A1><U+0399><U+039A><U+0399> ",
"<U+039C><U+0391><U+039A><U+0395><U+0394><U+039F><U+039D><U+0399><U+0391> ",
"<U+039C><U+0391><U+03A1><U+0391>TO<U+039D><U+0391>S ",
"<U+039C><U+0395>TO<U+039D><U+0397> ",
"<U+039C><U+0395>S<U+039F><U+039B><U+039F>GG<U+0399> ",
"<U+039C><U+0397><U+039B><U+039F>S_<U+0391><U+039C>S ",
"<U+039C><U+03A5><U+039A><U+039F><U+039D><U+039F>S ",
"<U+039C><U+03A5><U+03A4><U+0399><U+039B><U+0397><U+039D><U+0397> ",
"<U+039D><U+0391><U+039E><U+039F>S ",
"<U+039D><U+0391><U+03A5><U+03A0><U+0391><U+039A><U+03A4><U+039F>S ",
"<U+039D><U+0391><U+03A5><U+03A0><U+039B><U+0399><U+039F> ",
"<U+039D><U+0395><U+0391> F<U+0399><U+039B><U+0391><U+0394><U+0395><U+039B>F<U+0395><U+0399><U+0391> ",
"<U+039E><U+0391><U+039D>T<U+0397> ",
"<U+039F><U+03A1><U+0395>S<U+03A4><U+0399><U+0391><U+0394><U+0391> ",
"<U+03A0><U+0391><U+0399><U+0391><U+039D><U+0399><U+0391> ",
"<U+03A0><U+0391><U+039B><U+0391><U+0399><U+039F><U+03A7>O<U+03A1><U+0391> ",
"<U+03A0><U+0391><U+03A1><U+039F>S_<U+0391>/<U+0394> ",
"<U+03A0><U+0391><U+03A4><U+03A1><U+0391> ",
"<U+03A0><U+0395><U+0399><U+03A1><U+0391><U+0399><U+0391>S ",
"<U+03A0><U+039F><U+039B><U+03A5>G<U+03A5><U+03A1><U+039F>S ",
"<U+03A0><U+039F><U+03A4><U+0399><U+0394><U+0391><U+0399><U+0391> ",
"<U+03A0><U+03A4><U+039F><U+039B><U+0395><U+039C><U+0391><U+0399><U+0394><U+0391> ",
"<U+03A0><U+03A5><U+03A1>G<U+039F>S ",
"<U+03A1><U+0391>F<U+0397><U+039D><U+0391> ",
"<U+03A1><U+0395>T<U+03A5><U+039C><U+039D><U+039F> ",
"<U+03A1><U+039F><U+0394><U+039F>S ",
"S<U+0391><U+039C><U+039F>S ",
"S<U+0395><U+0394><U+0395>S ",
"S<U+0395><U+03A1><U+03A1><U+0395>S ",
"S<U+0397><U+03A4><U+0395><U+0399><U+0391> ",
"S<U+039A><U+0399><U+0391>T<U+039F>S ",
"S<U+039A><U+039F><U+03A4><U+0399><U+039D><U+0391> ",
"S<U+039A><U+03A5><U+03A1><U+039F>S ",
"S<U+039F><U+03A5><U+0394><U+0391> ",
"S<U+039F><U+03A5>F<U+039B><U+0399> ",
"S<U+03A0><U+0391><U+03A1><U+03A4><U+0397> ",
"S<U+03A0><U+0391><U+03A4><U+0391>(<U+0392><U+0395><U+039D><U+0399><U+0396><U+0395><U+039B><U+039F>S) ",
"S<U+03A0><U+0395><U+03A4>S<U+0395>S ",
"S<U+03A4><U+0395>F<U+0391><U+039D><U+0399> (<U+039A><U+039F><U+03A1><U+0399><U+039D>T<U+0399><U+0391>S) ",
"S<U+03A5><U+039A><U+03A5>O<U+039D><U+0391> ",
"S<U+03A5><U+03A1><U+039F>S_<U+0391>/<U+0394> ",
"<U+03A4><U+0391><U+039D><U+0391>G<U+03A1><U+0391> ",
"<U+03A4><U+0391><U+03A4><U+039F><U+0399> (<U+0394><U+0395><U+039A><U+0395><U+039B><U+0395><U+0399><U+0391>) ",
"<U+03A4><U+0396><U+0395><U+03A1><U+039C><U+0399><U+0391><U+0394><U+0395>S ",
"<U+03A4><U+03A1><U+0399><U+039A><U+0391><U+039B><U+0391> <U+0397><U+039C><U+0391>T<U+0395><U+0399><U+0391>S ",
"<U+03A4><U+03A1><U+0399><U+039A><U+0391><U+039B><U+0391> T<U+0395>SS<U+0391><U+039B><U+0399><U+0391>S ",
"<U+03A4><U+03A1><U+0399><U+03A0><U+039F><U+039B><U+0397> ",
"<U+03A4><U+03A5><U+039C><U+03A0><U+0391><U+039A><U+0399> ",
"<U+03A4><U+03A5><U+03A1><U+0399><U+039D>T<U+0391> ",
"F<U+0391><U+03A1>S<U+0391><U+039B><U+0391> ",
"F<U+039B>O<U+03A1><U+0399><U+039D><U+0391> ",
"F<U+039F><U+03A5><U+03A1><U+039D><U+0397> ",
"F<U+03A5><U+03A7><U+03A4><U+0399><U+0391> ",
"<U+03A7><U+0391><U+039B><U+039A><U+0399><U+0394><U+0391> ",
"<U+03A7><U+0391><U+039D><U+0399><U+0391> ",
"<U+03A7><U+0399><U+039F>S ",
"<U+03A7><U+03A1><U+03A5>S<U+039F><U+03A5><U+03A0><U+039F><U+039B><U+0397>_<U+039A><U+0391><U+0392><U+0391><U+039B><U+0391> ",
"O<U+03A1><U+0395><U+039F><U+0399> "
)
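What you have there looks like plain 7-bit ASCII characters, with some attempt to encode Unicode code points by wrapping them like this: <U+abcd>.
As far as I know, this is not a recognised encoding for Unicode, partly because there would be no way to put a real < in your text. I suppose every < could be written as <U+jklm>, where jklm is the code for an angle bracket... but ick.
So, first of all, try to get UTF-8 encoded strings out of whatever generated this ASCII-encoded mess!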
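Before converting anything, it may be worth confirming that the strings really contain the literal characters "<U+0391>" rather than real Greek letters that the console merely displays in escaped form. The following is only a minimal sketch on two made-up sample strings (the vector x is an assumption, not your data); it uses stri_enc_isascii from stringi and base nchar:

```r
library(stringi)

# Two sample strings (assumed for illustration): the first holds literal
# "<U+XXXX>" text, the second holds real Greek characters.
x <- c("<U+0391>G<U+03A1><U+0399><U+039D><U+0399><U+039F> ",
       "\u0391G\u03a1\u0399\u039d\u0399\u039f ")

# TRUE when every byte is 7-bit ASCII, i.e. the escapes are literally in the data.
is_ascii <- stri_enc_isascii(x)

# Literal escapes inflate the length: each "<U+XXXX>" counts as 8 characters.
n_chars <- nchar(x, type = "chars")

print(is_ascii)
print(n_chars)
```

If is_ascii is FALSE and the character counts look right, the data is probably already Unicode and the problem lies in how it is displayed rather than in the strings themselves.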
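However... after some serious hair-pulling...
stringi to the rescue! With 'MTST' being your vector of things, first convert the angle-bracket notation to backslash-u notation and then use stri_unescape_unicode: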
> require(stringi)
> greek2=gsub(">","", gsub("<U\\+","\\\\u",MTST))
> stri_unescape_unicode(greek2)
[1] "ΑGΡΙΝΙΟ "
[2] "ΑGΧΙΑΛΟS "
[3] "ΑΙGΙΝΑ "
[4] "ΑΙGΙΟ "
[5] "ΑΙΔΗΨΟS "
[6] "ΑΚΤΙΟ(ΠΡΕΒΕΖΑ) "
and so on, all the way to:
[123] "FΥΧΤΙΑ "
[124] "ΧΑΛΚΙΔΑ "
[125] "ΧΑΝΙΑ "
[126] "ΧΙΟS "
[127] "ΧΡΥSΟΥΠΟΛΗ_ΚΑΒΑΛΑ "
[128] "OΡΕΟΙ "
This works once the odd missing commas and quotation marks in your "dput" data are fixed (I have edited your question to do that).
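One caveat with stripping every ">" is that a genuine ">" in the text would be removed too. A slightly stricter variant, sketched below, rewrites only well-formed escapes. The helper name decode_angle_escapes and the sample vector are made up for illustration, and the pattern assumes every escape has exactly four hex digits, as in your data:

```r
library(stringi)

# Hypothetical helper: turn literal "<U+XXXX>" escapes into real characters.
decode_angle_escapes <- function(x) {
  # Rewrite "<U+0391>" as "\u0391". Only "<U+" followed by exactly four hex
  # digits and ">" is matched, so any other "<" or ">" is left untouched.
  escaped <- gsub("<U\\+([0-9A-Fa-f]{4})>", "\\\\u\\1", x)
  # Interpret the \uXXXX sequences; the result is UTF-8.
  stri_unescape_unicode(escaped)
}

# Sample input (assumed), taken from the shape of the dput output.
x <- c("<U+0391>G<U+03A1><U+0399><U+039D><U+0399><U+039F> ",
       "<U+03A7><U+0399><U+039F>S ")
decoded <- decode_angle_escapes(x)
print(decoded)
```

Note that stri_unescape_unicode also interprets any other backslash sequences, so this sketch assumes the original strings contain no backslashes of their own.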