Sve*_*rre 19 windows unicode r utf-8
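Although R seems to handle Unicode characters well internally, I can't output a data frame containing such UTF-8 Unicode characters from R. Is there a way to force this?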
data.frame(c("hīersumian","ǣmettigan"))->test
write.table(test,"test.txt",row.names=F,col.names=F,quote=F,fileEncoding="UTF-8")
The output text file looks like this:
hiersumian <U+01E3>mettigan
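I am using R version 3.0.2 in a Windows environment (Windows 7).
Edit
It has been suggested in the answers that R writes the file correctly in UTF-8 and that the problem lies with the software I use to view the file. Here is some code where I do everything in R. I read in a UTF-8 encoded text file, and R reads it correctly. R then writes the file out as UTF-8 and reads it back in, and now the correct Unicode characters are gone.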
read.table("myinputfile.txt",encoding="UTF-8")->myinputfile
myinputfile[1,1]
write.table(myinputfile,"myoutputfile.txt",row.names=F,col.names=F,quote=F,fileEncoding="UTF-8")
read.table("myoutputfile.txt",encoding="UTF-8")->myoutputfile
myoutputfile[1,1]
Console output:
> read.table("myinputfile.txt",encoding="UTF-8")->myinputfile
> myinputfile[1,1]
[1] hīersumian
Levels: hīersumian ǣmettigan
> write.table(myinputfile,"myoutputfile.txt",row.names=F,col.names=F,quote=F,fileEncoding="UTF-8")
> read.table("myoutputfile.txt",encoding="UTF-8")->myoutputfile
> myoutputfile[1,1]
[1] <U+FEFF>hiersumian
Levels: <U+01E3>mettigan <U+FEFF>hiersumian
>
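The <U+FEFF> in front of the first value is a byte-order mark (BOM), which suggests the input file starts with one. A minimal sketch, assuming that is the case: `encoding = "UTF-8"` only marks the strings that were read as UTF-8, while `fileEncoding = "UTF-8-BOM"` (accepted for reading since R 3.0.0) re-encodes the input and drops the BOM. The temporary file and its ASCII contents are made up for illustration.

```r
# Assumption: a small file that begins with a UTF-8 byte-order mark (EF BB BF).
path <- tempfile(fileext = ".txt")   # hypothetical input file
con <- file(path, open = "wb")
writeBin(c(as.raw(c(0xEF, 0xBB, 0xBF)), charToRaw("abc\ndef\n")), con)
close(con)

# encoding = "UTF-8" does not re-encode the input, it only marks the strings,
# so a leading BOM can end up inside the first value.
with_bom <- read.table(path, encoding = "UTF-8", stringsAsFactors = FALSE)

# fileEncoding = "UTF-8-BOM" re-encodes on input and removes the BOM.
no_bom <- read.table(path, fileEncoding = "UTF-8-BOM", stringsAsFactors = FALSE)

# Compare the first value read each way.
print(nchar(with_bom[1, 1]))
print(nchar(no_bom[1, 1]))
```

This only addresses the stray <U+FEFF>; it does not by itself explain why "ī" is turned into "i" on output.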
Raf*_*ael 10
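The purpose of this "answer" is to make clear that something odd is going on behind the scenes:
"hīersumian" doesn't even make it into the data frame. In all cases the "ī" character is converted to "i".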
options("encoding" = "native.enc")
t1 <- data.frame(a = c("hīersumian "), stringsAsFactors=F)
t1
# a
# 1 hiersumian
options("encoding" = "UTF-8")
t1 <- data.frame(a = c("hīersumian "), stringsAsFactors=F)
t1
# a
# 1 hiersumian
options("encoding" = "UTF-16")
t1 <- data.frame(a = c("hīersumian "), stringsAsFactors=F)
t1
# a
# 1 hiersumian
The following sequence successfully writes "ǣmettigan" to a text file:
t2 <- data.frame(a = c("ǣmettigan"), stringsAsFactors=F)
getOption("encoding")
# [1] "native.enc"
Encoding(t2[,"a"]) <- "UTF-16"
write.table(t2,"test.txt",row.names=F,col.names=F,quote=F)
It does not work with "encoding" set to "UTF-8" or "UTF-16", and specifying "fileEncoding" leads either to defects or to no output.
A bit disappointing; until now I had managed to fix every Unicode problem one way or another.
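One possible workaround, sketched here under the assumption that the trouble comes from text-mode connections converting through the native Windows code page: convert the strings to UTF-8 yourself and write the raw bytes through a binary connection, so nothing is re-encoded on output. The temporary file path and the \u escapes are illustrative choices, not taken from the original post.

```r
# Build the strings from \u escapes so the script's own encoding does not matter.
x <- c("h\u012bersumian", "\u01e3mettigan")

# Join the lines, make sure the result is UTF-8, and take its raw bytes.
txt <- enc2utf8(paste0(paste(x, collapse = "\n"), "\n"))
bytes <- charToRaw(txt)

# A binary connection writes the bytes untouched.
path <- tempfile(fileext = ".txt")   # hypothetical output location
con <- file(path, open = "wb")
writeBin(bytes, con)
close(con)

# Read the lines back and mark them as UTF-8.
back <- readLines(path, encoding = "UTF-8")
print(back)
```

For a full data frame the same idea would need the rows pasted into lines first (for example with `paste` or `apply`), which this sketch leaves out.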