How to use read.table with "Hebrew" column names (in R)?

Tal*_*ili 3 r utf-8 hebrew

I am trying to read a .txt file that has Hebrew column names, but without success.

I uploaded an example file to: http://www.talgalili.com/files/aa.txt

I am trying this command:

read.table("http://www.talgalili.com/files/aa.txt", header = T, sep = "\t")

This returns:

  X.....ª X...ª...... X...œ....
1      12          97         6
2     123         354        44
3       6           1         3

Instead of:

??? ?????   ????
12  97  6
123 354 44
6   1   3

My output for:

l10n_info()

is:

$MBCS
[1] FALSE

$`UTF-8`
[1] FALSE

$`Latin-1`
[1] TRUE

$codepage
[1] 1252

And for:

Sys.getlocale()

it is:

[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"

Can you tell me what I should try changing so that the file loads correctly?

Update: I tried using:

read.table("http://www.talgalili.com/files/aa.txt",fileEncoding ="iso8859-8")

which resulted in:

 V1
1  ?
Warning messages:
1: In read.table("http://www.talgalili.com/files/aa.txt", fileEncoding = "iso8859-8") :
  invalid input found on input connection 'http://www.talgalili.com/files/aa.txt'
2: In read.table("http://www.talgalili.com/files/aa.txt", fileEncoding = "iso8859-8") :
  incomplete final line found by readTableHeader on 'http://www.talgalili.com/files/aa.txt'

I also tried this:

Sys.setlocale("LC_ALL", "en_US.UTF-8")

or this:

Sys.setlocale("LC_ALL", "en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8")

which gave me:

[1] ""
Warning message:
In Sys.setlocale("LC_ALL", "en_US.UTF-8") :
  OS reports request to set locale to "en_US.UTF-8" cannot be honored

Finally, here is my sessionInfo():

R version 2.10.1 (2009-12-14) 
i386-pc-mingw32 

locale:
[1] LC_COLLATE=English_United States.1255  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_2.10.1

Any suggestions or clarification would be much appreciated.

Best, Tal

Geo*_*tas 5

I would try passing the fileEncoding argument to read.table, with the value iso8859-8.
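A minimal sketch of that idea, assuming the file really is encoded in ISO 8859-8 and that the R session can represent Hebrew (for example a UTF-8 locale). The temporary file, its two-letter column names, and check.names = FALSE are my own additions for illustration, not taken from the question; the encoding is spelled "ISO-8859-8" here, and the accepted spelling can differ between platforms.

```r
# Build a tiny tab-separated file whose header is encoded in ISO 8859-8.
# In that encoding the Hebrew letters alef..dalet are the bytes 0xE0..0xE3.
tmp <- tempfile(fileext = ".txt")
header <- as.raw(c(0xE0, 0xE1, 0x09, 0xE2, 0xE3, 0x0A))  # two names, tab-separated
body <- charToRaw("12\t97\n123\t354\n6\t1\n")
writeBin(c(header, body), tmp)

# fileEncoding tells read.table how the bytes on disk are encoded;
# check.names = FALSE (an assumption of this sketch) stops R from
# rewriting the column names into syntactically "safe" ones.
df <- read.table(tmp, header = TRUE, sep = "\t",
                 fileEncoding = "ISO-8859-8",
                 check.names = FALSE)
print(df)
unlink(tmp)
```

If the file is actually stored in another encoding (UTF-8, say), the same call should work with that encoding name in fileEncoding instead.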

Use iconvlist() to get an alphabetical list of the supported encodings. As I saw here, Hebrew should be covered by part 8 of ISO 8859 (ISO 8859-8).
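As a rough sketch of how to check this on your own machine: the snippet below searches iconvlist() for names mentioning 8859-8 and then decodes two sample bytes with iconv. The sample bytes and variable names are made up for illustration, and the list of names returned is platform-dependent.

```r
# List the encoding names this R build knows that mention 8859-8.
# The result depends on the platform and may be empty on some systems.
candidates <- grep("8859-8", iconvlist(), value = TRUE, fixed = TRUE)
print(candidates)

# Decode two ISO 8859-8 bytes (0xE0 and 0xE1, i.e. alef and bet) to UTF-8.
# iconv accepts a list of raw vectors, so no locale-dependent string is needed.
bytes <- as.raw(c(0xE0, 0xE1))
decoded <- iconv(list(bytes), from = "ISO-8859-8", to = "UTF-8")

# Show the Unicode code points of the result (U+05D0 and U+05D1 are expected).
print(utf8ToInt(decoded))
```

If iconv returns NA here, the encoding name is probably spelled differently on that system, and one of the names found in candidates is worth trying.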