How do I count characters with nchar in R?

use*_*675 0 r char

I tried the following code:

j <- "*Politics:* Disgraced peer Jeffrey Archer is set to make \xa31m from his Belmarsh "
nchar(j)
# Error in nchar(j) : invalid multibyte string 1

As you can see, nchar() fails on this string. How can I fix this?

Das*_*son 7

If you know the specific encoding of the string, you can use iconv to convert it to one that works better:

j <- "*Politics:* Disgraced peer Jeffrey Archer is set to make \xa31m from his Belmarsh "
iconv(j, "ISO-8859-1", "UTF-8")
#[1] "*Politics:* Disgraced peer Jeffrey Archer is set to make £1m from his Belmarsh "
nchar(iconv(j, "ISO-8859-1", "UTF-8"))
#[1] 79

I wrote your text to a file and checked the encoding with geany, which is how I arrived at ISO-8859-1.
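If you would rather stay inside R, a rough sketch like the one below can at least confirm that the string is not valid UTF-8 and show which bytes cause the trouble. It assumes R 3.3.0 or later for validUTF8(); the variable names are only illustrative, and it does not identify the encoding for you: it only narrows things down.

```r
# Same string as in the question; \xa3 is a single raw byte here.
j <- "*Politics:* Disgraced peer Jeffrey Archer is set to make \xa31m from his Belmarsh "

# FALSE: a lone 0xa3 byte is not a valid UTF-8 sequence.
is_utf8 <- validUTF8(j)

# Inspect the raw bytes and keep only the non-ASCII ones (values above 127).
raw_bytes <- charToRaw(j)
high_bytes <- raw_bytes[as.integer(raw_bytes) > 127L]

print(is_utf8)
print(high_bytes)
```

A single high byte of a3 in a position where a pound sign makes sense is consistent with ISO-8859-1, which is what geany reported.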

An alternative route that does not require you to figure out the encoding is to use type = "bytes" instead of manually converting to UTF-8:

nchar(j, type = "bytes")
#[1] 79

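I recommend reading the help file for nchar (?nchar), because there are subtle differences between the default type ("chars") and type = "bytes": the first counts characters, the second counts the bytes used to store them, and the two only agree when every character occupies a single byte.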
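To make that difference concrete, here is a small sketch comparing the two types on the string from the question. It assumes, as above, that the source encoding is ISO-8859-1 and that your platform's iconv supports that name; the variable u is just an illustrative name.

```r
j <- "*Politics:* Disgraced peer Jeffrey Archer is set to make \xa31m from his Belmarsh "

# Convert to UTF-8, where the pound sign takes two bytes instead of one.
u <- iconv(j, "ISO-8859-1", "UTF-8")

chars_utf8  <- nchar(u, type = "chars")  # 79 characters
bytes_utf8  <- nchar(u, type = "bytes")  # 80 bytes: the pound sign is 2 bytes in UTF-8
bytes_orig  <- nchar(j, type = "bytes")  # 79 bytes: 1 byte per character in ISO-8859-1

print(c(chars_utf8, bytes_utf8, bytes_orig))
```

So type = "bytes" happens to give the right character count here only because the original string uses a single-byte encoding. For text that is already UTF-8 and contains non-ASCII characters, the byte count will be larger than the character count.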