Data conversion in Ruby

use*_*191 3 ruby ruby-on-rails

I'm having trouble with these conversions:

string = "test \\ud83d\\ude01" # into '1f601' and vice versa

unicode_value = 'U+1F601' #into string '\\ud83d\\ude01'

I tried this to encode it:

string.encode('utf-8') #output is "test \\ud83d\\ude01"

I also tried this:

string.force_encoding('utf-8')  #output is "test \\ud83d\\ude01"

Thanks.

Eri*_*nil 5

Hex to Unicode characters

"\ud83d\ude01" to smiley

"\\ud83d\\ude01" looks like UTF-16 (hex): two 16-bit code units that together form a surrogate pair. Note that it is a plain ASCII string: ["\\", "u", "d", "8", "3", "d", "\\", "u", "d", "e", "0", "1"]

str = "\\ud83d\\ude01"
hex = str.gsub("\\u", '')

smiley = [hex].pack('H*').force_encoding('utf-16be').encode('utf-8')
puts smiley
#=> 😁
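To see why those two escapes map to a single character, here is a minimal sketch of the surrogate-pair arithmetic that the UTF-16 to UTF-8 conversion performs internally. The helper name `surrogates_to_codepoint` is made up for illustration and is not part of Ruby or of the answer above.

```ruby
# Hypothetical helper: combine a UTF-16 high/low surrogate pair into one code point.
def surrogates_to_codepoint(high, low)
  unless (0xD800..0xDBFF).cover?(high) && (0xDC00..0xDFFF).cover?(low)
    raise ArgumentError, 'not a valid surrogate pair'
  end

  # Each surrogate carries 10 bits of the value; the result is offset by 0x10000.
  0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
end

codepoint = surrogates_to_codepoint(0xD83D, 0xDE01)
puts format('U+%X', codepoint)
#=> U+1F601
```

In practice the `pack`/`force_encoding`/`encode` chain does this for you; the arithmetic is only here to show what the two hex groups mean.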

"U+1F601" to smiley

This is a Unicode code point written in hexadecimal. Note that "U+1F601" is also a plain ASCII string: ["U", "+", "1", "F", "6", "0", "1"]

unicode_value = 'U+1F601'
hex = unicode_value.sub('U+', '')
smiley = hex.to_i(16).chr('UTF-8')
puts smiley
#=> 😁
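If the input might not always be a well-formed "U+XXXX" label, a slightly more defensive variant could look like the sketch below. It assumes 4 to 6 hex digits after "U+", and the helper name `codepoint_label_to_char` is hypothetical. It uses `Array#pack('U')` instead of `Integer#chr`, which also yields a UTF-8 string.

```ruby
# Hypothetical helper: turn a "U+XXXX" label into the UTF-8 character it names.
def codepoint_label_to_char(label)
  match = /\AU\+([0-9A-Fa-f]{4,6})\z/.match(label)
  raise ArgumentError, "not a U+XXXX label: #{label.inspect}" unless match

  # 'U' packs an integer code point as a UTF-8 encoded character.
  [match[1].hex].pack('U')
end

puts codepoint_label_to_char('U+1F601')
```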

Unicode code point (hex) ⟷ UTF-16 (hex)

Combining the two methods above:

"\ud83d\ude01" to 'U+1F601'

str = "\\ud83d\\ude01"
utf16_hex = str.gsub("\\u", '')
smiley = [utf16_hex].pack('H*').force_encoding('utf-16be').encode('utf-8')
# String#ord returns the Unicode code point, not the UTF-8 bytes
codepoint_hex = smiley.ord.to_s(16).upcase
new_str = "U+#{codepoint_hex}"
puts new_str
#=> U+1F601

'U+1F601' to "\ud83d\ude01"

unicode_value = 'U+1F601'
hex = unicode_value.sub('U+', '')
smiley = hex.to_i(16).chr('UTF-8')
puts smiley.force_encoding('utf-8').encode('utf-16be').unpack('H*').first.gsub(/(....)/, '\u\1')
#=> \ud83d\ude01
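The same direction can also be sketched by unpacking the UTF-16BE bytes as 16-bit integers rather than slicing a hex string every four characters. The helper name `char_to_utf16_escapes` is an assumption for illustration. Characters inside the Basic Multilingual Plane come out as a single escape and everything above it as a surrogate pair.

```ruby
# Hypothetical helper: write each UTF-16 code unit of a string as a \uXXXX escape.
def char_to_utf16_escapes(text)
  text.encode('UTF-16BE')                        # transcode to big-endian UTF-16
      .unpack('n*')                              # read the bytes as 16-bit unsigned integers
      .map { |unit| format('\u%04x', unit) }     # single quotes keep the backslash literal
      .join
end

puts char_to_utf16_escapes("\u{1F601}")
#=> \ud83d\ude01
```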

There may be a simpler way to do this, but I couldn't find one.
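One candidate for a simpler route, offered only as a sketch: the `\uXXXX` notation is the same one JSON uses for string escapes, so the standard `json` library can decode it, surrogate pairs included. The helper name `json_unescape` is hypothetical, and the approach assumes the text contains no double quotes, control characters, or other backslash sequences that JSON would reject.

```ruby
require 'json'

# Hypothetical helper: decode \uXXXX escapes by parsing the text as a JSON string.
# The value is wrapped in an array so that the document is valid JSON on any
# version of the json library.
def json_unescape(escaped)
  JSON.parse(%(["#{escaped}"])).first
end

puts json_unescape("\\ud83d\\ude01")
```

If the input can contain arbitrary text, the regex-based replacement is the safer choice, since it only touches the escapes themselves.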

Using this code

def utf16_hex_to_unicode_char(utf16_hex)
  hex = utf16_hex.gsub("\\u", '')
  [hex].pack('H*').force_encoding('utf-16be').encode('utf-8')
end

def replace_all_utf16_hex(string)
  string.gsub(/(\\u[0-9a-fA-F]{4}){2}/) { |hex| utf16_hex_to_unicode_char(hex) }
end

puts replace_all_utf16_hex("Hello \\ud83d\\ude01, I just bought a \\uD83D\\uDC39")
#=> Hello 😁, I just bought a 🐹
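The pattern above only matches escapes that come in pairs, so a lone escape such as `\u00e9` would be left untouched. A possible generalization, sketched here under the assumption that every run of escapes is valid UTF-16, matches any run and decodes it in one go. The method name `unescape_utf16` is made up for this example.

```ruby
# Hypothetical helper: decode every run of \uXXXX escapes, single or paired.
def unescape_utf16(string)
  string.gsub(/(?:\\u[0-9a-fA-F]{4})+/) do |run|
    # Collect the 16-bit code units of this run as integers.
    units = run.scan(/\\u([0-9a-fA-F]{4})/).flatten.map(&:hex)
    # Pack them as big-endian 16-bit values and transcode to UTF-8.
    # An unpaired surrogate raises Encoding::InvalidByteSequenceError here.
    units.pack('n*').force_encoding('UTF-16BE').encode('UTF-8')
  end
end

puts unescape_utf16('Hello \ud83d\ude01, caf\u00e9 \uD83D\uDC39')
```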