Unicode to UTF-8 in C++

Ezr*_*zra 5 c++ unicode boost utf-8

I have searched a lot but could not find anything that does this:

unsigned int unicodeChar = 0x5e9;
unsigned int utf8Char;
uni2utf8(unicodeChar, utf8Char);
assert(utf8Char == 0xd7a9);

Is there a library (preferably Boost) that implements something like uni2utf8?

Phi*_*ipp 14

Unicode conversion is part of C++11 (although <codecvt> and std::wstring_convert were later deprecated in C++17):

#include <codecvt>
#include <locale>
#include <string>
#include <cassert>

int main() {
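  // Converter between UTF-32 (char32_t) and a UTF-8 encoded std::string.
  std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t> convert;
  // U+05E9 should encode to the two bytes 0xD7 0xA9.
  std::string utf8 = convert.to_bytes(char32_t(0x5e9));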
  assert(utf8.length() == 2);
  assert(utf8[0] == '\xD7');
  assert(utf8[1] == '\xA9');
}

  • You don't need codecvt_utf8. `codecvt<char32_t, char, std::mbstate_t>` converts between UTF-32 and UTF-8, and `codecvt<char16_t, char, std::mbstate_t>` converts between UTF-16 and UTF-8. (3 upvotes)
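
For what it's worth, the approach from the comment above might look roughly like the sketch below. std::codecvt has a protected destructor, so it cannot be handed to std::wstring_convert directly; the `deletable_facet` wrapper and the `utf32_to_utf8` function are helper names made up for this illustration, not standard library names. This also relies on facilities that were deprecated in later standards (wstring_convert in C++17, this codecvt specialization in C++20), so expect deprecation warnings on newer compilers.

```cpp
#include <cassert>
#include <cwchar>
#include <locale>
#include <string>

// Illustrative helper (not part of the standard library): exposes a public
// destructor so the facet can be owned by std::wstring_convert.
template <class Facet>
struct deletable_facet : Facet {
  using Facet::Facet;  // inherit the facet's constructors
  ~deletable_facet() {}
};

// Hypothetical convenience function: UTF-32 string in, UTF-8 bytes out.
std::string utf32_to_utf8(const std::u32string& in) {
  using facet = deletable_facet<std::codecvt<char32_t, char, std::mbstate_t>>;
  std::wstring_convert<facet, char32_t> conv;
  return conv.to_bytes(in);  // throws std::range_error on invalid input
}
```

With this, `utf32_to_utf8(U"\u05E9")` should give the same two bytes, 0xD7 0xA9, as the codecvt_utf8 version in the answer.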

Phi*_*ipp 10

Boost.Locale also provides functions for encoding conversion:

#include <boost/locale.hpp>
#include <cassert>
#include <string>

int main() {
  unsigned int point = 0x5e9;
  std::string utf8 = boost::locale::conv::utf_to_utf<char>(&point, &point + 1);
  assert(utf8.length() == 2);
  assert(utf8[0] == '\xD7');
  assert(utf8[1] == '\xA9');
}
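
If neither <codecvt> nor Boost is an option, the encoding itself is simple enough to write by hand. The following is only a minimal sketch of the standard UTF-8 bit layout: `encode_utf8` and `pack_utf8` are hypothetical names chosen for this example, and `pack_utf8` exists only to reproduce the integer form (0xd7a9) that the question asks for.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: encode a single Unicode code point as UTF-8 bytes.
std::string encode_utf8(char32_t cp) {
  std::string out;
  if (cp <= 0x7F) {                       // 1 byte:  0xxxxxxx
    out += static_cast<char>(cp);
  } else if (cp <= 0x7FF) {               // 2 bytes: 110xxxxx 10xxxxxx
    out += static_cast<char>(0xC0 | (cp >> 6));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp <= 0xFFFF) {              // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    if (cp >= 0xD800 && cp <= 0xDFFF)
      throw std::invalid_argument("surrogates are not valid code points");
    out += static_cast<char>(0xE0 | (cp >> 12));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else if (cp <= 0x10FFFF) {            // 4 bytes: 11110xxx + 3 continuation
    out += static_cast<char>(0xF0 | (cp >> 18));
    out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (cp & 0x3F));
  } else {
    throw std::invalid_argument("code point above U+10FFFF");
  }
  return out;
}

// Hypothetical helper: pack up to four UTF-8 bytes into one integer,
// most significant byte first, as in the question's 0xd7a9.
std::uint32_t pack_utf8(const std::string& bytes) {
  if (bytes.size() > 4) throw std::invalid_argument("more than 4 bytes");
  std::uint32_t packed = 0;
  for (unsigned char b : bytes) packed = (packed << 8) | b;
  return packed;
}
```

For the code point from the question, `encode_utf8(0x5e9)` produces the bytes 0xD7 0xA9, and `pack_utf8(encode_utf8(0x5e9))` gives 0xd7a9. Returning a std::string rather than an integer is probably the more practical interface, since the byte order of a packed integer is easy to get wrong.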