Does std::ctype always classify characters according to the "C" locale?

Question

tem*_*boy 5 c++ unicode locale character-encoding

cppreference says that std::ctype provides character classification based on the classic "C" locale. Is this still true when we create a locale like this:

std::locale loc(std::locale("en_US.UTF8"), new std::ctype<char>);

Will loc still classify characters according to the "C" locale, or according to the Unicode one? If the former, why do we even specify the locale name as "en_US.UTF8"?

Answer 1

Cub*_*bbi 2

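The standard requires a default-constructed std::ctype<char> to match the minimal "C" locale, via §22.4.1.3.3 [facet.ctype.char.statics]/1:

static const mask* classic_table() noexcept;

Returns: A pointer to the initial element of an array of size table_size which represents the classifications of characters in the "C" locale.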

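The classification member function is() is defined in terms of table(), which returns classic_table() unless another table was supplied to the constructor of ctype<char>.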

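I have updated cppreference to match these requirements more accurately (it also said "C" for std::ctype<wchar_t>).

To answer your second question: a locale constructed with std::locale loc(std::locale("en_US.UTF8"), new std::ctype<char>); will use the ctype facet you specified (and therefore "C") to classify narrow characters, but this is redundant. The narrow-character classification of a plain std::locale("en_US.UTF8") happens to be exactly the same, at least in the GNU implementation: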

#include <cassert>
#include <cstddef>
#include <locale>

int main()
{
    // Narrow-character classification table of the named locale.
    std::locale loc1("en_US.UTF8");
    const std::ctype_base::mask* tbl1 =
         std::use_facet<std::ctype<char>>(loc1).table();

    // Same locale, with its ctype<char> facet replaced by a default-constructed one.
    std::locale loc2(std::locale("en_US.UTF8"), new std::ctype<char>);
    const std::ctype_base::mask* tbl2 =
         std::use_facet<std::ctype<char>>(loc2).table();

    // Every entry of the two tables must agree.
    for (std::size_t n = 0; n < 256; ++n)
        assert(tbl1[n] == tbl2[n]);
}

