Ali*_*xel 21

header('Content-Encoding: UTF-8');

function mb_html_entity_decode($string)
{
    if (extension_loaded('mbstring') === true)
    {
        mb_language('Neutral');
        mb_internal_encoding('UTF-8');
        mb_detect_order(array('UTF-8', 'ISO-8859-15', 'ISO-8859-1', 'ASCII'));

        return mb_convert_encoding($string, 'UTF-8', 'HTML-ENTITIES');
    }

    return html_entity_decode($string, ENT_COMPAT, 'UTF-8');
}

function mb_ord($string)
{
    if (extension_loaded('mbstring') === true)
    {
        mb_language('Neutral');
        mb_internal_encoding('UTF-8');
        mb_detect_order(array('UTF-8', 'ISO-8859-15', 'ISO-8859-1', 'ASCII'));

        $result = unpack('N', mb_convert_encoding($string, 'UCS-4BE', 'UTF-8'));

        if (is_array($result) === true)
        {
            return $result[1];
        }
    }

    return ord($string);
}

function mb_chr($string)
{
    return mb_html_entity_decode('&#' . intval($string) . ';');
}

var_dump(hexdec('010F'));

var_dump(mb_ord('ó')); // 243
var_dump(mb_chr(243)); // ó
Run Code Online (Sandbox Code Playgroud)

  • 使用此代码并升级到 PHP 7.2 mbstring 的任何人都需要记住将这些函数包装在 `if (!function_exists('mb_ord')` 和 `if (!function_exists('mb_chr')` 中,或者将它们完全删除为它们现在被添加到 7.2 mbstring (2认同)

Joh*_*ers 5

我只是写了一个polyfill缺少多字节版本的ordchr考虑到以下几点:

  • 它定义函数,mb_ord并且mb_chr仅当它们不存在时才定义。如果它们确实存在于您的框架或 PHP 的某个未来版本中,则 polyfill 将被忽略。

  • 它使用广泛使用的mbstring扩展来进行转换。如果mbstring未加载扩展名,它将改用iconv扩展名。


编辑 :

我添加了 HTMLentities 编码/解码和编码/解码为 JSON 格式的函数以及一些关于如何使用这些函数的演示代码


代码

if (!function_exists('codepoint_encode')) {
    function codepoint_encode($str) {
        return substr(json_encode($str), 1, -1);
    }
}

if (!function_exists('codepoint_decode')) {
    function codepoint_decode($str) {
        return json_decode(sprintf('"%s"', $str));
    }
}

if (!function_exists('mb_internal_encoding')) {
    function mb_internal_encoding($encoding = NULL) {
        return ($from_encoding === NULL) ? iconv_get_encoding() : iconv_set_encoding($encoding);
    }
}

if (!function_exists('mb_convert_encoding')) {
    function mb_convert_encoding($str, $to_encoding, $from_encoding = NULL) {
        return iconv(($from_encoding === NULL) ? mb_internal_encoding() : $from_encoding, $to_encoding, $str);
    }
}

if (!function_exists('mb_chr')) {
    function mb_chr($ord, $encoding = 'UTF-8') {
        if ($encoding === 'UCS-4BE') {
            return pack("N", $ord);
        } else {
            return mb_convert_encoding(mb_chr($ord, 'UCS-4BE'), $encoding, 'UCS-4BE');
        }
    }
}

if (!function_exists('mb_ord')) {
    function mb_ord($char, $encoding = 'UTF-8') {
        if ($encoding === 'UCS-4BE') {
            list(, $ord) = (strlen($char) === 4) ? @unpack('N', $char) : @unpack('n', $char);
            return $ord;
        } else {
            return mb_ord(mb_convert_encoding($char, 'UCS-4BE', $encoding), 'UCS-4BE');
        }
    }
}

if (!function_exists('mb_htmlentities')) {
    function mb_htmlentities($string, $hex = true, $encoding = 'UTF-8') {
        return preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($match) use ($hex) {
            return sprintf($hex ? '&#x%X;' : '&#%d;', mb_ord($match[0]));
        }, $string);
    }
}

if (!function_exists('mb_html_entity_decode')) {
    function mb_html_entity_decode($string, $flags = null, $encoding = 'UTF-8') {
        return html_entity_decode($string, ($flags === NULL) ? ENT_COMPAT | ENT_HTML401 : $flags, $encoding);
    }
}
Run Code Online (Sandbox Code Playgroud)

如何使用

echo "Get string from numeric DEC value\n";
var_dump(mb_chr(50319, 'UCS-4BE'));
var_dump(mb_chr(271));

echo "\nGet string from numeric HEX value\n";
var_dump(mb_chr(0xC48F, 'UCS-4BE'));
var_dump(mb_chr(0x010F));

echo "\nGet numeric value of character as DEC string\n";
var_dump(mb_ord('?', 'UCS-4BE'));
var_dump(mb_ord('?'));

echo "\nGet numeric value of character as HEX string\n";
var_dump(dechex(mb_ord('?', 'UCS-4BE')));
var_dump(dechex(mb_ord('?')));

echo "\nEncode / decode to DEC based HTML entities\n";
var_dump(mb_htmlentities('tchüß', false));
var_dump(mb_html_entity_decode('tchüß'));

echo "\nEncode / decode to HEX based HTML entities\n";
var_dump(mb_htmlentities('tchüß'));
var_dump(mb_html_entity_decode('tchüß'));

echo "\nUse JSON encoding / decoding\n";
var_dump(codepoint_encode("tchüß"));
var_dump(codepoint_decode('tch\u00fc\u00df'));
Run Code Online (Sandbox Code Playgroud)

输出

Get string from numeric DEC value
string(4) "?"
string(2) "?"

Get string from numeric HEX value
string(4) "?"
string(2) "?"

Get numeric value of character as DEC int
int(50319)
int(271)

Get numeric value of character as HEX string
string(4) "c48f"
string(3) "10f"

Encode / decode to DEC based HTML entities
string(15) "tchüß"
string(7) "tchüß"

Encode / decode to HEX based HTML entities
string(15) "tchüß"
string(7) "tchüß"

Use JSON encoding / decoding
string(15) "tch\u00fc\u00df"
string(7) "tchüß"
Run Code Online (Sandbox Code Playgroud)