Convert UTF-8 to Latin-1 in PHP, turning every character above 255 into a character reference

Mik*_*rov 8 php character-encoding

I need to convert UTF-8 text to ISO-8859-1 in such a way that any character outside the ISO-8859-1 set becomes a numeric character reference (e.g. &#946; for β).

Example: I want to convert the text

hello é β 水

to

hello é &#946; &#27700;

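I am doing all of this in PHP. I have tried the built-in functions, iconv, tidy, and combinations of them, and I still cannot get a reliable solution.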

Here is what I have so far:

// convert any characters found in the entity table into HTML entities
// do not double encode entities, do not mess with quotes
// use UTF-8 as character encoding because the page submits UTF-8
$str = htmlentities($str,ENT_NOQUOTES,'UTF-8',false);
//print $str."\n";

// convert text from UTF-8 to ISO-8859-1, 
// characters that cannot be converted will be converted to ?
$str = utf8_decode($str);
//print $str."\n";    

// make string XML valid.
// mainly it converts text entities into numeric entities.
$opts = array(
    "output-xhtml"     => true,
    "output-xml"       => true,
    "show-body-only"   => true,
    "numeric-entities" => true,
    "wrap"             => 0,
    "indent"           => false,
    "char-encoding"    => 'latin1'
);
$tidy = tidy_parse_string($str, $opts,'latin1');
tidy_clean_repair($tidy);
$str = tidy_get_output($tidy);      
//print $str."\n";

bob*_*nce 11

You need multibyte support. In particular, mb_encode_numericentity():

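// Map code points U+0100..U+FFFF to numeric references (offset 0, mask 0xFFFF).
$convmap = array(0x0100, 0xFFFF, 0, 0xFFFF);
$encutf = mb_encode_numericentity($utf, $convmap, 'UTF-8');
// Remaining characters at or below U+00FF map directly to ISO-8859-1.
$iso = utf8_decode($encutf);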

(This does not touch <, &, ", and so on, so you may also need to call htmlspecialchars() beforehand.)
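
As a rough sketch, the steps above could be wrapped in one function. The name utf8_to_latin1_refs and the $escapeMarkup flag are made up for illustration and are not part of the answer. Two further assumptions go beyond it: the conversion map is widened to U+10FFFF so that characters outside the Basic Multilingual Plane (emoji, for instance) are also turned into references, and mb_convert_encoding() stands in for utf8_decode(), which is deprecated as of PHP 8.2. The source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper; the name and the $escapeMarkup flag are illustrative only.
function utf8_to_latin1_refs(string $utf8, bool $escapeMarkup = false): string
{
    if ($escapeMarkup) {
        // Escape &, < and > first; doing this afterwards would
        // turn the "&" of every "&#...;" reference into "&amp;".
        $utf8 = htmlspecialchars($utf8, ENT_NOQUOTES, 'UTF-8');
    }

    // Assumption: cover U+0100..U+10FFFF rather than stopping at U+FFFF.
    // Format: start code point, end code point, offset, mask.
    $convmap = [0x0100, 0x10FFFF, 0, 0x1FFFFF];
    $encoded = mb_encode_numericentity($utf8, $convmap, 'UTF-8');

    // Only code points up to U+00FF remain, so nothing is lost here.
    return mb_convert_encoding($encoded, 'ISO-8859-1', 'UTF-8');
}

$latin1 = utf8_to_latin1_refs("hello é β 水");
// $latin1 is "hello \xE9 &#946; &#27700;" (é stored as the single byte 0xE9)
```

Escaping is optional here because the question wanted existing markup left alone; pass true as the second argument when the input is plain text that should not be interpreted as HTML.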