Chr*_*oft 87
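A regular expression replacement is the best option here. Take $str as an example string and match it against [:print:], which is a POSIX character class: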
$str = 'aAÂ';
$str = preg_replace('/[[:^print:]]/', '', $str); // should be aA
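What [:print:] does is match every printable character. Its negation, [:^print:], matches every non-printable character. Anything that is not part of the current character set is removed.
Note: before using this method, you must make sure that your current character set is ASCII. POSIX character classes support both ASCII and Unicode, and they match only according to the current character set. As of PHP 5.6, the default charset is UTF-8.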
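The precondition in the note can be checked in code before the POSIX class is applied. This is only a sketch: `stripNonPrintableIfAscii` is a hypothetical helper name, and returning `null` for non-ASCII input is an assumption made for illustration, not something the answer prescribes.

```php
<?php
// Hypothetical helper: apply [[:^print:]] only when the input is pure 7-bit ASCII.
function stripNonPrintableIfAscii(string $str): ?string
{
    // Every byte must fall in the range 0x00-0x7F.
    if (preg_match('/^[\x00-\x7F]*$/D', $str) !== 1) {
        return null; // assumption: the caller decides what to do with non-ASCII input
    }
    // Remove everything the POSIX class does not consider printable.
    return preg_replace('/[[:^print:]]/', '', $str);
}

// A tab and a bell character are control characters, so both are removed.
var_dump(stripNonPrintableIfAscii("a\tb\x07c"));
```

Checking first keeps the behaviour predictable: the regex only ever sees bytes whose printability does not depend on the character set.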
Dam*_*irR 39
Do you want only ASCII printable characters?
Use this:
<?php
header('Content-Type: text/html; charset=UTF-8');
$str = "abqwreš??žsff";
$res = preg_replace('/[^\x20-\x7E]/','', $str);
echo "($str)($res)";
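To see what the range-based pattern does to multi-byte input, here is a minimal sketch. `asciiPrintableOnly` is a hypothetical wrapper name, and the sample string uses byte escapes so the example does not depend on the encoding of the source file.

```php
<?php
// Hypothetical wrapper around the range-based pattern from the answer.
function asciiPrintableOnly(string $str): string
{
    // Without the /u modifier the pattern works byte by byte, so every byte
    // of a multi-byte UTF-8 character falls outside 0x20-0x7E and is dropped.
    return preg_replace('/[^\x20-\x7E]/', '', $str) ?? '';
}

// "\xC3\xA9" is the UTF-8 encoding of e-acute; "\n" is a control character.
echo asciiPrintableOnly("caf\xC3\xA9 au lait\n");
```

Note that the accented letter is removed outright rather than replaced by a plain "e".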
Or, even better, convert your input to UTF-8 and use the phputf8 library to translate "non-normal" characters into their ASCII representation:
require_once('libs/utf8/utf8.php');
require_once('libs/utf8/utils/bad.php');
require_once('libs/utf8/utils/validation.php');
require_once('libs/utf8_to_ascii/utf8_to_ascii.php');
if (!utf8_is_valid($str)) {
    $str = utf8_bad_strip($str);
}
$str = utf8_to_ascii($str, '' );
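If phputf8 is not available, the validity check alone can be approximated with built-in PCRE. This is a sketch under that assumption: `looksLikeValidUtf8` is a hypothetical name and is not the library's `utf8_is_valid`; it relies on `preg_match()` returning `false` when the `/u` modifier is used on a subject that is not valid UTF-8.

```php
<?php
// Hypothetical stand-in for a UTF-8 validity check; not the phputf8 implementation.
function looksLikeValidUtf8(string $str): bool
{
    // With the /u modifier, preg_match() returns false for malformed UTF-8
    // and 1 when the empty pattern matches a well-formed subject.
    return preg_match('//u', $str) === 1;
}

var_dump(looksLikeValidUtf8("caf\xC3\xA9")); // well-formed two-byte sequence
var_dump(looksLikeValidUtf8("caf\xE9"));     // truncated sequence (Latin-1 byte)
```

This only detects bad input. Stripping the bad bytes or transliterating to ASCII still needs the library or another approach.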
Sil*_*mer 20
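Somewhat related: we have a web application that must send data to a legacy system that can only handle the first 128 characters of the ASCII character set.
The solution we had to use was to "translate" as many characters as possible into closely matching ASCII equivalents, but leave alone anything that could not be translated.
Normally I would do something like this: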
<?php
// transliterate
if (function_exists('iconv')) {
    $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
}
?>
...but that replaces everything it cannot translate with a question mark (?).
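One way to notice when that has happened is to compare the number of question marks before and after the conversion. The sketch below assumes exactly that heuristic. `transliterateWithCheck` is a hypothetical helper, and the exact output of `//TRANSLIT` depends on the iconv implementation and locale, so no particular transliteration is promised here.

```php
<?php
// Hypothetical helper: transliterate, and report whether new "?" characters appeared.
function transliterateWithCheck(string $text): array
{
    if (!function_exists('iconv')) {
        return [$text, false]; // iconv missing: leave the text untouched
    }
    // @ suppresses the notice iconv raises when it cannot convert a character.
    $converted = @iconv('UTF-8', 'US-ASCII//TRANSLIT', $text);
    if ($converted === false) {
        return [$text, false]; // conversion failed: keep the original text
    }
    // More question marks than before suggests untranslatable characters.
    $lossy = substr_count($converted, '?') > substr_count($text, '?');
    return [$converted, $lossy];
}

// Pure ASCII input passes through unchanged and is never flagged as lossy.
[$ascii, $lossy] = transliterateWithCheck("plain text?");
var_dump($ascii, $lossy);
```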
So we ended up doing the following. Check the end of this function for the (commented-out) PHP regex that simply strips non-ASCII characters.
<?php
function cleanNonAsciiCharactersInString($orig_text) {
$text = $orig_text;
// Single letters
$text = preg_replace("/[????áàâãªä]/u", "a", $text);
$text = preg_replace("/[??????ÁÀÂÃÄ]/u", "A", $text);
$text = preg_replace("/[??????]/u", "b", $text);
$text = preg_replace("/[???]/u", "B", $text);
$text = preg_replace("/[ç?©?]/u", "c", $text);
$text = preg_replace("/[Ç?]/u", "C", $text);
$text = preg_replace("/[??]/u", "d", $text);
$text = preg_replace("/[éèêë?ëè????????]/u", "e", $text);
$text = preg_replace("/[ÉÈÊË€??€??]/u", "E", $text);
$text = preg_replace("/[?]/u", "F", $text);
$text = preg_replace("/[????]/u", "H", $text);
$text = preg_replace("/[???]/u", "h", $text);
$text = preg_replace("/[ÍÌÎÏ]/u", "I", $text);
$text = preg_replace("/[íìîï????]/u", "i", $text);
$text = preg_replace("/[??]/u", "j", $text);
$text = preg_replace("/[???]/u", 'K', $text);
$text = preg_replace("/[??]/u", 'k', $text);
$text = preg_replace("/[??]/u", 'l', $text);
$text = preg_replace("/[??]/u", "M", $text);
$text = preg_replace("/[ñ?????]/u", "n", $text);
$text = preg_replace("/[Ñ?????????]/u", "N", $text);
$text = preg_replace("/[óòôõºö???????]/u", "o", $text);
$text = preg_replace("/[ÓÒÔÕÖ?????]/u", "O", $text);
$text = preg_replace("/[?????]/u", "p", $text);
$text = preg_replace("/[®??]/u", "R", $text);
$text = preg_replace("/[????]/u", "r", $text);
$text = preg_replace("/[?]/u", "S", $text);
$text = preg_replace("/[?]/u", "s", $text);
$text = preg_replace("/[??]/u", "T", $text);
$text = preg_replace("/[?†‡]/u", "t", $text);
$text = preg_replace("/[úùûü???µ???]/u", "u", $text);
$text = preg_replace("/[?]/u", "v", $text);
$text = preg_replace("/[ÚÙÛÜ???]/u", "U", $text);
$text = preg_replace("/[??????????]/u", "w", $text);
$text = preg_replace("/[?????]/u", "W", $text);
$text = preg_replace("/[?????]/u", "x", $text);
$text = preg_replace("/[??¥]/u", "Y", $text);
$text = preg_replace("/[???????]/u", "y", $text);
$text = preg_replace("/[?]/u", "Z", $text);
// Punctuation
$text = preg_replace("/[‚‚??]/u", ",", $text);
$text = preg_replace("/[`??’‘]/u", "'", $text);
$text = preg_replace("/[?“”«»„]/u", '"', $text);
$text = preg_replace("/[—–??–??????]/u", '-', $text);
$text = preg_replace("/[ ]/u", ' ', $text);
$text = str_replace("…", "...", $text);
$text = str_replace("?", "!=", $text);
$text = str_replace("?", "<=", $text);
$text = str_replace("?", ">=", $text);
$text = preg_replace("/[???]/u", "=", $text);
// Exciting combinations
$text = str_replace("??", "bl", $text);
$text = str_replace("?", "c/o", $text);
$text = str_replace("?", "Pts", $text);
$text = str_replace("™", "tm", $text);
$text = str_replace("?", "No", $text);
$text = str_replace("?", "4", $text);
$text = str_replace("‰", "%", $text);
$text = preg_replace("/[?•]/u", "*", $text);
$text = str_replace("‹", "<", $text);
$text = str_replace("›", ">", $text);
$text = str_replace("?", "!!", $text);
$text = str_replace("?", "/", $text);
$text = str_replace("?", "/", $text);
$text = str_replace("?", "7/8", $text);
$text = str_replace("?", "5/8", $text);
$text = str_replace("?", "3/8", $text);
$text = str_replace("?", "1/8", $text);
$text = preg_replace("/[‰]/u", "%", $text);
$text = preg_replace("/[??]/u", "Ab", $text);
$text = preg_replace("/[??]/u", "IO", $text);
$text = preg_replace("/[????]/u", "fi", $text);
$text = preg_replace("/[??]/u", "3", $text);
$text = str_replace("£", "(pounds)", $text);
$text = str_replace("?", "(lira)", $text);
$text = preg_replace("/[‰]/u", "%", $text);
$text = preg_replace("/[?????]/u", "|", $text);
$text = preg_replace("/[??????]/u", "", $text);
//2) Translation CP1252.
$trans = get_html_translation_table(HTML_ENTITIES);
$trans['f'] = 'ƒ'; // Latin Small Letter F With Hook
$trans['-'] = array(
'…', // Horizontal Ellipsis
'˜', // Small Tilde
'–' // Dash
);
$trans["+"] = '†'; // Dagger
$trans['#'] = '‡'; // Double Dagger
$trans['M'] = '‰'; // Per Mille Sign
$trans['S'] = 'Š'; // Latin Capital Letter S With Caron
$trans['OE'] = 'Œ'; // Latin Capital Ligature OE
$trans["'"] = array(
'‘', // Left Single Quotation Mark
'’', // Right Single Quotation Mark
'›', // Single Right-Pointing Angle Quotation Mark
'‚', // Single Low-9 Quotation Mark
'ˆ', // Modifier Letter Circumflex Accent
'‹' // Single Left-Pointing Angle Quotation Mark
);
$trans['"'] = array(
'“', // Left Double Quotation Mark
'”', // Right Double Quotation Mark
'„', // Double Low-9 Quotation Mark
);
$trans['*'] = '•'; // Bullet
$trans['n'] = '–'; // En Dash
$trans['m'] = '—'; // Em Dash
$trans['tm'] = '™'; // Trade Mark Sign
$trans['s'] = 'š'; // Latin Small Letter S With Caron
$trans['oe'] = 'œ'; // Latin Small Ligature OE
$trans['Y'] = 'Ÿ'; // Latin Capital Letter Y With Diaeresis
$trans['euro'] = '€'; // euro currency symbol
ksort($trans);
foreach ($trans as $k => $v) {
$text = str_replace($v, $k, $text);
}
// 3) remove <p>, <br/> ...
$text = strip_tags($text);
// 4) decode HTML entities: &amp; => &, &quot; => "
$text = html_entity_decode($text);
// transliterate
// if (function_exists('iconv')) {
// $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
// }
// remove non ascii characters
// $text = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $text);
return $text;
}
?>
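The same "translate what you can, then optionally strip the rest" idea can be sketched more compactly with a single lookup table and `strtr()`. This is an illustrative assumption, not the function above: `toAsciiSketch`, its `$stripRest` parameter, and the tiny mapping are all hypothetical, and the `"\u{...}"` escapes require PHP 7.0 or later.

```php
<?php
// Hypothetical, deliberately tiny mapping; the function above covers far more.
function toAsciiSketch(string $text, bool $stripRest = false): string
{
    $map = [
        "\u{00E9}" => 'e',   // e with acute
        "\u{00C9}" => 'E',   // E with acute
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{2026}" => '...', // horizontal ellipsis
        "\u{2122}" => 'tm',  // trade mark sign
    ];
    // strtr() with an array replaces all keys in a single pass over the string.
    $text = strtr($text, $map);
    if ($stripRest) {
        // Same idea as the commented-out regex: drop control and non-ASCII bytes.
        $text = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $text) ?? '';
    }
    return $text;
}

echo toAsciiSketch("Caf\u{00E9}\u{2122} \u{201C}ok\u{201D}\u{2026}");
```

With `$stripRest` left at `false`, characters missing from the table pass through unchanged, which matches the stated goal of leaving untranslatable content alone.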