And*_* SK 149 php internationalization slug
__PRE__
It works well, but I found some cases where it fails:
gen_slug('Andrés Cortez') does not return andres-cortez as expected.
Why? Any ideas about the parameters involved?
Mae*_*lyn 408
Try this instead of a lengthy replace:
public static function slugify($text)
{
// replace non letter or digits by -
$text = preg_replace('~[^\pL\d]+~u', '-', $text);
// transliterate
$text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
// remove unwanted characters
$text = preg_replace('~[^-\w]+~', '', $text);
// trim
$text = trim($text, '-');
// remove duplicate -
$text = preg_replace('~-+~', '-', $text);
// lowercase
$text = strtolower($text);
if (empty($text)) {
return 'n-a';
}
return $text;
}
This is based on the one in Symfony's Jobeet tutorial.
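The pipeline above depends on iconv(), whose //TRANSLIT output varies with the platform and locale and which can return false. The following is only a sketch of a more defensive variant; the name slugify_safe and the $fallback parameter are made up for illustration and are not part of the Jobeet code.

```php
<?php
// Hypothetical helper: same steps as above, but tolerant of iconv() failing.
function slugify_safe(string $text, string $fallback = 'n-a'): string
{
    // Replace every run of non-letter, non-digit characters with a hyphen.
    $text = preg_replace('~[^\pL\d]+~u', '-', $text) ?? '';
    // Transliterate to ASCII; @ silences the notice iconv raises on failure.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', $text);
    if ($ascii !== false) {
        $text = $ascii;
    }
    // Drop whatever is still not a word character or a hyphen.
    $text = preg_replace('~[^-\w]+~', '', $text);
    // Collapse repeated hyphens, then trim and lowercase.
    $text = strtolower(trim(preg_replace('~-+~', '-', $text), '-'));
    // Strict comparison, so that the slug "0" is not mistaken for empty.
    return $text === '' ? $fallback : $text;
}

echo slugify_safe('Hello,  World!'), "\n"; // hello-world
echo slugify_safe('0'), "\n";              // 0
```

Note that empty('0') is true in PHP, so the version above returns 'n-a' for the input '0'; the strict comparison in this sketch avoids that.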
The*_*pit 43
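Since this answer is getting some attention, I'm adding some explanation.
The solution provided essentially replaces everything except A-Z, a-z, 0-9 and - (hyphen) with - (hyphen). As a result, it won't work properly with other Unicode characters (which are valid characters for a URL slug/string). A common scenario is an input string that contains non-English characters.
Only use this solution if you are confident that the input string won't contain Unicode characters that you might want to be part of the output/slug.
For example, "नारीशक्ति" is reduced to nothing but hyphens (a single "-", since the pattern ends in +) instead of "नारी-शक्ति" (a valid URL slug).
How about...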
$slug = strtolower(trim(preg_replace('/[^A-Za-z0-9-]+/', '-', $string)));
?
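If non-Latin scripts should survive in the slug, one possible variation is to match on Unicode properties instead of ASCII ranges. This is a sketch only: unicode_slug is a hypothetical name, and it assumes the mbstring extension is available for mb_strtolower().

```php
<?php
// Hypothetical helper, not from the answer: keeps letters, combining marks
// and digits of any script, and turns everything else into hyphens.
function unicode_slug(string $text): string
{
    // \p{L} = letters, \p{M} = combining marks (needed for scripts such as
    // Devanagari), \p{N} = numbers. The /u modifier makes PCRE work on
    // code points instead of bytes.
    $slug = preg_replace('/[^\p{L}\p{M}\p{N}]+/u', '-', $text) ?? '';
    return mb_strtolower(trim($slug, '-'), 'UTF-8');
}

echo unicode_slug('नारी शक्ति'), "\n";     // नारी-शक्ति
echo unicode_slug('Andrés Cortez'), "\n"; // andrés-cortez
```

The result still has to be percent-encoded when it is placed in a URL, for example with rawurlencode().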
小智 35
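If you have the intl extension installed, you can use the transliterator_transliterate function to create a slug easily.
You can then replace the spaces with dashes to make it look more like a slug.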
<?php
$string = 'Namnet på bildtävlingen';
$slug = \Transliterator::createFromRules(
':: Any-Latin;'
. ':: NFD;'
. ':: [:Nonspacing Mark:] Remove;'
. ':: NFC;'
. ':: [:Punctuation:] Remove;'
. ':: Lower();'
. '[:Separator:] > \'-\''
)
->transliterate( $string );
echo $slug; // namnet-pa-bildtavlingen
?>
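Since intl is not installed everywhere, a wrapper can check for it first. The sketch below is an assumption-laden illustration: intl_slug is a hypothetical name, the rule string 'Any-Latin; Latin-ASCII; Lower()' differs from the rules used above, and the fallback path simply drops non-ASCII characters.

```php
<?php
// Hypothetical wrapper; the function name and the fallback are assumptions.
function intl_slug(string $text): string
{
    if (class_exists('Transliterator')) {
        // Any-Latin converts other scripts to Latin, Latin-ASCII folds
        // accented letters to plain ASCII, Lower() lowercases.
        $tr = Transliterator::create('Any-Latin; Latin-ASCII; Lower()');
        if ($tr !== null) {
            $result = $tr->transliterate($text);
            if ($result !== false) {
                $text = $result;
            }
        }
    }
    // Every run of characters outside a-z and 0-9 becomes a single hyphen.
    $text = preg_replace('/[^a-z0-9]+/', '-', strtolower($text));
    return trim($text, '-');
}

echo intl_slug('Hello World 2024'), "\n"; // hello-world-2024
```

Without intl, accented letters are lost rather than transliterated, so the check is a safety net and not a substitute for the extension.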
Imr*_*hsh 23
Note: I have taken this function from WordPress!!
Use it like this:
echo sanitize('testing this link');
Code:
//taken from wordpress
function utf8_uri_encode( $utf8_string, $length = 0 ) {
$unicode = '';
$values = array();
$num_octets = 1;
$unicode_length = 0;
$string_length = strlen( $utf8_string );
for ($i = 0; $i < $string_length; $i++ ) {
$value = ord( $utf8_string[ $i ] );
if ( $value < 128 ) {
if ( $length && ( $unicode_length >= $length ) )
break;
$unicode .= chr($value);
$unicode_length++;
} else {
if ( count( $values ) == 0 ) $num_octets = ( $value < 224 ) ? 2 : 3;
$values[] = $value;
if ( $length && ( $unicode_length + ($num_octets * 3) ) > $length )
break;
if ( count( $values ) == $num_octets ) {
if ($num_octets == 3) {
$unicode .= '%' . dechex($values[0]) . '%' . dechex($values[1]) . '%' . dechex($values[2]);
$unicode_length += 9;
} else {
$unicode .= '%' . dechex($values[0]) . '%' . dechex($values[1]);
$unicode_length += 6;
}
$values = array();
$num_octets = 1;
}
}
}
return $unicode;
}
//taken from wordpress
function seems_utf8($str) {
$length = strlen($str);
for ($i=0; $i < $length; $i++) {
$c = ord($str[$i]);
if ($c < 0x80) $n = 0; # 0bbbbbbb
elseif (($c & 0xE0) == 0xC0) $n=1; # 110bbbbb
elseif (($c & 0xF0) == 0xE0) $n=2; # 1110bbbb
elseif (($c & 0xF8) == 0xF0) $n=3; # 11110bbb
elseif (($c & 0xFC) == 0xF8) $n=4; # 111110bb
elseif (($c & 0xFE) == 0xFC) $n=5; # 1111110b
else return false; # Does not match any model
for ($j=0; $j<$n; $j++) { # n bytes matching 10bbbbbb follow ?
if ((++$i == $length) || ((ord($str[$i]) & 0xC0) != 0x80))
return false;
}
}
return true;
}
//function sanitize_title_with_dashes taken from wordpress
function sanitize($title) {
$title = strip_tags($title);
// Preserve escaped octets.
$title = preg_replace('|%([a-fA-F0-9][a-fA-F0-9])|', '---$1---', $title);
// Remove percent signs that are not part of an octet.
$title = str_replace('%', '', $title);
// Restore octets.
$title = preg_replace('|---([a-fA-F0-9][a-fA-F0-9])---|', '%$1', $title);
if (seems_utf8($title)) {
if (function_exists('mb_strtolower')) {
$title = mb_strtolower($title, 'UTF-8');
}
$title = utf8_uri_encode($title, 200);
}
$title = strtolower($title);
$title = preg_replace('/&.+?;/', '', $title); // kill entities
$title = str_replace('.', '-', $title);
$title = preg_replace('/[^%a-z0-9 _-]/', '', $title);
$title = preg_replace('/\s+/', '-', $title);
$title = preg_replace('|-+|', '-', $title);
$title = trim($title, '-');
return $title;
}
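It is always a good idea to use an existing solution that is backed by many experienced developers. One of the most popular is https://github.com/cocur/slugify. First of all, it supports more than one language, and it is actively updated.
If you don't want to use the whole package, you can copy the part you need.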
There are already plenty of answers here, so I almost didn't want to add another one, but none of the functions did everything I needed.
The best base for me was function no. 3 from the answer that compares their speed. I added/fixed some replacements, so that ' is simply removed, . is replaced with -, α is replaced with a, ẞ is replaced with b, Ł (and similar) is replaced with L instead of K, and the € and $ signs are replaced with eur and usd respectively (add more where needed). You can optionally add '&' => '-and-', but SEO advice is to avoid conjunctions (#8), so I left it out for my use case. (This function does not remove existing "and"s and "or"s from the string, though.)
I also added a line of code to fix the double dashes in this weird string I came up with, and an optional parameter to limit the length of the slug.
<?php
function slugify($text, $length = null)
{
    $replacements = [
        '<' => '', '>' => '', '-' => ' ', '&' => '', '"' => '',
        'À' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ã' => 'A', 'Ä' => 'Ae', 'Ä' => 'A', 'Å' => 'A', 'Ā' => 'A', 'Ą' => 'A', 'Ă' => 'A', 'Æ' => 'Ae',
        'Ç' => 'C', "'" => '', 'Ć' => 'C', 'Č' => 'C', 'Ĉ' => 'C', 'Ċ' => 'C', 'Ď' => 'D', 'Đ' => 'D', 'Ð' => 'D',
        'È' => 'E', 'É' => 'E', 'Ê' => 'E', 'Ë' => 'E', 'Ē' => 'E', 'Ę' => 'E', 'Ě' => 'E', 'Ĕ' => 'E', 'Ė' => 'E',
        'Ĝ' => 'G', 'Ğ' => 'G', 'Ġ' => 'G', 'Ģ' => 'G', 'Ĥ' => 'H', 'Ħ' => 'H',
        'Ì' => 'I', 'Í' => 'I', 'Î' => 'I', 'Ï' => 'I', 'Ī' => 'I', 'Ĩ' => 'I', 'Ĭ' => 'I', 'Į' => 'I', 'İ' => 'I', 'Ĳ' => 'IJ', 'Ĵ' => 'J',
        'Ķ' => 'K', 'Ł' => 'L', 'Ľ' => 'L', 'Ĺ' => 'L', 'Ļ' => 'L', 'Ŀ' => 'L',
        'Ñ' => 'N', 'Ń' => 'N', 'Ň' => 'N', 'Ņ' => 'N', 'Ŋ' => 'N',
        'Ò' => 'O', 'Ó' => 'O', 'Ô' => 'O', 'Õ' => 'O', 'Ö' => 'Oe', 'Ö' => 'Oe', 'Ø' => 'O', 'Ō' => 'O', 'Ő' => 'O', 'Ŏ' => 'O', 'Œ' => 'OE',
        'Ŕ' => 'R', 'Ř' => 'R', 'Ŗ' => 'R', 'Ś' => 'S', 'Š' => 'S', 'Ş' => 'S', 'Ŝ' => 'S', 'Ș' => 'S',
        'Ť' => 'T', 'Ţ' => 'T', 'Ŧ' => 'T', 'Ț' => 'T',
        'Ù' => 'U', 'Ú' => 'U', 'Û' => 'U', 'Ü' => 'Ue', 'Ū' => 'U', 'Ü' => 'Ue', 'Ů' => 'U', 'Ű' => 'U', 'Ŭ' => 'U', 'Ũ' => 'U', 'Ų' => 'U',
        'Ŵ' => 'W', 'Ý' => 'Y', 'Ŷ' => 'Y', 'Ÿ' => 'Y', 'Ź' => 'Z', 'Ž' => 'Z', 'Ż' => 'Z', 'Þ' => 'T',
        'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a', 'ä' => 'ae', 'ä' => 'ae', 'å' => 'a', 'ā' => 'a', 'ą' => 'a', 'ă' => 'a', 'æ' => 'ae',
        'ç' => 'c', 'ć' => 'c', 'č' => 'c', 'ĉ' => 'c', 'ċ' => 'c', 'ď' => 'd', 'đ' => 'd', 'ð' => 'd',
        'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e', 'ē' => 'e', 'ę' => 'e', 'ě' => 'e', 'ĕ' => 'e', 'ė' => 'e', 'ƒ' => 'f',
        'ĝ' => 'g', 'ğ' => 'g', 'ġ' => 'g', 'ģ' => 'g', 'ĥ' => 'h', 'ħ' => 'h',
        'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i', 'ī' => 'i', 'ĩ' => 'i', 'ĭ' => 'i', 'į' => 'i', 'ı' => 'i', 'ĳ' => 'ij', 'ĵ' => 'j',
        'ķ' => 'k', 'ĸ' => 'k', 'ł' => 'l', 'ľ' => 'l', 'ĺ' => 'l', 'ļ' => 'l', 'ŀ' => 'l',
        'ñ' => 'n', 'ń' => 'n', 'ň' => 'n', 'ņ' => 'n', 'ŉ' => 'n', 'ŋ' => 'n',
        'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'õ' => 'o', 'ö' => 'oe', 'ö' => 'oe', 'ø' => 'o', 'ō' => 'o', 'ő' => 'o', 'ŏ' => 'o', 'œ' => 'oe',
        'ŕ' => 'r', 'ř' => 'r', 'ŗ' => 'r', 'š' => 's', 'ś' => 's',
        'ù' => 'u', 'ú' => 'u', 'û' => 'u', 'ü' => 'ue', 'ū' => 'u', 'ü' => 'ue', 'ů' => 'u', 'ű' => 'u', 'ŭ' => 'u', 'ũ' => 'u', 'ų' => 'u',
        'ŵ' => 'w', 'ý' => 'y', 'ÿ' => 'y', 'ŷ' => 'y', 'ž' => 'z', 'ż' => 'z', 'ź' => 'z', 'þ' => 't',
        'α' => 'a', 'ß' => 'ss', 'ẞ' => 'b', 'ſ' => 'ss', 'ый' => 'iy',
        'А' => 'A', 'Б' => 'B', 'В' => 'V', 'Г' => 'G', 'Д' => 'D', 'Е' => 'E', 'Ё' => 'YO', 'Ж' => 'ZH', 'З' => 'Z', 'И' => 'I', 'Й' => 'Y',
        'К' => 'K', 'Л' => 'L', 'М' => 'M', 'Н' => 'N', 'О' => 'O', 'П' => 'P', 'Р' => 'R', 'С' => 'S', 'Т' => 'T', 'У' => 'U', 'Ф' => 'F',
        'Х' => 'H', 'Ц' => 'C', 'Ч' => 'CH', 'Ш' => 'SH', 'Щ' => 'SCH', 'Ъ' => '', 'Ы' => 'Y', 'Ь' => '', 'Э' => 'E', 'Ю' => 'YU', 'Я' => 'YA',
        'а' => 'a', 'б' => 'b', 'в' => 'v', 'г' => 'g', 'д' => 'd', 'е' => 'e', 'ё' => 'yo', 'ж' => 'zh', 'з' => 'z', 'и' => 'i', 'й' => 'y',
        'к' => 'k', 'л' => 'l', 'м' => 'm', 'н' => 'n', 'о' => 'o', 'п' => 'p', 'р' => 'r', 'с' => 's', 'т' => 't', 'у' => 'u', 'ф' => 'f',
        'х' => 'h', 'ц' => 'c', 'ч' => 'ch', 'ш' => 'sh', 'щ' => 'sch', 'ъ' => '', 'ы' => 'y', 'ь' => '', 'э' => 'e', 'ю' => 'yu', 'я' => 'ya',
        '.' => '-', '€' => '-eur-', '$' => '-usd-'
    ];
    // Replace non-ascii characters
    $text = strtr($text, $replacements);
    // Replace non letter or digits with "-"
    $text = preg_replace('~[^\pL\d.]+~u', '-', $text);
    // Replace unwanted characters with "-"
    $text = preg_replace('~[^-\w.]+~', '-', $text);
    // Trim "-"
    $text = trim($text, '-');
    // Remove duplicate "-"
    $text = preg_replace('~-+~', '-', $text);
    // Convert to lowercase
    $text = strtolower($text);
    // Limit length
    if (isset($length) && $length < strlen($text))
        $text = rtrim(substr($text, 0, $length), '-');

    return $text;
}
$text = "--- You can't misuse me! Or can-ya? ČĆŽŠĐ÷×ß¤_.,:;-!\"#$%&/()=?*~ˇ^˘°˛`˙´˝¨¸¸¨Łł€\|@{}[] ¿ Àñdréß l'affreux ğarçon & nøël en forêt ! Andrés Cortez EFI收购Cretaprint Étienne";
echo "text\n$text\n\nslug\n".slugify($text);
Output:
text
--- You can't misuse me! Or can-ya? ČĆŽŠĐ÷×ß¤_.,:;-!"#$%&/()=?*~ˇ^˘°˛`˙´˝¨¸¸¨Łł€\|@{}[] ¿ Àñdréß l'affreux ğarçon & nøël en forêt ! Andrés Cortez EFI收购Cretaprint Étienne

slug
you-cant-misuse-me-or-can-ya-cczsd-ss-usd-ll-eur-andress-laffreux-garcon-noel-en-foret-andres-cortez-efi-cretaprint-etienne
It also works for the OP's case, converting 'Andrés Cortez' to 'andres-cortez', as well as for all other examples I found in this thread, apart from one character that is beyond my scope.
I'd be glad to hear about any bugs you find (ideally with a suggestion).
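The optional length limit above cuts the slug at a fixed byte offset, which can leave half a word at the end. As a possible refinement, here is a sketch that backs up to the previous hyphen; limit_slug is a hypothetical helper and not part of the answer's function.

```php
<?php
// Hypothetical helper: shortens an existing slug without cutting a word in half.
function limit_slug(string $slug, int $length): string
{
    if (strlen($slug) <= $length) {
        return $slug;
    }
    $cut = substr($slug, 0, $length);
    // If the character right after the cut is not a hyphen, the last word
    // was split, so drop the partial word.
    if ($slug[$length] !== '-') {
        $pos = strrpos($cut, '-');
        if ($pos !== false) {
            $cut = substr($cut, 0, $pos);
        }
    }
    return rtrim($cut, '-');
}

echo limit_slug('you-cant-misuse-me', 10), "\n"; // you-cant
echo limit_slug('you-cant-misuse-me', 8), "\n";  // you-cant
```

A single word longer than the limit is still cut hard, because there is no earlier hyphen to fall back to.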
Here is another one; for example, "Title with strange characters ééé AXZ" becomes "title-with-strange-characters-eee-axz".
/**
* Function used to create a slug associated to an "ugly" string.
*
* @param string $string the string to transform.
*
* @return string the resulting slug.
*/
public static function createSlug($string) {
$table = array(
'Š'=>'S', 'š'=>'s', 'Đ'=>'Dj', 'đ'=>'dj', 'Ž'=>'Z', 'ž'=>'z', 'Č'=>'C', 'č'=>'c', 'Ć'=>'C', 'ć'=>'c',
'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss',
'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e',
'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o',
'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ý'=>'y', 'ý'=>'y', 'þ'=>'b',
'ÿ'=>'y', 'Ŕ'=>'R', 'ŕ'=>'r', '/' => '-', ' ' => '-'
);
// -- Collapse repeated whitespace, tabs and newlines into a single space
$stripped = preg_replace(array('/\s{2,}/', '/[\t\n]/'), ' ', $string);
// -- Returns the slug (built from the cleaned-up string, not the raw input)
return strtolower(strtr($stripped, $table));
}
An updated version of @Imran Omar Bukhsh's code (from the latest WordPress (4.0) branch):
<?php
// Add methods to slugify taken from Wordpress:
// - https://github.com/WordPress/WordPress/blob/master/wp-includes/formatting.php
// - https://github.com/WordPress/WordPress/blob/master/wp-includes/functions.php
/**
* Set the mbstring internal encoding to a binary safe encoding when func_overload
* is enabled.
*
* When mbstring.func_overload is in use for multi-byte encodings, the results from
* strlen() and similar functions respect the utf8 characters, causing binary data
* to return incorrect lengths.
*
* This function overrides the mbstring encoding to a binary-safe encoding, and
* resets it to the users expected encoding afterwards through the
* `reset_mbstring_encoding` function.
*
* It is safe to recursively call this function, however each
* `mbstring_binary_safe_encoding()` call must be followed up with an equal number
* of `reset_mbstring_encoding()` calls.
*
* @since 3.7.0
*
* @see reset_mbstring_encoding()
*
* @param bool $reset Optional. Whether to reset the encoding back to a previously-set encoding.
* Default false.
*/
function mbstring_binary_safe_encoding( $reset = false ) {
static $encodings = array();
static $overloaded = null;
if ( is_null( $overloaded ) )
$overloaded = function_exists( 'mb_internal_encoding' ) && ( ini_get( 'mbstring.func_overload' ) & 2 );
if ( false === $overloaded )
return;
if ( ! $reset ) {
$encoding = mb_internal_encoding();
array_push( $encodings, $encoding );
mb_internal_encoding( 'ISO-8859-1' );
}
if ( $reset && $encodings ) {
$encoding = array_pop( $encodings );
mb_internal_encoding( $encoding );
}
}
/**
* Reset the mbstring internal encoding to a users previously set encoding.
*
* @see mbstring_binary_safe_encoding()
*
* @since 3.7.0
*/
function reset_mbstring_encoding() {
mbstring_binary_safe_encoding( true );
}
/**
* Checks to see if a string is utf8 encoded.
*
* NOTE: This function checks for 5-Byte sequences, UTF8
* has Bytes Sequences with a maximum length of 4.
*
* @author bmorel at ssi dot fr (modified)
* @since 1.2.1
*
* @param string $str The string to be checked
* @return bool True if $str fits a UTF-8 model, false otherwise.
*/
function seems_utf8($str) {
mbstring_binary_safe_encoding();
$length = strlen($str);
reset_mbstring_encoding();
for ($i=0; $i < $length; $i++) {
$c = ord($str[$i]);
if ($c < 0x80) $n = 0; # 0bbbbbbb
elseif (($c & 0xE0) == 0xC0) $n=1; # 110bbbbb
elseif (($c & 0xF0) == 0xE0) $n=2; # 1110bbbb
elseif (($c & 0xF8) == 0xF0) $n=3; # 11110bbb
elseif (($c & 0xFC) == 0xF8) $n=4; # 111110bb
elseif (($c & 0xFE) == 0xFC) $n=5; # 1111110b
else return false; # Does not match any model
for ($j=0; $j<$n; $j++) { # n bytes matching 10bbbbbb follow ?
if ((++$i == $length) || ((ord($str[$i]) & 0xC0) != 0x80))
return false;
}
}
return true;
}
/**
* Encode the Unicode values to be used in the URI.
*
* @since 1.5.0
*
* @param string $utf8_string
* @param int $length Max length of the string
* @return string String with Unicode encoded for URI.
*/
function utf8_uri_encode( $utf8_string, $length = 0 ) {
$unicode = '';
$values = array();
$num_octets = 1;
$unicode_length = 0;
mbstring_binary_safe_encoding();
$string_length = strlen( $utf8_string );
reset_mbstring_encoding();
for ($i = 0; $i < $string_length; $i++ ) {
$value = ord( $utf8_string[ $i ] );
if ( $value < 128 ) {
if ( $length && ( $unicode_length >= $length ) )
break;
$unicode .= chr($value);
$unicode_length++;
} else {
if ( count( $values ) == 0 ) $num_octets = ( $value < 224 ) ? 2 : 3;
$values[] = $value;
if ( $length && ( $unicode_length + ($num_octets * 3) ) > $length )
break;
if ( count( $values ) == $num_octets ) {
if ($num_octets == 3) {
$unicode .= '%' . dechex($values[0]) . '%' . dechex($values[1]) . '%' . dechex($values[2]);
$unicode_length += 9;
} else {
$unicode .= '%' . dechex($values[0]) . '%' . dechex($values[1]);
$unicode_length += 6;
}
$values = array();
$num_octets = 1;
}
}
}
return $unicode;
}
/**
* Sanitizes a title, replacing whitespace and a few other characters with dashes.
*
* Limits the output to alphanumeric characters, underscore (_) and dash (-).
* Whitespace becomes a dash.
*
* @since 1.2.0
*
* @param string $title The title to be sanitized.
* @param string $raw_title Optional. Not used.
* @param string $context Optional. The operation for which the string is sanitized.
* @return string The sanitized title.
*/
function sanitize_title_with_dashes( $title, $raw_title = '', $context = 'display' ) {
$title = strip_tags($title);
// Preserve escaped octets.
$title = preg_replace('|%([a-fA-F0-9][a-fA-F0-9])|', '---$1---', $title);
// Remove percent signs that are not part of an octet.
$title = str_replace('%', '', $title);
// Restore octets.
$title = preg_replace('|---([a-fA-F0-9][a-fA-F0-9])---|', '%$1', $title);
if (seems_utf8($title)) {
if (function_exists('mb_strtolower')) {
$title = mb_strtolower($title, 'UTF-8');
}
$title = utf8_uri_encode($title, 200);
}
$title = strtolower($title);
$title = preg_replace('/&.+?;/', '', $title); // kill entities
$title = str_replace('.', '-', $title);
if ( 'save' == $context ) {
// Convert nbsp, ndash and mdash to hyphens
$title = str_replace( array( '%c2%a0', '%e2%80%93', '%e2%80%94' ), '-', $title );
// Strip these characters entirely
$title = str_replace( array(
// iexcl and iquest
'%c2%a1', '%c2%bf',
// angle quotes
'%c2%ab', '%c2%bb', '%e2%80%b9', '%e2%80%ba',
// curly quotes
'%e2%80%98', '%e2%80%99', '%e2%80%9c', '%e2%80%9d',
'%e2%80%9a', '%e2%80%9b', '%e2%80%9e', '%e2%80%9f',
// copy, reg, deg, hellip and trade
'%c2%a9', '%c2%ae', '%c2%b0', '%e2%80%a6', '%e2%84%a2',
// acute accents
'%c2%b4', '%cb%8a', '%cc%81', '%cd%81',
// grave accent, macron, caron
'%cc%80', '%cc%84', '%cc%8c',
), '', $title );
// Convert times to x
$title = str_replace( '%c3%97', 'x', $title );
}
$title = preg_replace('/[^%a-z0-9 _-]/', '', $title);
$title = preg_replace('/\s+/', '-', $title);
$title = preg_replace('|-+|', '-', $title);
$title = trim($title, '-');
return $title;
}
$title = '#PFW Alexander McQueen Spring/Summer 2015';
echo "title -> slug: \n". $title ." -> ". sanitize_title_with_dashes($title);
echo "\n\n";
$title = '«GQ»: Elyas M\'Barek gehört zu Männern des Jahres';
echo "title -> slug: \n". $title ." -> ". sanitize_title_with_dashes($title);
See the online example.
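As the docblock of seems_utf8() notes, that function also accepts 5-byte and 6-byte sequences, which are not valid UTF-8. If a stricter check is acceptable, PCRE can do the validation. This is a hypothetical shortcut, not WordPress code, and is_valid_utf8 is an assumed name.

```php
<?php
// Hypothetical shortcut, not WordPress code: lets PCRE validate the encoding.
function is_valid_utf8(string $str): bool
{
    // With the /u modifier, preg_match() returns false (not 0) when the
    // subject is not well-formed UTF-8; the empty pattern matches anything else.
    return preg_match('//u', $str) === 1;
}

var_dump(is_valid_utf8('Männern'));  // bool(true)
var_dump(is_valid_utf8("\xC3\x28")); // bool(false)
```

mb_check_encoding($str, 'UTF-8') would be an alternative where mbstring is installed.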
小智 7
public static function slugify ($text) {
$replace = [
'<' => '', '>' => '', "'" => '', '&' => '',
'"' => '', 'À' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ã' => 'A', 'Ä'=> 'Ae',
'Ä' => 'A', 'Å' => 'A', 'Ā' => 'A', 'Ą' => 'A', 'Ă' => 'A', 'Æ' => 'Ae',
'Ç' => 'C', 'Ć' => 'C', 'Č' => 'C', 'Ĉ' => 'C', 'Ċ' => 'C', 'Ď' => 'D', 'Đ' => 'D',
'Ð' => 'D', 'È' => 'E', 'É' => 'E', 'Ê' => 'E', 'Ë' => 'E', 'Ē' => 'E',
'Ę' => 'E', 'Ě' => 'E', 'Ĕ' => 'E', 'Ė' => 'E', 'Ĝ' => 'G', 'Ğ' => 'G',
'Ġ' => 'G', 'Ģ' => 'G', 'Ĥ' => 'H', 'Ħ' => 'H', 'Ì' => 'I', 'Í' => 'I',
'Î' => 'I', 'Ï' => 'I', 'Ī' => 'I', 'Ĩ' => 'I', 'Ĭ' => 'I', 'Į' => 'I',
'İ' => 'I', 'Ĳ' => 'IJ', 'Ĵ' => 'J', 'Ķ' => 'K', 'Ł' => 'K', 'Ľ' => 'K',
'Ĺ' => 'K', 'Ļ' => 'K', 'Ŀ' => 'K', 'Ñ' => 'N', 'Ń' => 'N', 'Ň' => 'N',
'Ņ' => 'N', 'Ŋ' => 'N', 'Ò' => 'O', 'Ó' => 'O', 'Ô' => 'O', 'Õ' => 'O',
'Ö' => 'Oe', 'Ö' => 'Oe', 'Ø' => 'O', 'Ō' => 'O', 'Ő' => 'O', 'Ŏ' => 'O',
'Œ' => 'OE', 'Ŕ' => 'R', 'Ř' => 'R', 'Ŗ' => 'R', 'Ś' => 'S', 'Š' => 'S',
'Ş' => 'S', 'Ŝ' => 'S', 'Ș' => 'S', 'Ť' => 'T', 'Ţ' => 'T', 'Ŧ' => 'T',
'Ț' => 'T', 'Ù' => 'U', 'Ú' => 'U', 'Û' => 'U', 'Ü' => 'Ue', 'Ū' => 'U',
'Ü' => 'Ue', 'Ů' => 'U', 'Ű' => 'U', 'Ŭ' => 'U', 'Ũ' => 'U', 'Ų' => 'U',
'Ŵ' => 'W', 'Ý' => 'Y', 'Ŷ' => 'Y', 'Ÿ' => 'Y', 'Ź' => 'Z', 'Ž' => 'Z',
'Ż' => 'Z', 'Þ' => 'T', 'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a',
'ä' => 'ae', 'ä' => 'ae', 'å' => 'a', 'ā' => 'a', 'ą' => 'a', 'ă' => 'a',
'æ' => 'ae', 'ç' => 'c', 'ć' => 'c', 'č' => 'c', 'ĉ' => 'c', 'ċ' => 'c',
'ď' => 'd', 'đ' => 'd', 'ð' => 'd', 'è' => 'e', 'é' => 'e', 'ê' => 'e',
'ë' => 'e', 'ē' => 'e', 'ę' => 'e', 'ě' => 'e', 'ĕ' => 'e', 'ė' => 'e',
'ƒ' => 'f', 'ĝ' => 'g', 'ğ' => 'g', 'ġ' => 'g', 'ģ' => 'g', 'ĥ' => 'h',
'ħ' => 'h', 'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i', 'ī' => 'i',
'ĩ' => 'i', 'ĭ' => 'i', 'į' => 'i', 'ı' => 'i', 'ĳ' => 'ij', 'ĵ' => 'j',
'ķ' => 'k', 'ĸ' => 'k', 'ł' => 'l', 'ľ' => 'l', 'ĺ' => 'l', 'ļ' => 'l',
'ŀ' => 'l', 'ñ' => 'n', 'ń' => 'n', 'ň' => 'n', 'ņ' => 'n', 'ŉ' => 'n',
'ŋ' => 'n', 'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'õ' => 'o', 'ö' => 'oe',
'ö' => 'oe', 'ø' => 'o', 'ō' => 'o', 'ő' => 'o', 'ŏ' => 'o', 'œ' => 'oe',
'ŕ' => 'r', 'ř' => 'r', 'ŗ' => 'r', 'š' => 's', 'ù' => 'u', 'ú' => 'u',
'û' => 'u', 'ü' => 'ue', 'ū' => 'u', 'ü' => 'ue', 'ů' => 'u', 'ű' => 'u',
'ŭ' => 'u', 'ũ' => 'u', 'ų' => 'u', 'ŵ' => 'w', 'ý' => 'y', 'ÿ' => 'y',
'ŷ' => 'y', 'ž' => 'z', 'ż' => 'z', 'ź' => 'z', 'þ' => 't', 'ß' => 'ss',
'ſ' => 'ss', 'ый' => 'iy', 'А' => 'A', 'Б' => 'B', 'В' => 'V', 'Г' => 'G',
'Д' => 'D', 'Е' => 'E', 'Ё' => 'YO', 'Ж' => 'ZH', 'З' => 'Z', 'И' => 'I',
'Й' => 'Y', 'К' => 'K', 'Л' => 'L', 'М' => 'M', 'Н' => 'N', 'О' => 'O',
'П' => 'P', 'Р' => 'R', 'С' => 'S', 'Т' => 'T', 'У' => 'U', 'Ф' => 'F',
'Х' => 'H', 'Ц' => 'C', 'Ч' => 'CH', 'Ш' => 'SH', 'Щ' => 'SCH', 'Ъ' => '',
'Ы' => 'Y', 'Ь' => '', 'Э' => 'E', 'Ю' => 'YU', 'Я' => 'YA', 'а' => 'a',
'б' => 'b', 'в' => 'v', 'г' => 'g', 'д' => 'd', 'е' => 'e', 'ё' => 'yo',
'ж' => 'zh', 'з' => 'z', 'и' => 'i', 'й' => 'y', 'к' => 'k', 'л' => 'l',
'м' => 'm', 'н' => 'n', 'о' => 'o', 'п' => 'p', 'р' => 'r', 'с' => 's',
'т' => 't', 'у' => 'u', 'ф' => 'f', 'х' => 'h', 'ц' => 'c', 'ч' => 'ch',
'ш' => 'sh', 'щ' => 'sch', 'ъ' => '', 'ы' => 'y', 'ь' => '', 'э' => 'e',
'ю' => 'yu', 'я' => 'ya'
];
// make a human readable string
$text = strtr($text, $replace);
// replace non letter or digits by -
$text = preg_replace('~[^\\pL\d.]+~u', '-', $text);
// trim
$text = trim($text, '-');
// remove unwanted characters
$text = preg_replace('~[^-\w.]+~', '', $text);
$text = strtolower($text);
return $text;
}
Don't use preg_replace for this. There is a PHP function built for exactly this task: strtr() http://php.net/manual/en/function.strtr.php
Taken from the comments at the link above (I tested it myself; it works):
function normalize ($string) {
$table = array(
'Š'=>'S', 'š'=>'s', 'Đ'=>'Dj', 'đ'=>'dj', 'Ž'=>'Z', 'ž'=>'z', 'Č'=>'C', 'č'=>'c', 'Ć'=>'C', 'ć'=>'c',
'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss',
'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e',
'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o',
'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ý'=>'y', 'ý'=>'y', 'þ'=>'b',
'ÿ'=>'y', 'Ŕ'=>'R', 'ŕ'=>'r',
);
return strtr($string, $table);
}
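One reason strtr() with an array suits this job is that it scans the input once and never re-examines text it has already replaced, whereas str_replace() with arrays applies its pairs one after another. The small comparison below uses a made-up pair list purely to show the difference.

```php
<?php
// Illustrative pairs only; the second pair would clobber the output of the first.
$pairs = ['ß' => 'ss', 's' => 'z'];

// Single pass: the "ss" produced for ß is not touched again.
$once = strtr('Straße', $pairs);

// Sequential passes: ß becomes "ss" first, then every "s" becomes "z".
$chained = str_replace(array_keys($pairs), array_values($pairs), 'Straße');

echo $once, "\n";    // Strasse
echo $chained, "\n"; // Strazze
```

With a large transliteration table this single-pass behaviour means the order of the entries does not matter.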
I'm using:
function slugify($text)
{
$text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);
return strtolower(preg_replace('/[^A-Za-z0-9-]+/', '-', $text));
}
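The only drawback is that Cyrillic characters are not converted, and I am now looking for a solution that isn't a long str_replace for every single Cyrillic character.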
小智 5
I didn't know which one to use, so I ran a quick benchmark on phptester.net
<?php
// First test
// /sf/answers/2991861211/
function slugify(STRING $string, STRING $separator = '-'){
$accents_regex = '~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i';
$special_cases = [ '&' => 'and', "'" => ''];
$string = mb_strtolower( trim( $string ), 'UTF-8' );
$string = str_replace( array_keys($special_cases), array_values( $special_cases), $string );
$string = preg_replace( $accents_regex, '$1', htmlentities( $string, ENT_QUOTES, 'UTF-8' ) );
$string = preg_replace('/[^a-z0-9]/u', $separator, $string);
return preg_replace('/['.$separator.']+/u', $separator, $string);
}
// Second test
// /sf/answers/933236391/
function slug(STRING $string, STRING $separator = '-'){
$string = transliterator_transliterate('Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC; [:Punctuation:] Remove; Lower();', $string);
return str_replace(' ', $separator, $string);;
}
// Third test - My choice
// /sf/answers/2664629551/
function slugbis($text){
$replace = [
'<' => '', '>' => '', '-' => ' ', '&' => '',
'"' => '', 'À' => 'A', 'Á' => 'A', 'Â' => 'A', 'Ã' => 'A', 'Ä'=> 'Ae',
'Ä' => 'A', 'Å' => 'A', 'Ā' => 'A', 'Ą' => 'A', 'Ă' => 'A', 'Æ' => 'Ae',
'Ç' => 'C', 'Ć' => 'C', 'Č' => 'C', 'Ĉ' => 'C', 'Ċ' => 'C', 'Ď' => 'D', 'Đ' => 'D',
'Ð' => 'D', 'È' => 'E', 'É' => 'E', 'Ê' => 'E', 'Ë' => 'E', 'Ē' => 'E',
'Ę' => 'E', 'Ě' => 'E', 'Ĕ' => 'E', 'Ė' => 'E', 'Ĝ' => 'G', 'Ğ' => 'G',
'Ġ' => 'G', 'Ģ' => 'G', 'Ĥ' => 'H', 'Ħ' => 'H', 'Ì' => 'I', 'Í' => 'I',
'Î' => 'I', 'Ï' => 'I', 'Ī' => 'I', 'Ĩ' => 'I', 'Ĭ' => 'I', 'Į' => 'I',
'İ' => 'I', 'Ĳ' => 'IJ', 'Ĵ' => 'J', 'Ķ' => 'K', 'Ł' => 'K', 'Ľ' => 'K',
'Ĺ' => 'K', 'Ļ' => 'K', 'Ŀ' => 'K', 'Ñ' => 'N', 'Ń' => 'N', 'Ň' => 'N',
'Ņ' => 'N', 'Ŋ' => 'N', 'Ò' => 'O', 'Ó' => 'O', 'Ô' => 'O', 'Õ' => 'O',
'Ö' => 'Oe', 'Ö' => 'Oe', 'Ø' => 'O', 'Ō' => 'O', 'Ő' => 'O', 'Ŏ' => 'O',
'Œ' => 'OE', 'Ŕ' => 'R', 'Ř' => 'R', 'Ŗ' => 'R', 'Ś' => 'S', 'Š' => 'S',
'Ş' => 'S', 'Ŝ' => 'S', 'Ș' => 'S', 'Ť' => 'T', 'Ţ' => 'T', 'Ŧ' => 'T',
'Ț' => 'T', 'Ù' => 'U', 'Ú' => 'U', 'Û' => 'U', 'Ü' => 'Ue', 'Ū' => 'U',
'Ü' => 'Ue', 'Ů' => 'U', 'Ű' => 'U', 'Ŭ' => 'U', 'Ũ' => 'U', 'Ų' => 'U',
'Ŵ' => 'W', 'Ý' => 'Y', 'Ŷ' => 'Y', 'Ÿ' => 'Y', 'Ź' => 'Z', 'Ž' => 'Z',
'Ż' => 'Z', 'Þ' => 'T', 'à' => 'a', 'á' => 'a', 'â' => 'a', 'ã' => 'a',
'ä' => 'ae', 'ä' => 'ae', 'å' => 'a', 'ā' => 'a', 'ą' => 'a', 'ă' => 'a',
'æ' => 'ae', 'ç' => 'c', 'ć' => 'c', 'č' => 'c', 'ĉ' => 'c', 'ċ' => 'c',
'ď' => 'd', 'đ' => 'd', 'ð' => 'd', 'è' => 'e', 'é' => 'e', 'ê' => 'e',
'ë' => 'e', 'ē' => 'e', 'ę' => 'e', 'ě' => 'e', 'ĕ' => 'e', 'ė' => 'e',
'ƒ' => 'f', 'ĝ' => 'g', 'ğ' => 'g', 'ġ' => 'g', 'ģ' => 'g', 'ĥ' => 'h',
'ħ' => 'h', 'ì' => 'i', 'í' => 'i', 'î' => 'i', 'ï' => 'i', 'ī' => 'i',
'ĩ' => 'i', 'ĭ' => 'i', 'į' => 'i', 'ı' => 'i', 'ĳ' => 'ij', 'ĵ' => 'j',
'ķ' => 'k', 'ĸ' => 'k', 'ł' => 'l', 'ľ' => 'l', 'ĺ' => 'l', 'ļ' => 'l',
'ŀ' => 'l', 'ñ' => 'n', 'ń' => 'n', 'ň' => 'n', 'ņ' => 'n', 'ŉ' => 'n',
'ŋ' => 'n', 'ò' => 'o', 'ó' => 'o', 'ô' => 'o', 'õ' => 'o', 'ö' => 'oe',
'ö' => 'oe', 'ø' => 'o', 'ō' => 'o', 'ő' => 'o', 'ŏ' => 'o', 'œ' => 'oe',
'ŕ' => 'r', 'ř' => 'r', 'ŗ' => 'r', 'š' => 's', 'ù' => 'u', 'ú' => 'u',
'û' => 'u', 'ü' => 'ue', 'ū' => 'u', 'ü' => 'ue', 'ů' => 'u', 'ű' => 'u',
'ŭ' => 'u', 'ũ' => 'u', 'ų' => 'u', 'ŵ' => 'w', 'ý' => 'y', 'ÿ' => 'y',
'ŷ' => 'y', 'ž' => 'z', 'ż' => 'z', 'ź' => 'z', 'þ' => 't', 'ß' => 'ss',
'ſ' => 'ss', 'ый' => 'iy', 'А' => 'A', 'Б' => 'B', 'В' => 'V', 'Г' => 'G',
'Д' => 'D', 'Е' => 'E', 'Ё' => 'YO', 'Ж' => 'ZH', 'З' => 'Z', 'И' => 'I',
'Й' => 'Y', 'К' => 'K', 'Л' => 'L', 'М' => 'M', 'Н' => 'N', 'О' => 'O',
'П' => 'P', 'Р' => 'R', 'С' => 'S', 'Т' => 'T', 'У' => 'U', 'Ф' => 'F',
'Х' => 'H', 'Ц' => 'C', 'Ч' => 'CH', 'Ш' => 'SH', 'Щ' => 'SCH', 'Ъ' => '',
'Ы' => 'Y', 'Ь' => '', 'Э' => 'E', 'Ю' => 'YU', 'Я' => 'YA', 'а' => 'a',
'б' => 'b', 'в' => 'v', 'г' => 'g', 'д' => 'd', 'е' => 'e', 'ё' => 'yo',
'ж' => 'zh', 'з' => 'z', 'и' => 'i', 'й' => 'y', 'к' => 'k', 'л' => 'l',
'м' => 'm', 'н' => 'n', 'о' => 'o', 'п' => 'p', 'р' => 'r', 'с' => 's',
'т' => 't', 'у' => 'u', 'ф' => 'f', 'х' => 'h', 'ц' => 'c', 'ч' => 'ch',
'ш' => 'sh', 'щ' => 'sch', 'ъ' => '', 'ы' => 'y', 'ь' => '', 'э' => 'e',
'ю' => 'yu', 'я' => 'ya'
];
// make a human readable string
$text = strtr($text, $replace);
// replace non letter or digits by -
$text = preg_replace('~[^\pL\d.]+~u', '-', $text);
// trim
$text = trim($text, '-');
// remove unwanted characters
$text = preg_replace('~[^-\w.]+~', '', $text);
return strtolower($text);
}
// Fourth test
// /sf/answers/206886501/
function slugagain($string){
$table = [
'Š'=>'S', 'š'=>'s', 'Đ'=>'Dj', 'đ'=>'dj', 'Ž'=>'Z', 'ž'=>'z', 'Č'=>'C', 'č'=>'c', 'Ć'=>'C', 'ć'=>'c',
'À'=>'A', 'Á'=>'A', 'Â'=>'A', 'Ã'=>'A', 'Ä'=>'A', 'Å'=>'A', 'Æ'=>'A', 'Ç'=>'C', 'È'=>'E', 'É'=>'E',
'Ê'=>'E', 'Ë'=>'E', 'Ì'=>'I', 'Í'=>'I', 'Î'=>'I', 'Ï'=>'I', 'Ñ'=>'N', 'Ò'=>'O', 'Ó'=>'O', 'Ô'=>'O',
'Õ'=>'O', 'Ö'=>'O', 'Ø'=>'O', 'Ù'=>'U', 'Ú'=>'U', 'Û'=>'U', 'Ü'=>'U', 'Ý'=>'Y', 'Þ'=>'B', 'ß'=>'Ss',
'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c', 'è'=>'e', 'é'=>'e',
'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i', 'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o',
'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o', 'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ý'=>'y', 'ý'=>'y', 'þ'=>'b',
'ÿ'=>'y', 'Ŕ'=>'R', 'ŕ'=>'r', ' '=>'-'
];
return strtr($string, $table);
}
// Fifth test
// /sf/answers/1917776311/
function slugifybis($url){
$url = trim($url);
$url = str_replace(' ', '-', $url);
$url = str_replace('/', '-slash-', $url);
return rawurlencode($url);
}
// Sixth and last test
// /sf/answers/2760942411/
setlocale( LC_ALL, "en_US.UTF8" );
function slugifyagain($string){
$string = iconv('utf-8', 'us-ascii//translit//ignore', $string); // transliterate
$string = str_replace("'", '', $string);
$string = preg_replace('~[^\pL\d]+~u', '-', $string); // replace non letter or non digits by "-"
$string = preg_replace('~[^-\w]+~', '', $string); // remove unwanted characters
$string = preg_replace('~-+~', '-', $string); // remove duplicate "-"
$string = trim($string, '-'); // trim "-"
$string = trim($string); // trim
$string = mb_strtolower($string, 'utf-8'); // lowercase
return urlencode($string); // safe;
};
$string = $newString = "¿ Àñdréß l'affreux ğarçon & nøël en forêt !";
$max = 10000;
echo '<pre>';
echo 'Beginning :';
echo '<br />';
echo '<br />';
echo '> Slugging '.$max.' iterations of following :';
echo '<br />';
echo '>> ' . $string;
echo '<br />';
echo '<br />';
echo 'Output results :';
echo '<br />';
echo '<br />';
$start = microtime(true);
for($i = 0 ; $i < $max ; $i++){
$newString = slugify($string);
}
$time = (microtime(true) - $start) * 1000;
echo '> First test passed in **' . round($time, 2) . 'ms**';
echo '<br />';
echo '>> Result : ' . $newString;
echo '<br />';
echo '<br />';
$start = microtime(true);
for($i = 0 ; $i < $max ; $i++){
$newString = slug($string);
}
$time = (microtime(true) - $start) * 1000;
echo '> Second test passed in **' . round($time, 2) . 'ms**';
echo '<br />';
echo '>> Result : ' . $newString;
echo '<br />';
echo '<br />';
$start = microtime(true);
for($i = 0 ; $i < $max ; $i++){
$newString = slugbis($string);
}
$time = (microtime(true) - $start) * 1000;
echo '> Third test passed in **' . round($time, 2) . 'ms**';
echo '<br />';
echo '>> Result : ' . $newString;
echo '<br />';
echo '<br />';
$start = microtime(true);
for($i = 0 ; $i < $max ; $i++){
$newString = slugagain($string);
}
$time = (microtime(true) - $start) * 1000;
echo '> Fourth test passed in **' . round($time, 2) . 'ms**';
echo '<br />';
echo '>> Result : ' . $newString;
echo '<br />';
echo '<br />';
$start = microtime(true);
for($i = 0 ; $i < $max ; $i++){
$newString = slugifybis($string);
}
$time = (microtime(true) - $start) * 1000;
echo '> Fifth test passed in **' . round($time, 2) . 'ms**';
echo '<br />';
echo '>> Result : ' . $newString;
echo '<br />';
echo '<br />';
$start = microtime(true);
for($i = 0 ; $i < $max ; $i++){
$newString = slugifyagain($string);
}
$time = (microtime(true) - $start) * 1000;
echo '> Sixth test passed in **' . round($time, 2) . 'ms**';
echo '<br />';
echo '>> Result : ' . $newString;
echo '</pre>';
Beginning :
> Slugging 10000 iterations of following :
>> ¿ Àñdréß l'affreux ğarçon & nøël en forêt !
Output results :
> First test passed in 120.78ms
>> Result : -iquest-andresz-laffreux-arcon-and-noel-en-foret-
> Second test passed in 3883.82ms
>> Result : -andreß-laffreux-garcon--nøel-en-foret-
> Third test passed in 56.83ms
>> Result : andress-l-affreux-garcon-noel-en-foret
> Fourth test passed in 18.93ms
>> Result : ¿-AndreSs-l'affreux-ğarcon-&-noel-en-foret-!
> Fifth test passed in 6.45ms
>> Result : %C2%BF-%C3%80%C3%B1dr%C3%A9%C3%9F-l%27affreux-%C4%9Far%C3%A7on-%26-n%C3%B8%C3%ABl-en-for%C3%AAt-%21
> Sixth test passed in 112.42ms
>> Result : andress-laffreux-garcon-n-el-en-foret
Further tests are needed.
Edit: the same test with fewer iterations
Beginning :
> Slugging 100 iterations of following :
>> ¿ Àñdréß l'affreux ğarçon & nøël en forêt !
Output results :
> First test passed in 1.72ms
>> Result : -iquest-andresz-laffreux-arcon-and-noel-en-foret-
> Second test passed in 48.59ms
>> Result : -andreß-laffreux-garcon--nøel-en-foret-
> Third test passed in 0.91ms
>> Result : andress-l-affreux-garcon-noel-en-foret
> Fourth test passed in 0.3ms
>> Result : ¿-AndreSs-l'affreux-ğarcon-&-noel-en-foret-!
> Fifth test passed in 0.14ms
>> Result : %C2%BF-%C3%80%C3%B1dr%C3%A9%C3%9F-l%27affreux-%C4%9Far%C3%A7on-%26-n%C3%B8%C3%ABl-en-for%C3%AAt-%21
> Sixth test passed in 1.4ms
>> Result : andress-laffreux-garcon-n-el-en-foret
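The six timing loops in the script are identical apart from the function being called, so they could be folded into one loop over callables. The following is a sketch under assumptions: bench and its parameters are hypothetical names, hrtime() requires PHP 7.3 or later, and the two candidates are stand-ins rather than the functions benchmarked above.

```php
<?php
// Hypothetical harness: $candidates maps a label to any callable taking a string.
function bench(array $candidates, string $input, int $iterations = 1000): array
{
    $report = [];
    foreach ($candidates as $label => $fn) {
        $result = null;
        $start = hrtime(true); // nanoseconds from a monotonic clock
        for ($i = 0; $i < $iterations; $i++) {
            $result = $fn($input);
        }
        $report[$label] = [
            'ms'     => (hrtime(true) - $start) / 1e6,
            'result' => $result,
        ];
    }
    return $report;
}

// Stand-in candidates, for illustration only.
$report = bench([
    'rawurlencode' => 'rawurlencode',
    'regex'        => fn (string $s): string => trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($s)), '-'),
], 'Hello World', 100);

foreach ($report as $label => $row) {
    printf("%-14s %8.3f ms  %s\n", $label, $row['ms'], $row['result']);
}
```

Timings from a shared online sandbox vary from run to run, so the relative order matters more than the absolute numbers.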