PHP code to generate a safe URL?

sil*_*npi 5 php regex string sanitization url-rewriting

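We need to generate a unique URL from a book's title, and the title can contain any character. How can we search and replace all the "invalid" characters so that we end up with a valid, tidy-looking URL?

For example: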

"The Great Book of PHP"

www.mysite.com/book/12345/the-great-book-of-php

"The Greatest !@#$ Book of PHP"

www.mysite.com/book/12345/the-greatest-book-of-php

"Funny title     "

www.mysite.com/book/12345/funny-title

Mez*_*Mez 15

Ah, slugification.

// This function expects the input to be UTF-8 encoded.
function slugify($text)
{
    // Replace each run of characters that are neither letters nor digits with a single -
    $text = preg_replace('/[^\\pL\d]+/u', '-', $text);

    // Trim out extra -'s
    $text = trim($text, '-');

    // Convert letters that we have left to the closest ASCII representation
    $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);

    // Make text lowercase
    $text = strtolower($text);

    // Strip out anything we haven't been able to convert
    $text = preg_replace('/[^-\w]+/', '', $text);

    return $text;
}

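This works well because it first uses each character's Unicode properties to decide whether it is a letter (or matches \d for a digit) and converts everything that is neither into -. It then trims the hyphens at both ends, transliterates to ASCII, lowercases the result, and does one more replacement to strip anything that could not be converted, cleaning up after itself. (Fabrik's test returns "arvizturo-tukorfurogep".)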

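I also tend to add a list of stop words so that they are removed from the slug: "the", "of", "or", "a", and so on. (But don't filter on length, or you will strip out things like "php".)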
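The stop-word idea could be sketched roughly as follows. This is a minimal sketch, not the answer's own code: the function name `slugify_without_stop_words` and the stop-word list are assumptions made for illustration, and for brevity it only handles ASCII letters and digits rather than repeating the UTF-8 steps of `slugify`.

```php
<?php
// Hypothetical helper: build a slug while dropping stop words.
// The default stop-word list is an illustrative assumption; tune it for your titles.
function slugify_without_stop_words(
    string $title,
    array $stopWords = ['the', 'of', 'or', 'a', 'an', 'and']
): string {
    // Lowercase, then split on runs of anything that is not an ASCII letter or digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower($title), -1, PREG_SPLIT_NO_EMPTY);

    // Remove exact matches only, so short words such as "php" survive.
    $kept = array_values(array_diff($words, $stopWords));

    // If every word was a stop word, fall back to the unfiltered words
    // rather than returning an empty slug.
    if ($kept === []) {
        $kept = $words;
    }

    return implode('-', $kept);
}

echo slugify_without_stop_words('The Great Book of PHP'), "\n";         // great-book-php
echo slugify_without_stop_words('The Greatest !@#$ Book of PHP'), "\n"; // greatest-book-php
```

Matching whole words against a list, instead of dropping words below some length, is what keeps "php" in the slug.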


Gum*_*mbo 7

If "invalid" means non-alphanumeric, you can do this:

function foo($str) {
    return trim(preg_replace('/[^a-z0-9]+/', '-', strtolower($str)), '-');
}

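This converts $str to lowercase, replaces every sequence of one or more non-alphanumeric characters with a single hyphen, and then removes any leading and trailing hyphens.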

var_dump(foo("The Great Book of PHP") === 'the-great-book-of-php');
var_dump(foo("The Greatest !@#$ Book of PHP") === 'the-greatest-book-of-php');
var_dump(foo("Funny title     ") === 'funny-title');


小智 0

Replace the special characters with spaces, then replace the spaces with "-". A plain string replacement, perhaps?
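One way this two-step replacement might look is sketched below. The function name `slug_by_replacement` and the list of special characters are assumptions for illustration, and the list is far from complete. The second step uses `preg_replace` rather than a plain `str_replace`, because replacing spaces one by one would leave runs of hyphens wherever several special characters sat together.

```php
<?php
// Hypothetical helper sketching the "replace, then replace again" idea.
// The $special list is an assumption for illustration and is not exhaustive.
function slug_by_replacement(string $title): string
{
    $special = ['!', '@', '#', '$', '%', '^', '&', '*', '(', ')',
                '?', ',', '.', ':', ';', '"', "'"];

    // Step 1: turn every listed special character into a space.
    $text = str_replace($special, ' ', $title);

    // Step 2: trim the ends, then collapse each run of whitespace into one hyphen.
    $text = preg_replace('/\s+/', '-', trim($text));

    return strtolower($text);
}

echo slug_by_replacement('The Greatest !@#$ Book of PHP'), "\n"; // the-greatest-book-of-php
echo slug_by_replacement('Funny title     '), "\n";              // funny-title
```

Because this lists the characters to remove, anything missing from the list passes straight through into the URL; the pattern-based answers above instead keep only letters and digits, which handles arbitrary titles more safely.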