wus*_*her 245 regex language-agnostic seo friendly-url slug
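What is a good, complete regular expression, or some other process, that would take the title:
How do you change a title to be part of the URL like Stack Overflow?
and turn it into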
how-do-you-change-a-title-to-be-part-of-the-url-like-stack-overflow
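that is used in the SEO-friendly URLs on Stack Overflow?
The development environment I am using is Ruby on Rails, but if there are other platform-specific solutions (.NET, PHP, Django), I would love to see those too.
I am sure I (or another reader) will come across the same problem on a different platform down the line.
I am using custom routes, and I mainly want to know how to alter the string so that all special characters are removed, it is all lowercase, and all whitespace is replaced.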
Jef*_*ood 292
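Here's how we do it. Note that there are probably more edge conditions than you realize at first glance.
This is the second version, unrolled for 5x more performance (and yes, I benchmarked it). I figured I'd optimize it because this function can be called hundreds of times per page.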
// requires: using System.Text; (for StringBuilder)

/// <summary>
/// Produces optional, URL-friendly version of a title, "like-this-one".
/// hand-tuned for speed, reflects performance refactoring contributed
/// by John Gietzen (user otac0n)
/// </summary>
public static string URLFriendly(string title)
{
    if (title == null) return "";

    const int maxlen = 80;
    int len = title.Length;
    bool prevdash = false;
    var sb = new StringBuilder(len);
    char c;

    for (int i = 0; i < len; i++)
    {
        c = title[i];
        if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
        {
            sb.Append(c);
            prevdash = false;
        }
        else if (c >= 'A' && c <= 'Z')
        {
            // tricky way to convert to lowercase:
            // setting bit 5 maps ASCII 'A'-'Z' onto 'a'-'z'
            sb.Append((char)(c | 32));
            prevdash = false;
        }
        else if (c == ' ' || c == ',' || c == '.' || c == '/' ||
            c == '\\' || c == '-' || c == '_' || c == '=')
        {
            // collapse runs of separators into one dash, never a leading one
            if (!prevdash && sb.Length > 0)
            {
                sb.Append('-');
                prevdash = true;
            }
        }
        else if ((int)c >= 128)
        {
            int prevlen = sb.Length;
            sb.Append(RemapInternationalCharToAscii(c));
            if (prevlen != sb.Length) prevdash = false;
        }
        // note: this bounds the number of input characters examined
        // (maxlen + 1), not the length of the resulting slug
        if (i == maxlen) break;
    }

    // drop a trailing dash left by a separator at the end
    if (prevdash)
        return sb.ToString().Substring(0, sb.Length - 1);
    else
        return sb.ToString();
}
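To see the previous version of the code that this one replaced (it is functionally equivalent, but this version is 5x faster), view the revision history of this post (click the date link).
Also, the RemapInternationalCharToAscii method source code can be found here.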
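Since the question was asked from a Ruby on Rails environment, here is a rough plain-Ruby sketch of the same single-pass idea. It is not Stack Overflow's code: the names `url_friendly` and `SEPARATORS` are made up for illustration, non-ASCII characters are simply dropped instead of being remapped, and the length cap is applied to the output rather than to the input.

```ruby
# Hypothetical Ruby port of the single-pass approach (illustrative names).
SEPARATORS = [' ', ',', '.', '/', '\\', '-', '_', '='].freeze

def url_friendly(title, max_len = 80)
  return '' if title.nil?

  slug = +''
  prev_dash = false
  title.each_char do |c|
    if c.match?(/[a-z0-9]/)
      slug << c
      prev_dash = false
    elsif c.match?(/[A-Z]/)
      slug << c.downcase
      prev_dash = false
    elsif SEPARATORS.include?(c)
      # Collapse runs of separators into one dash, never a leading one.
      if !prev_dash && !slug.empty?
        slug << '-'
        prev_dash = true
      end
    end
    # Anything else (other punctuation, non-ASCII) is dropped in this sketch.
    break if slug.length >= max_len
  end
  # Remove a dash left behind by a trailing separator.
  prev_dash ? slug.chomp('-') : slug
end

puts url_friendly('How do you change a title to be part of the URL like Stack Overflow?')
# prints: how-do-you-change-a-title-to-be-part-of-the-url-like-stack-overflow
```

The `prev_dash` flag does the same job as in the C# version: it keeps "a, b" from becoming "a--b" and lets the final step trim a dangling dash.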
Dan*_*anH 31
Here's my version of Jeff's code. I've made the following changes:
Case conversion is now also optional.
using System;
using System.Text;

public static class Slug
{
    public static string Create(bool toLower, params string[] values)
    {
        return Create(toLower, String.Join("-", values));
    }

    /// <summary>
    /// Creates a slug.
    /// References:
    /// http://www.unicode.org/reports/tr15/tr15-34.html
    /// https://meta.stackexchange.com/questions/7435/non-us-ascii-characters-dropped-from-full-profile-url/7696#7696
    /// https://stackoverflow.com/questions/25259/how-do-you-include-a-webpage-title-as-part-of-a-webpage-url/25486#25486
    /// https://stackoverflow.com/questions/3769457/how-can-i-remove-accents-on-a-string
    /// </summary>
    /// <param name="toLower"></param>
    /// <param name="value"></param>
    /// <returns></returns>
    public static string Create(bool toLower, string value)
    {
        if (value == null)
            return "";

        // Compatibility decomposition: accented letters become
        // base letter + combining mark, and the marks are dropped below
        var normalised = value.Normalize(NormalizationForm.FormKD);

        const int maxlen = 80;
        int len = normalised.Length;
        bool prevDash = false;
        var sb = new StringBuilder(len);
        char c;

        for (int i = 0; i < len; i++)
        {
            c = normalised[i];
            if ((c >= 'a' && c <= 'z') || (c >= '0' && c <= '9'))
            {
                if (prevDash)
                {
                    sb.Append('-');
                    prevDash = false;
                }
                sb.Append(c);
            }
            else if (c >= 'A' && c <= 'Z')
            {
                if (prevDash)
                {
                    sb.Append('-');
                    prevDash = false;
                }
                // Tricky way to convert to lowercase
                if (toLower)
                    sb.Append((char)(c | 32));
                else
                    sb.Append(c);
            }
            else if (c == ' ' || c == ',' || c == '.' || c == '/' || c == '\\' || c == '-' || c == '_' || c == '=')
            {
                // Only remember that a dash is pending; it is written
                // when the next real character arrives, so the slug
                // can never end with a dash
                if (!prevDash && sb.Length > 0)
                {
                    prevDash = true;
                }
            }
            else
            {
                string swap = ConvertEdgeCases(c, toLower);
                if (swap != null)
                {
                    if (prevDash)
                    {
                        sb.Append('-');
                        prevDash = false;
                    }
                    sb.Append(swap);
                }
            }

            // >= rather than ==: a two-character swap ("ss", "th")
            // can step over maxlen
            if (sb.Length >= maxlen)
                break;
        }
        return sb.ToString();
    }

    // Letters that do not decompose under normalisation
    static string ConvertEdgeCases(char c, bool toLower)
    {
        string swap = null;
        switch (c)
        {
            case 'ı':
                swap = "i";
                break;
            case 'ł':
                swap = "l";
                break;
            case 'Ł':
                swap = toLower ? "l" : "L";
                break;
            case 'đ':
                swap = "d";
                break;
            case 'ß':
                swap = "ss";
                break;
            case 'ø':
                swap = "o";
                break;
            case 'Þ':
                swap = "th";
                break;
        }
        return swap;
    }
}
For more details, the unit tests, and an explanation of why Facebook's URL scheme is a little smarter than Stack Overflow's, I have an expanded version of this on my blog.
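The normalisation step in the code above can be tried out in Ruby as well. The following is a minimal sketch, assuming Ruby 2.2 or later (for `String#unicode_normalize`); `strip_diacritics` is a made-up helper name, not part of any library.

```ruby
# Hypothetical helper: remove accents via Unicode compatibility decomposition.
def strip_diacritics(text)
  # NFKD splits "é" into "e" followed by U+0301 (combining acute accent).
  decomposed = text.unicode_normalize(:nfkd)
  # \p{Mn} matches nonspacing marks, i.e. the detached accents.
  decomposed.gsub(/\p{Mn}/, '')
end

puts strip_diacritics('éphémère')   # prints: ephemere
puts strip_diacritics('räksmörgås') # prints: raksmorgas
puts strip_diacritics('Łódź')       # prints: Łodz
```

The last line shows why a small lookup such as ConvertEdgeCases is still needed: letters like Ł, ø and ß have no decomposition, so normalisation alone leaves them untouched.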
Dal*_*gan 16
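You will want to set up a custom route to point the URL to the controller that will handle it. Since you are using Ruby on Rails, here is an introduction to using its routing engine.
In Ruby, you will need a regular expression, as you already know. Here is the regular expression to use: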
def permalink_for(str)
  str.gsub(/[^\w\/]|[!\(\)\.]+/, ' ').strip.downcase.gsub(/\ +/, '-')
end
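For comparison, here is a slightly stricter variant, offered only as a sketch: it keeps nothing but ASCII letters and digits, so slashes and underscores also become separators. The name `slugify` is illustrative and not from any library.

```ruby
# Hypothetical variant: lowercase, turn every run of other characters
# into a single dash, then trim dashes from both ends.
def slugify(str)
  str.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
end

puts slugify('How do you change a title to be part of the URL like Stack Overflow?')
# prints: how-do-you-change-a-title-to-be-part-of-the-url-like-stack-overflow
puts slugify('  C# / .NET: tips & tricks  ')
# prints: c-net-tips-tricks
```

Note that accented letters are not transliterated by this approach; they are treated like any other non-ASCII character and turned into separators.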
fij*_*ter 11
You can also use this JavaScript function for in-form generation of slugs (this one is based on/copied from Django):
function makeSlug(urlString, filter) {
    // Changes, e.g., "Petty theft" to "petty-theft".
    var removelist = [];
    if (filter) {
        // Remove all these words from the string before URLifying
        removelist = ["a", "an", "as", "at", "before", "but", "by", "for", "from",
                      "is", "in", "into", "like", "of", "off", "on", "onto", "per",
                      "since", "than", "the", "this", "that", "to", "up", "via", "het", "de", "een", "en",
                      "with"];
    }
    var s = urlString;
    if (removelist.length > 0) {
        // Match the listed words as whole words, case-insensitively
        var r = new RegExp('\\b(' + removelist.join('|') + ')\\b', 'gi');
        s = s.replace(r, '');
    }
    s = s.replace(/[^-\w\s]/g, '');  // Remove unneeded characters
    s = s.replace(/^\s+|\s+$/g, ''); // Trim leading/trailing spaces
    s = s.replace(/[-\s]+/g, '-');   // Convert spaces to hyphens
    s = s.toLowerCase();             // Convert to lowercase
    return s;
}
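The stop-word idea carries over to Ruby easily. This is only a sketch: the shortened `STOP_WORDS` list and the function name are placeholders, and the fallback for titles made up entirely of stop words is an extra that the JavaScript version does not have.

```ruby
# Hypothetical stop-word list; extend or replace it to suit your content.
STOP_WORDS = %w[a an as at but by for from in is of on the to with].freeze

def slug_without_stop_words(title)
  # Split into lowercase ASCII words, discarding all punctuation.
  words = title.downcase.scan(/[a-z0-9]+/)
  kept = words.reject { |w| STOP_WORDS.include?(w) }
  # If every word was a stop word, keep the original words instead
  # of returning an empty slug.
  kept = words if kept.empty?
  kept.join('-')
end

puts slug_without_stop_words('The Art of War') # prints: art-war
puts slug_without_stop_words('Petty theft')    # prints: petty-theft
```

Dropping stop words shortens URLs, but it also means two different titles can map to the same slug, so it is best combined with a unique id in the URL.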
For good measure, here is the PHP function in WordPress that does it... I'd think that WordPress is one of the more popular platforms that uses fancy links.
function sanitize_title_with_dashes($title) {
    $title = strip_tags($title);
    // Preserve escaped octets.
    $title = preg_replace('|%([a-fA-F0-9][a-fA-F0-9])|', '---$1---', $title);
    // Remove percent signs that are not part of an octet.
    $title = str_replace('%', '', $title);
    // Restore octets.
    $title = preg_replace('|---([a-fA-F0-9][a-fA-F0-9])---|', '%$1', $title);
    $title = remove_accents($title);
    if (seems_utf8($title)) {
        if (function_exists('mb_strtolower')) {
            $title = mb_strtolower($title, 'UTF-8');
        }
        $title = utf8_uri_encode($title, 200);
    }
    $title = strtolower($title);
    $title = preg_replace('/&.+?;/', '', $title); // kill entities
    $title = preg_replace('/[^%a-z0-9 _-]/', '', $title);
    $title = preg_replace('/\s+/', '-', $title);
    $title = preg_replace('|-+|', '-', $title);
    $title = trim($title, '-');
    return $title;
}
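This function, as well as some of the supporting functions, can be found in wp-includes/formatting.php.
I am not familiar with Ruby on Rails, but the following is (untested) PHP code. If you find it useful, you can probably translate it to Ruby on Rails very quickly.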
$sURL = "This is a title to convert to URL-format. It has 1 number in it!";
// To lower-case
$sURL = strtolower($sURL);
// Replace all non-word characters with spaces
$sURL = preg_replace("/\W+/", " ", $sURL);
// Remove trailing spaces (so we won't end with a separator)
$sURL = trim($sURL);
// Replace spaces with separators (hyphens)
$sURL = str_replace(" ", "-", $sURL);
echo $sURL;
// outputs: this-is-a-title-to-convert-to-url-format-it-has-1-number-in-it
I hope this helps.
If you are using Rails edge, you can rely on Inflector.parameterize - here's the example from the documentation:
class Person
  def to_param
    "#{id}-#{name.parameterize}"
  end
end

@person = Person.find(1)
# => #<Person id: 1, name: "Donald E. Knuth">

<%= link_to(@person.name, person_path(@person)) %>
# => <a href="/person/1-donald-e-knuth">Donald E. Knuth</a>
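To see why the "id-slug" shape is convenient, here is a plain-Ruby sketch that runs without Rails. The `Person` struct is a stand-in for a model, and the inline `gsub` chain is only a rough approximation of `parameterize`, not ActiveSupport's implementation.

```ruby
# Stand-in for an ActiveRecord model; only id and name are needed here.
Person = Struct.new(:id, :name) do
  def to_param
    # Rough approximation of parameterize (assumption, ASCII only).
    slug = name.downcase.gsub(/[^a-z0-9]+/, '-').gsub(/\A-+|-+\z/, '')
    "#{id}-#{slug}"
  end
end

person = Person.new(1, 'Donald E. Knuth')
param = person.to_param
puts param      # prints: 1-donald-e-knuth

# String#to_i reads the leading digits and ignores the rest,
# so the numeric id can be recovered from the URL segment.
puts param.to_i # prints: 1
```

Because the id comes first, the slug part is purely decorative: the record can still be looked up by id even if the title (and therefore the slug) changes later.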
Also, if you need to handle more exotic characters such as accents (éphémère) in previous versions of Rails, you can use a mixture of PermalinkFu and DiacriticsFu:
DiacriticsFu::escape("éphémère")
=> "ephemere"
DiacriticsFu::escape("räksmörgås")
=> "raksmorgas"