JavaScript slug function that also works for non-Latin characters

pau*_*dru 2 javascript url friendly-url slug

I found a slug function that looks like this:

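function slug(string) {
    return string.toString().toLowerCase()
        .replace(/\s+/g, '-')       // whitespace runs become a dash
        .replace(/[^\w\-]+/g, '')   // drop everything except [A-Za-z0-9_] and dashes
        .replace(/\-\-+/g, '-')     // collapse repeated dashes
        .replace(/^-+/, '')         // trim dashes from the start
        .replace(/-+$/, '');        // trim dashes from the end
}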

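However, it does not seem to work for Russian, Greek, and other non-Latin characters. They are removed in the step .replace(/[^\w\-]+/g, ''), which I don't want. I do still want to remove other special characters that are not regular letters in any language.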
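To illustrate why that step drops them: in JavaScript, `\w` matches only the ASCII set `[A-Za-z0-9_]`, so every Cyrillic, Greek, or accented Latin letter falls into `[^\w\-]`. The short check below is a sketch; the helper names and sample strings are made up for the demonstration.

```javascript
// \w is ASCII-only in JavaScript: [A-Za-z0-9_]
const stripNonWord = (s) => s.replace(/[^\w\-]+/g, '');

console.log(stripNonWord('rains'));  // 'rains' (plain Latin letters survive)
console.log(stripNonWord('дождь'));  // ''      (every Cyrillic letter is removed)
console.log(stripNonWord('prší'));   // 'pr'    (accented Latin letters are removed too)

// With the u flag, \p{L} matches a letter in any script and \p{N} any digit
const stripNonLetters = (s) => s.replace(/[^\p{L}\p{N}\-]+/gu, '');
console.log(stripNonLetters('дождь')); // 'дождь'
```

Unicode property escapes such as `\p{L}` need an ES2018-capable engine.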

Examples:

English| Do you know it rains?|do-you-know-it-rains

Czech| víš, že prší?|vis-ze-prsi

Romanian| Ști că plouă?|sti-ca-ploua

Russian| Ты знаешь, что идет дождь?|ты-знаешь-что-идет-дождь

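Note:

For Latin letters I want to keep the letter but strip the diacritic. Non-Latin letters should stay as they are (I don't want to transliterate them into Latin characters).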

jo_*_*_va 5

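Here is an approach that works for special characters. Using an array of objects, you can group every special character you want to replace under the Latin character that will replace it.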

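However, to keep Greek and Russian intact, you need a regex that treats Greek and Russian letters as word characters. So after replacing the special characters with the trick above, remove all non-word characters with the following regex: [^-a-zа-я\u0370-\u03ff\u1f00-\u1fff]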

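This regex allows the dash, the Latin characters a-z, the Cyrillic letters а-я, and finally \u0370-\u03ff\u1f00-\u1fff, the Unicode ranges for Greek and Greek Extended characters.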
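A quick way to see what this character class keeps and drops is to run it on a few short strings. The sketch below uses made-up sample inputs and a hypothetical helper named `keep`. It also shows two side effects of the class as written: ё (U+0451) lies outside the range а-я (U+0430 to U+044F), and digits are not included.

```javascript
// The class from the answer: dash, a-z, Cyrillic а-я, Greek and Greek Extended
const notAllowed = /[^-a-zа-я\u0370-\u03ff\u1f00-\u1fff]+/g;
const keep = (s) => s.replace(notAllowed, '');

console.log(keep('дождь'));   // 'дождь'  (inside а-я)
console.log(keep('ελλαδα'));  // 'ελλαδα' (inside \u0370-\u03ff)
console.log(keep('ёж'));      // 'ж'      (ё is outside а-я)
console.log(keep('top-10'));  // 'top-'   (digits are not in the class)
```

If digits or ё should survive, adding `0-9` and `ё` to the class would be one way to do it. That extension is an assumption about what you might want, not part of the answer above.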

You can add more special characters to the sets using this Wikipedia language recognition chart.

function slugify(text) {
  text = text.toString().toLowerCase().trim();

  const sets = [
    {to: 'a', from: '[ÀÁÂÃÄÅÆĀĂĄẠẢẤẦẨẪẬẮẰẲẴẶ]'},
    {to: 'c', from: '[ÇĆĈČ]'},
    {to: 'd', from: '[ÐĎĐÞ]'},
    {to: 'e', from: '[ÈÉÊËĒĔĖĘĚẸẺẼẾỀỂỄỆ]'},
    {to: 'g', from: '[ĜĞĢǴ]'},
    {to: 'h', from: '[ĤḦ]'},
    {to: 'i', from: '[ÌÍÎÏĨĪĮİỈỊ]'},
    {to: 'j', from: '[Ĵ]'},
    {to: 'ij', from: '[Ĳ]'},
    {to: 'k', from: '[Ķ]'},
    {to: 'l', from: '[ĹĻĽŁ]'},
    {to: 'm', from: '[Ḿ]'},
    {to: 'n', from: '[ÑŃŅŇ]'},
    {to: 'o', from: '[ÒÓÔÕÖØŌŎŐỌỎỐỒỔỖỘỚỜỞỠỢǪǬƠ]'},
    {to: 'oe', from: '[Œ]'},
    {to: 'p', from: '[ṕ]'},
    {to: 'r', from: '[ŔŖŘ]'},
    {to: 's', from: '[ßŚŜŞŠȘ]'},
    {to: 't', from: '[ŢŤ]'},
    {to: 'u', from: '[ÙÚÛÜŨŪŬŮŰŲỤỦỨỪỬỮỰƯ]'},
    {to: 'w', from: '[ẂŴẀẄ]'},
    {to: 'x', from: '[ẍ]'},
    {to: 'y', from: '[ÝŶŸỲỴỶỸ]'},
    {to: 'z', from: '[ŹŻŽ]'},
    {to: '-', from: '[·/_,:;\']'}
  ];

  sets.forEach(set => {
    text = text.replace(new RegExp(set.from,'gi'), set.to)
  });

  return text
    .replace(/\s+/g, '-')    // Replace spaces with -
    .replace(/[^-a-zа-я\u0370-\u03ff\u1f00-\u1fff]+/g, '') // Remove all non-word chars
    .replace(/--+/g, '-')    // Replace multiple - with single -
    .replace(/^-+/, '')      // Trim - from start of text
    .replace(/-+$/, '')      // Trim - from end of text
}

console.log(slugify('Do you know it rains?'));
console.log(slugify('víš, že prší?'));
console.log(slugify('Ști că plouă?'));
console.log(slugify('Ты знаешь, что идет дождь?'));
console.log(slugify('??? ????? ????? ?? ??????'));
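As an alternative to a hand-maintained table, the requirement "strip diacritics from Latin letters, leave other scripts alone" can also be sketched with Unicode normalization. This is not the method of the answer above. The function name `slugifyUnicode` is hypothetical, and the code assumes an engine with ES2018 Unicode property escapes. Letters with no canonical decomposition (for example ß, ł, ø, đ) are kept as-is here, whereas the table approach maps them explicitly.

```javascript
// Sketch: strip diacritics from Latin letters only, keep other scripts untouched
function slugifyUnicode(text) {
  return String(text)
    .normalize('NFD')                            // split letters from their combining marks
    .replace(/(\p{Script=Latin})\p{M}+/gu, '$1') // drop marks that directly follow a Latin letter
    .normalize('NFC')                            // recompose what is left (e.g. Cyrillic й)
    .toLowerCase()
    .replace(/[^\p{L}\p{N}]+/gu, '-')            // each run of non-letters/non-digits becomes one dash
    .replace(/^-+|-+$/g, '');                    // trim dashes at both ends
}

console.log(slugifyUnicode('Do you know it rains?'));      // do-you-know-it-rains
console.log(slugifyUnicode('víš, že prší?'));              // vis-ze-prsi
console.log(slugifyUnicode('Ты знаешь, что идет дождь?')); // ты-знаешь-что-идет-дождь
```

Unlike the fixed character class, this version keeps digits and letters from any script, which may or may not be what you want for your URLs.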