Extract keywords from a string: JavaScript

Nik*_*ain 5 javascript arrays list

Suppose I have a string and want to extract its uncommon keywords for search engine optimization: $text = "This is some text. This is some text. Vending Machines are great.";

I will also define a list of common words to exclude from the extracted keywords, for example: $commonWords = ['i','a','about','an','and','are','as','at','be','by','com','de','en','for','from','how','in','is','it','la','of','on','or','that','the','this','to','was','what','when','where','who','will','with','und','the','www'];

Expected output: Result=[some,text,machines,vending]

I would be very grateful if someone could help with generic logic or a procedure for extracting keywords from a string.

Ali*_*ahi 5

This may help (it supports multiple languages):

https://github.com/michaeldelorenzo/keyword-extractor

// Assumes the package is installed (npm install keyword-extractor)
var keyword_extractor = require("keyword-extractor");

var sentence = "President Obama woke up Monday facing a Congressional defeat that many in both parties believed could hobble his presidency.";

// Extract the keywords
var extraction_result = keyword_extractor.extract(sentence, {
    language: "english",
    remove_digits: true,
    return_changed_case: true,
    remove_duplicates: false
});


Tom*_*Rup 2

Something like this:

var $commonWords = ['i','a','about','an','and','are','as','at','be','by','com','de','en','for','from','how','in','is','it','la','of','on','or','that','the','this','to','was','what','when','where','who','will','with','und','the','www'];
var $text = "This is some text. This is some text. Vending Machines are great.";

// Convert to lowercase
$text = $text.toLowerCase();

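// Remove unnecessary characters; keep only letters, digits, underscores and spaces
$text = $text.replace(/[^\w ]/g, '');

// Split on runs of whitespace so repeated spaces do not produce empty words
var result = $text.trim().split(/\s+/);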

// remove $commonWords
result = result.filter(function (word) {
    return $commonWords.indexOf(word) === -1;
});

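// Keep unique words only (arrays have no built-in unique() method)
result = result.filter(function (word, index, words) {
    return words.indexOf(word) === index;
});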

console.log(result);
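The steps in this answer (lowercase, strip punctuation, split, drop common words, deduplicate) could also be wrapped in one reusable function. The following is only a minimal sketch: the name extractKeywords and its signature are hypothetical and do not come from the thread or from any library. It assumes an environment with ES2015 support (Set, arrow functions).

```javascript
// Hypothetical helper: the name and signature are illustrative only.
function extractKeywords(text, commonWords) {
  // A Set gives fast membership checks for the words to ignore.
  const stopWords = new Set(commonWords.map((w) => w.toLowerCase()));

  const words = text
    .toLowerCase()
    // Replace anything that is not a word character or whitespace with a space,
    // so "text.This" would not be glued into one token.
    .replace(/[^\w\s]/g, " ")
    .split(/\s+/)
    // Drop empty tokens and the common words.
    .filter((word) => word.length > 0 && !stopWords.has(word));

  // A Set keeps the first occurrence of each word, in order of appearance.
  return Array.from(new Set(words));
}

const commonWords = ['i','a','about','an','and','are','as','at','be','by','com','de','en','for','from','how','in','is','it','la','of','on','or','that','the','this','to','was','what','when','where','who','will','with','und','the','www'];
const text = "This is some text. This is some text. Vending Machines are great.";

const keywords = extractKeywords(text, commonWords);
console.log(keywords);
```

With the word list from the question, "great" is kept as well, because it is not in the list of common words. To get exactly the expected output from the question, "great" would have to be added to that list.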