If I have two strings like these:
s1 = "This is a foo bar sentence ."
s2 = "This sentence is similar to a foo bar sentence ."
I want to split the strings into this format:
x1 = ["This":1,"is":1,"a":1,"bar":1,"sentence":1,"foo":1]
x2 = ["This":1,"is":1,"a":1,"bar":1,"sentence":2,"similar":1,"to":1,"foo":1]
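That is, each string is split into words and the words are counted, giving pairs in which the string is a word and the number is how many times that word occurs in the original string.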
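Remove punctuation, normalize whitespace, convert to lowercase, split on spaces, then loop over the words and count each occurrence into an index object.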
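function countWords(sentence) {
    var index = {},
        words = sentence
            .replace(/[.,?!;()"'-]/g, " ") // replace punctuation with spaces
            .replace(/\s+/g, " ")          // collapse runs of whitespace
            .trim()                        // drop leading/trailing space so split() adds no empty word
            .toLowerCase()
            .split(" ");
    words.forEach(function (word) {
        if (word === "") return;           // an empty sentence still splits into [""]
        if (!(index.hasOwnProperty(word))) {
            index[word] = 0;
        }
        index[word]++;
    });
    return index;
}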
Or, in ES6 arrow-function style:
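const countWords = sentence => sentence
    .replace(/[.,?!;()"'-]/g, " ")   // replace punctuation with spaces
    .replace(/\s+/g, " ")            // collapse runs of whitespace
    .toLowerCase()
    .split(" ")
    .filter(word => word !== "")     // skip empty strings left by leading/trailing spaces
    .reduce((index, word) => {
        if (!(index.hasOwnProperty(word))) index[word] = 0;
        index[word]++;
        return index;
    }, {});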
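
Since the question asks for word/count pairs, a variant that returns an array of pairs may be closer to the requested format. The following is only a sketch under a few assumptions: `countWordPairs` is a hypothetical name that does not appear in the answer above, words are taken to be runs of ASCII letters and digits (so apostrophes split words, as in the answer), and a `Map` is swapped in for the plain index object so that the pairs come out in first-seen order.

```javascript
// Hypothetical helper, not part of the original answer: returns an array of
// [word, count] pairs instead of an index object.
function countWordPairs(sentence) {
  // Assumption: a "word" is a run of ASCII letters or digits. Punctuation and
  // whitespace are never matched, so no empty words are produced.
  const words = sentence.toLowerCase().match(/[a-z0-9]+/g) || [];

  // A Map keeps keys in insertion order, i.e. the order of first appearance.
  const counts = new Map();
  for (const word of words) {
    counts.set(word, (counts.get(word) || 0) + 1);
  }

  // Array.from on a Map yields its [key, value] entries.
  return Array.from(counts);
}

const s1 = "This is a foo bar sentence .";
const s2 = "This sentence is similar to a foo bar sentence .";

console.log(countWordPairs(s1));
// pairs: ["this",1], ["is",1], ["a",1], ["foo",1], ["bar",1], ["sentence",1]
console.log(countWordPairs(s2));
// pairs: ["this",1], ["sentence",2], ["is",1], ["similar",1], ["to",1], ["a",1], ["foo",1], ["bar",1]
```

If an object is preferred over pairs, `Object.fromEntries(countWordPairs(s1))` should convert the result back, at the cost of the guaranteed `Map` semantics.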