Fuzzy matching in Google Sheets

clu*_*o87 3 regex if-statement google-sheets google-apps-script google-sheets-formula

I am trying to compare two columns in Google Sheets using this formula in column C:

=if(A1=B1,"","Mismatch")

It works, but I am getting a lot of false positives:

A           B           C
Mary Jo     Mary Jo
Jay Tim     Tim Jay     Mismatch
Sam Ron     Sam Ron     Mismatch
Jack*Ma     Jack Ma     Mismatch

Any ideas on how to make this work?

fir*_*ast 5

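This uses a score-based approach to decide whether two values match. You can then classify each row as a match or a mismatch based on that score: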

Score Formula = getMatchScore(A1,B1)
Match Formula = if(C1<.7,"mismatch",)
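/**
 * Returns a similarity score between 0 (no match) and 1 (identical) for two strings.
 * Usable as a Google Sheets custom function: =getMatchScore(A1,B1)
 */
function getMatchScore(strA, strB, ignoreCase = true) {
  strA = String(strA);
  strB = String(strB);
  const normalizeCase = ignoreCase ? str => str.toLowerCase() : str => str;
  const splitWords = str => str.split(/\b/);
  // Escape regex metacharacters so tokens such as "*" are matched literally
  // instead of throwing or being interpreted as a pattern.
  const escapeRegExp = str => str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  let [maxLenStr, minLenStr] = strA.length > strB.length ? [strA, strB] : [strB, strA];

  maxLenStr = normalizeCase(maxLenStr);
  minLenStr = normalizeCase(minLenStr);

  const maxLength = maxLenStr.length;
  const minLength = minLenStr.length;
  // Two empty values are treated as identical; this also avoids dividing by zero below.
  if (maxLength === 0) return 1;

  // Length similarity: ratio of the shorter length to the longer one.
  const lenScore = minLength / maxLength;

  // Order similarity: share of positions holding the same character in both strings.
  const orderScore = Array.from(maxLenStr).reduce(
    (count, char, index) => char === minLenStr[index] ? count + 1 : count, 0
  ) / maxLength;

  const maxKeyWords = splitWords(maxLenStr);
  const minKeyWords = splitWords(minLenStr);

  // Keyword similarity: share of the shorter string's tokens found in the longer string.
  // Each matched token is removed so that it cannot be counted twice.
  const keywordScore = minKeyWords.reduce(({ score, searchWord }, word) => {
    const newSearchWord = searchWord.replace(new RegExp(escapeRegExp(word)), '');
    score += searchWord.length !== newSearchWord.length ? 1 : 0;

    return { score, searchWord: newSearchWord };
  }, { score: 0, searchWord: maxLenStr }).score / minKeyWords.length;

  // Character similarity: sort the tokens so word order no longer matters, then
  // count characters that also appear at the same or an adjacent position.
  const sortedMaxLenStr = Array.from(maxKeyWords.sort().join(''));
  const sortedMinLenStr = Array.from(minKeyWords.sort().join(''));

  const charScore = sortedMaxLenStr.reduce((count, char, index) => {
    const surroundingChars = [sortedMinLenStr[index - 1], sortedMinLenStr[index], sortedMinLenStr[index + 1]]
      .filter(c => c !== undefined);

    return surroundingChars.includes(char) ? count + 1 : count;
  }, 0) / maxLength;

  // Weighted sum of the four components; the weights add up to 1.
  const score = (lenScore * .15) + (orderScore * .25) + (charScore * .25) + (keywordScore * .35);

  return score;
}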
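
To see how the four components feed into the final number, the weighting on the last line of the function can be pulled out into a small helper. This is only a sketch: `combineScores` and `labelFor` are hypothetical names that do not appear in the code above, the component values in the usage line are made up for illustration, and the 0.7 cut-off is the one used in the Match formula.

```javascript
// Weights copied from the final line of getMatchScore; they sum to 1.
const WEIGHTS = { len: 0.15, order: 0.25, chars: 0.25, keyword: 0.35 };

// Hypothetical helper: combines four component scores (each between 0 and 1)
// into one weighted score, the same way getMatchScore does internally.
function combineScores({ len, order, chars, keyword }) {
  return len * WEIGHTS.len + order * WEIGHTS.order +
         chars * WEIGHTS.chars + keyword * WEIGHTS.keyword;
}

// Hypothetical helper: mirrors the Match formula, returning "mismatch"
// below the threshold and an empty string otherwise.
function labelFor(score, threshold = 0.7) {
  return score < threshold ? 'mismatch' : '';
}

// Illustrative component values (assumed, not measured from real data):
// same length and same words, but most characters in different positions.
const reordered = combineScores({ len: 1, order: 0.2, chars: 1, keyword: 1 });
console.log(reordered.toFixed(2), JSON.stringify(labelFor(reordered)));
```

With these assumed inputs the weighted sum comes out at about 0.80, so the row is not flagged even though the order component is low. Because the keyword component carries the largest weight (.35), swapped word order alone usually stays above the cut-off; lowering or raising the threshold in `labelFor` shifts how strict the comparison is.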