How to check for partial similarity between two strings in PHP

Ily*_*bin 15 php string

Is there any function in PHP that checks the percentage of similarity between two strings?

For example, I have:

$string1="Hello how are you doing";
$string2= " hi, how are you";

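A call like function($string1, $string2) would return true, because the words "how", "are", "you" are present in both lines.

Or even better, it would return 60% similarity, because "how", "are", "you" make up 3/5 of $string1.

Does any function like this exist in PHP?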

Hug*_*ing 31

As it's a nice question, I put some effort into it:

<?php
$string1="Hello how are you doing";
$string2= " hi, how are you";

echo 'Compare result: ' . compareStrings($string1, $string2) . '%';
//60%


function compareStrings($s1, $s2) {
    //one is empty, so no result
    if (strlen($s1)==0 || strlen($s2)==0) {
        return 0;
    }

    //replace non-alphanumeric characters with spaces
    //I left "-" in case it's used to combine words
    $s1clean = preg_replace("/[^A-Za-z0-9-]/", ' ', $s1);
    $s2clean = preg_replace("/[^A-Za-z0-9-]/", ' ', $s2);

    //collapse repeated spaces and trim the ends,
    //so explode() does not produce empty "words"
    $s1clean = trim(preg_replace("/ +/", ' ', $s1clean));
    $s2clean = trim(preg_replace("/ +/", ' ', $s2clean));

    //nothing but punctuation in one of them, so no result
    if ($s1clean === '' || $s2clean === '') {
        return 0;
    }

    //create arrays
    $ar1 = explode(" ",$s1clean);
    $ar2 = explode(" ",$s2clean);
    $l1 = count($ar1);
    $l2 = count($ar2);

    //flip the arrays if needed so ar1 is always largest.
    if ($l2>$l1) {
        $t = $ar2;
        $ar2 = $ar1;
        $ar1 = $t;
    }

    //flip array 2, to make the words the keys
    $ar2 = array_flip($ar2);


    $maxwords = max($l1, $l2);
    $matches = 0;

    //find matching words
    foreach($ar1 as $word) {
        if (array_key_exists($word, $ar2))
            $matches++;
    }

    return ($matches / $maxwords) * 100;    
}
?>

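  • Finally an answer without the (in this case) useless `similar_text`. +1 (8 upvotes)
  • Both approaches have pros and cons. But keep in mind that you asked for similar words. Test `$string1="words super cool"; $string2="super cool words";`: mine gives a 100% match and similar_text only 62%. It's a question of what you check and how you check it. similar_text looks for the longest matching parts of the strings, so the order of the words and letters matters as well. That makes a big difference when two or more people try to say the same thing in their own words. (2 upvotes)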
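The order sensitivity described in the comment above can be checked with a short sketch. It assumes the English inputs `words super cool` and `super cool words`, and it uses a simplified word-overlap calculation as a stand-in for `compareStrings()`, not the answer's exact code.

```php
<?php
// Assumed inputs: the same three words in a different order.
$a = "words super cool";
$b = "super cool words";

// similar_text() returns the number of matching characters and writes
// the percentage into the third (by-reference) argument.
$chars = similar_text($a, $b, $percent);

// Word overlap that ignores order (simplified stand-in for compareStrings()).
$wordsA = explode(' ', $a);
$wordsB = explode(' ', $b);
$shared = count(array_intersect($wordsA, $wordsB));
$wordPercent = $shared / max(count($wordsA), count($wordsB)) * 100;

echo "similar_text: $chars chars, " . round($percent, 1) . "%\n"; // 10 chars, 62.5%
echo "word overlap: $wordPercent%\n";                             // 100%
```

similar_text only credits the block "super cool" here, because the remaining "words" sits on opposite sides of that block in the two strings.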

Ale*_*iri 9

As other answers have already said, you can use similar_text. Here's a demo:

$string1="Hello how are you doing" ;
$string2= " hi, how are you";

echo similar_text($string1, $string2, $perc); //12

echo $perc; //61.538461538462

It returns 12 (the number of matching characters) and sets $perc to the similarity percentage you asked for.
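To see where the 61.5% comes from, here is a small sketch that recomputes the percentage by hand. It relies on the percentage being twice the number of matching characters divided by the combined length of both strings; the variable names `$common` and `$manual` are only illustrative.

```php
<?php
$string1 = "Hello how are you doing";
$string2 = " hi, how are you";

// Number of matching characters, plus the built-in percentage in $perc.
$common = similar_text($string1, $string2, $perc);

// Manual version: 2 * common / (len1 + len2) * 100
$manual = 2 * $common / (strlen($string1) + strlen($string2)) * 100;

printf("%d common chars, %.2f%% built-in, %.2f%% manual\n", $common, $perc, $manual);
```

With these inputs that is 2 * 12 / (23 + 16) * 100. The 12 matching characters are the shared block " how are you", leading space included.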


Raf*_*shi 8

In addition to Alex Siri's answer, and according to the following article:

http://docstore.mik.ua/orelly/webprog/php/ch04_06.htm

PHP provides several functions that let you test whether two strings are approximately equal:

$string1="Hello how are you doing" ;
$string2= " hi, how are you";

SOUNDEX

if (soundex($string1) == soundex($string2)) {
  echo "similar";
} else {
  echo "not similar";
}
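soundex() produces a single four-character code, so calling it on a whole sentence mostly reflects the first word. A possible variation, sketched below, compares Soundex codes word by word. The helpers `soundexCodes()` and `phoneticOverlap()` are hypothetical names, not PHP built-ins, and the array-destructuring swap assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: Soundex code for every word in a string.
function soundexCodes(string $text): array {
    // Split on anything that is not a letter and drop empty pieces.
    $words = preg_split('/[^A-Za-z]+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    return array_map('soundex', $words);
}

// Hypothetical helper: percentage of words in the longer string whose
// Soundex code also appears among the words of the other string.
function phoneticOverlap(string $a, string $b): float {
    $codesA = soundexCodes($a);
    $codesB = soundexCodes($b);
    if (count($codesA) === 0 || count($codesB) === 0) {
        return 0.0;
    }
    // Make $codesA the longer list so the ratio never exceeds 100.
    if (count($codesB) > count($codesA)) {
        [$codesA, $codesB] = [$codesB, $codesA];
    }
    $hits = count(array_intersect($codesA, $codesB));
    return $hits / count($codesA) * 100;
}

echo phoneticOverlap("Hello how are you doing", " hi, how are you"), "%\n";
```

Because similar-sounding words share a code (for example "Robert" and "Rupert"), this is more forgiving than an exact word match, at the cost of occasional false positives.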

METAPHONE

if (metaphone($string1) == metaphone($string2)) {
  echo "similar";
} else {
  echo "not similar";
}

SIMILAR TEXT

$similarity = similar_text($string1, $string2);

LEVENSHTEIN

$distance = levenshtein($string1, $string2); 
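levenshtein() returns an edit distance (lower means more similar) rather than a percentage. One way to turn it into a 0 to 100 score is to scale by the longer string's length, as in the sketch below. The helper name `levenshteinPercent()` is hypothetical, and the scaling is just one common convention.

```php
<?php
// Hypothetical helper: convert an edit distance into a 0-100 similarity score.
function levenshteinPercent(string $a, string $b): float {
    $longest = max(strlen($a), strlen($b));
    if ($longest === 0) {
        return 100.0; // two empty strings are identical
    }
    // The distance can never exceed the longer length, so the result stays in 0..100.
    return (1 - levenshtein($a, $b) / $longest) * 100;
}

$string1 = "Hello how are you doing";
$string2 = " hi, how are you";

printf("%.1f%%\n", levenshteinPercent($string1, $string2));
printf("%.1f%%\n", levenshteinPercent("kitten", "sitting")); // distance 3 over length 7
```

Note that levenshtein() works on bytes, and before PHP 8.0 it rejected strings longer than 255 characters, so this suits short ASCII inputs best.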