我正在尝试在PHP中编写一个函数,它接受一个字符串数组(needle)并执行与另一个字符串数组(haystack)的比较.此函数的目的是为AJAX搜索快速提供匹配的字符串,因此需要尽可能快.
这里有一些示例代码来说明这两个数组;
$needle = array('ba','hot','resta');
$haystack = array(
'Southern Hotel',
'Grange Restaurant & Hotel',
'Austral Hotel',
'Barsmith Hotel',
'Errestas'
);
Run Code Online (Sandbox Code Playgroud)
虽然这本身很容易,但比较的目的是计算有多少needle字符串出现在haystack.
但是,有三个限制;
needle必须只在单词的开头的字符匹配.例如,"hote"将匹配"Hotel",但"resta"将不匹配"Errestas".needles的数量,而不是needle出现次数.如果一个地方被命名为"酒店宾馆酒店",我们需要的结果1不是3.使用上面的例子,我们期望得到以下关联数组:
$haystack = array(
'Southern Hotel' => 1,
'Grange Restaurant & Hotel' => 2,
'Austral Hotel' => 1,
'Barsmith Hotel' => 2,
'Erresta' => 0
);
Run Code Online (Sandbox Code Playgroud)
我一直在尝试实现一个函数来执行此操作,使用一个preg_match_all()看起来像的正则表达式/(\A|\s)(ba|hot|resta)/.虽然这确保我们只匹配单词的开头,但它没有考虑包含相同needle两次的字符串.
我发帖看看别人是否有解决方案?
我发现你对问题的描述足够详细,我可以采用TDD方法来解决它.因此,因为我非常想成为一名TDD人,所以我编写了测试和函数来使测试通过.Namings可能并不完美,但它们很容易改变.函数的算法也可能不是最好的,但是现在有了测试,重构应该非常简单和轻松.
class MultiMatcherTest extends PHPUnit_Framework_TestCase
{
public function testTheComparisonIsCaseInsensitive()
{
$needles = array('hot');
$haystack = array('Southern Hotel');
$result = match($needles, $haystack);
$this->assertEquals(array('Southern Hotel' => 1), $result);
}
public function testNeedleMatchesOnlyCharsAtBeginningOfWord()
{
$needles = array('resta');
$haystack = array('Errestas');
$result = match($needles, $haystack);
$this->assertEquals(array('Errestas' => 0), $result);
}
public function testMatcherCountsNeedlesNotOccurences()
{
$needles = array('hot');
$haystack = array('Southern Hotel', 'Grange Restaurant & Hotel');
$expected = array('Southern Hotel' => 1,
'Grange Restaurant & Hotel' => 1);
$result = match($needles, $haystack);
$this->assertEquals($expected, $result);
}
public function testAcceptance()
{
$needles = array('ba','hot','resta');
$haystack = array(
'Southern Hotel',
'Grange Restaurant & Hotel',
'Austral Hotel',
'Barsmith Hotel',
'Errestas',
);
$expected = array(
'Southern Hotel' => 1,
'Grange Restaurant & Hotel' => 2,
'Austral Hotel' => 1,
'Barsmith Hotel' => 2,
'Errestas' => 0,
);
$result = match($needles, $haystack);
$this->assertEquals($expected, $result);
}
}
Run Code Online (Sandbox Code Playgroud)
function match($needles, $haystack)
{
// The default result will containg 0 (zero) occurences for all $haystacks
$result = array_combine($haystack, array_fill(0, count($haystack), 0));
foreach ($needles as $needle) {
foreach ($haystack as $subject) {
$words = str_word_count($subject, 1); // split into words
foreach ($words as $word) {
if (stripos($word, $needle) === 0) {
$result[$subject]++;
break;
}
}
}
}
return $result;
}
Run Code Online (Sandbox Code Playgroud)
break声明是否必要以下测试显示何时break需要.break在match函数内部使用和不使用语句运行此测试.
/**
* This test demonstrates the purpose of the BREAK statement in the
* implementation function. Without it, the needle will be matched twice.
* "hot" will be matched for each "Hotel" word.
*/
public function testMatcherCountsNeedlesNotOccurences2()
{
$needles = array('hot');
$haystack = array('Southern Hotel Hotel');
$expected = array('Southern Hotel Hotel' => 1);
$result = match($needles, $haystack);
$this->assertEquals($expected, $result);
}
Run Code Online (Sandbox Code Playgroud)