什么是在PHP中的排序数组中插入元素的更好方法

pio*_*kkr 12 php arrays performance insert sorted

我最近将我的简历发送给一家雇用PHP开发人员的公司.如果我有足够的经验,他们会给我发回一个要解决的任务.

任务是这样的:

你有一个10k独特元素的数组,排序后代.写入生成此数组的函数,然后编写三个不同的函数,将新元素插入到数组中,插入数组之后仍将按顺序排序.编写一些代码来测量这些功能的速度.您不能使用PHP排序功能.

所以我编写了函数来生成数组和四个函数来将新元素插入到数组中.

/********** Generating array (because use of range() was to simple :)): *************/

function generateSortedArray($start = 300000, $elementsNum = 10000, $dev = 30){
    $arr = array();
    for($i = 1; $i <= $elementsNum; $i++){
        $rand = mt_rand(1, $dev);
        $start -= $rand;
        $arr[] = $start; 
    }
    return $arr;
}

/********************** Four insert functions: **************************/

// for loop, and array copying
function insert1(&$arr, $elem){
    if(empty($arr)){
        $arr[] = $elem;
        return true;
    }
    $c = count($arr);
    $lastIndex = $c - 1;
    $tmp = array();
    $inserted = false;
    for($i = 0; $i < $c; $i++){
        if(!$inserted && $arr[$i] <= $elem){
            $tmp[] = $elem;
            $inserted = true;
        }
        $tmp[] = $arr[$i];
        if($lastIndex == $i && !$inserted) $tmp[] = $elem;
    }
    $arr = $tmp;
    return true;
}

// new element inserted at the end of array 
// and moved up until correct place
function insert2(&$arr, $elem){
    $c = count($arr);
    array_push($arr, $elem);
    for($i = $c; $i > 0; $i--){
        if($arr[$i - 1] >= $arr[$i]) break;
        $tmp = $arr[$i - 1];
        $arr[$i - 1] = $arr[$i];
        $arr[$i] = $tmp;
    }
    return true;
}

// binary search for correct place + array_splice() to insert element
function insert3(&$arr, $elem){
    $startIndex = 0;
    $stopIndex = count($arr) - 1;
    $middle = 0;
    while($startIndex < $stopIndex){
        $middle = ceil(($stopIndex + $startIndex) / 2);
        if($elem > $arr[$middle]){
            $stopIndex = $middle - 1;
        }else if($elem <= $arr[$middle]){
            $startIndex = $middle;
        }
    }
    $offset = $elem >= $arr[$startIndex] ? $startIndex : $startIndex + 1; 
    array_splice($arr, $offset, 0, array($elem));
}

// for loop to find correct place + array_splice() to insert
function insert4(&$arr, $elem){
    $c = count($arr);
    $inserted = false;
    for($i = 0; $i < $c; $i++){
        if($elem >= $arr[$i]){
            array_splice($arr, $i, 0, array($elem));
            $inserted = true;
            break;
        }
    }
    if(!$inserted) $arr[] = $elem;
    return true;
}

/*********************** Speed tests: *************************/

// check if array is sorted descending
function checkIfArrayCorrect($arr, $expectedCount = null){
    $c = count($arr);
    if(isset($expectedCount) && $c != $expectedCount) return false;
    $correct = true;
    for($i = 0; $i < $c - 1; $i++){
        if(!isset($arr[$i + 1]) || $arr[$i] < $arr[$i + 1]){
            $correct = false;
            break;
        }
    }
    return $correct;
}
// claculates microtimetime diff
function timeDiff($startTime){
    $diff = microtime(true) - $startTime;
    return $diff;
}
// prints formatted execution time info
function showTime($func, $time){
    printf("Execution time of %s(): %01.7f s\n", $func, $time);
}
// generated elements num
$elementsNum = 10000;
// generate starting point
$start = 300000;
// generated elements random range    1 - $dev
$dev = 50;


echo "Generating array with descending order, $elementsNum elements, begining from $start\n";
$startTime = microtime(true);
$arr = generateSortedArray($start, $elementsNum, $dev);
showTime('generateSortedArray', timeDiff($startTime));

$step = 2;
echo "Generating second array using range range(), $elementsNum elements, begining from $start, step $step\n";
$startTime = microtime(true);
$arr2 = range($start, $start - $elementsNum * $step, $step);
showTime('range', timeDiff($startTime));

echo "Checking if array is correct\n";
$startTime = microtime(true);
$sorted = checkIfArrayCorrect($arr, $elementsNum);
showTime('checkIfArrayCorrect', timeDiff($startTime));

if(!$sorted) die("Array is not in descending order!\n");
echo "Array OK\n";

$toInsert = array();

// number of elements to insert from every range
$randElementNum = 20;

// some ranges of elements to insert near begining, middle and end of generated array
// start value => end value
$ranges = array(
    300000 => 280000,
    160000 => 140000,
    30000 => 0,
);
foreach($ranges as $from => $to){
    $values = array();
    echo "Generating $randElementNum random elements from range [$from - $to] to insert\n";
    while(count($values) < $randElementNum){
        $values[mt_rand($from, $to)] = 1;
    }
    $toInsert = array_merge($toInsert, array_keys($values));
}
// some elements to insert on begining and end of array
array_push($toInsert, 310000);
array_push($toInsert, -1000);

echo "Generated elements: \n";
for($i = 0; $i < count($toInsert); $i++){
    if($i > 0 && $i % 5 == 0) echo "\n";
    printf("%8d, ", $toInsert[$i]);
    if($i == count($toInsert) - 1) echo "\n";
}
// functions to test
$toTest = array('insert1' => null, 'insert2' => null, 'insert3' => null, 'insert4' => null);
foreach($toTest as $func => &$time){
    echo "\n\n================== Testing speed of $func() ======================\n\n";
    $tmpArr = $arr;
    $startTime = microtime(true);
    for($i = 0; $i < count($toInsert); $i++){
        $func($tmpArr, $toInsert[$i]);
    }
    $time = timeDiff($startTime, 'checkIfArraySorted');
    showTime($func, $time);
    echo "Checking if after using $func() array is still correct: \n";
    if(!checkIfArrayCorrect($tmpArr, count($arr) + count($toInsert))){
        echo "Array INCORRECT!\n\n";
    }else{
        echo "Array OK!\n\n";
    }
    echo "Few elements from begining of array:\n";
    print_r(array_slice($tmpArr, 0, 5));

    echo "Few elements from end of array:\n";
    print_r(array_slice($tmpArr, -5));

    //echo "\n================== Finished testing $func() ======================\n\n";
}
echo "\n\n================== Functions time summary    ======================\n\n";
print_r($toTest);
Run Code Online (Sandbox Code Playgroud)

结果可以在这里找到:http://ideone.com/1xQ3T

不幸的是,我在这个任务中只被评为30分中的13分(不知道它是如何计算的或者考虑到了什么).我只能假设这是因为有更好的方法在PHP中将新元素插入到排序数组中.我现在正在搜索这个话题一段时间,但找不到任何好的东西.Maby你知道更好的方法或有关该主题的一些文章吗?

我的本地主机(PHP 5.3.6-13ubuntu3.6与Suhosin-Patch,AMD Athlon(TM)II X4 620)insert2()的速度最快,但在ideone(PHP 5.2.11)insert3()上最快.有什么想法吗?我想array_splice()以某种方式调整:).

//编辑

昨天我又考虑了一下,并想出了更好的插入方法.如果你只需要排序结构和迭代它的方法而你主要关心的是插入操作的速度,那么最好的选择就是使用SplMaxHeap类.在SplMaxHeap类中,插入很快:)我修改了我的脚本以显示插入的速度.代码在这里:http://ideone.com/vfX98(ideone有php 5.2所以不会有SplMaxHeap类)

在我的localhost上,我得到的结果如下:

================== Functions time summary  ======================

           insert1() => 0.5983521938
           insert2() => 0.2605950832
           insert3() => 0.3288729191
           insert4() => 0.3288729191
SplMaxHeap::insert() => 0.0000801086
Run Code Online (Sandbox Code Playgroud)

Mad*_*iha 4

可能只是我,但也许他们也在寻找可读性和可维护性?

我的意思是,您正在命名变量$arr、 和$c$middle甚至不费心放置适当的文档。

例子:

/**
 * generateSortedArray()    Function to generate a descending sorted array
 *
 * @param int $start        Beginning with this number
 * @param int $elementsNum  Number of elements in array
 * @param int $dev          Maximum difference between elements
 * @return array            Sorted descending array.
 */
function generateSortedArray($start = 300000, $elementsNum = 10000, $dev = 30) {
    $arr = array();                             #Variable definition
    for ($i = 1; $i <= $elementsNum; $i++) {    
        $rand = mt_rand(1, $dev);               #Generate a random number
        $start -= $rand;                        #Substract from initial value
        $arr[] = $start;                        #Push to array
    }
    return $arr;
}
Run Code Online (Sandbox Code Playgroud)