I've just found something very strange in PHP.

If I pass in a variable to a function by reference, and then call a function on it, it's incredibly slow.

If you loop over the inner function call and the variable is large it can be many orders of magnitude slower than if the variable is passed by value.

Example:

<?php
function TestCount(&$aArray)
{
    $aArray = range(0, 100000);
    $fStartTime = microtime(true);

    for ($iIter = 0; $iIter < 1000; $iIter++)
    {
        $iCount = count($aArray);
    }

    $fTaken = microtime(true) - $fStartTime;

    print "took $fTaken seconds\n";
}

$aArray = array();
TestCount($aArray);
?>

This consistently takes about 20 seconds to run on my machine (on PHP 5.3).

But if I change the function to pass by value (i.e. function TestCount($aArray) instead of function TestCount(&$aArray)), it runs in about 2 ms, roughly 10,000 times faster!

The same is true for other built-in functions such as strlen, and for user-defined functions.

What's going on?

John Carter
  • It is 10000 times slower because you are iterating inside the benchmark. This won't give you correct measure for `count()`. Use a profiler and you will see it's roughly 3 times slower only. For an explanation, see http://derickrethans.nl/talks/phparch-php-variables-article.pdf – Gordon Jun 25 '10 at 11:54
  • @Gordon - yes, true, but the reason we found this is that we had some production code that behaved very similarly to the example (changing the variable of course). It's not like it's a particularly esoteric use case. – John Carter Jun 25 '10 at 12:25
  • not saying it's esoteric. just saying the numbers are greatly exaggerated. – Gordon Jun 25 '10 at 12:51
  • @Gordon - I've edited the question a bit to mention looping over the inner function. – John Carter Jun 26 '10 at 14:51

3 Answers

I found a bug report from 2005 that describes exactly this issue: http://bugs.php.net/bug.php?id=34540

So the problem seems to be that when a variable that is a reference is passed to a function that takes its parameter by value, PHP needs to copy the whole value, and it does so on every call.

This can be demonstrated with this test code:

<?php
function CalledFunc(&$aData)
{
    // Do nothing
}

function TestFunc(&$aArray)
{
    $aArray = range(0, 100000);
    $fStartTime = microtime(true);

    for ($iIter = 0; $iIter < 1000; $iIter++)
    {
        CalledFunc($aArray);
    }

    $fTaken = microtime(true) - $fStartTime;

    print "took $fTaken seconds\n";
}

$aArray = array();
TestFunc($aArray);
?>

This runs quickly, but if you change function CalledFunc(&$aData) to function CalledFunc($aData) you'll see a similar slow-down to the count example.

This is rather worrying, since I've been coding PHP for quite a while and I had no idea about this issue.

Fortunately there's a simple workaround that is applicable in many cases - use a temporary local variable inside the loop, and copy to the reference variable at the end.
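
A minimal sketch of that workaround, written for the PHP 5.x behaviour described above. The function name appendItems and its parameters are hypothetical, made up for illustration; the point is only the copy-in, work, copy-out pattern:

```php
<?php
// Hypothetical helper: appends $n values to $items, calling count() on each pass.
function appendItems(array &$items, $n)
{
    // Copy once up front; $local is a plain (non-reference) variable.
    $local = $items;

    for ($i = 0; $i < $n; $i++) {
        // count() takes its argument by value, but $local is not a
        // reference, so no per-call copy of the array is needed.
        $local[] = count($local);
    }

    // Write the result back through the reference at the end.
    $items = $local;
}

$data = array(10, 20);
appendItems($data, 3);
print implode(',', $data) . "\n"; // prints 10,20,2,3,4
```

The cost is one copy before the loop and one assignment after it, instead of a copy on every iteration.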

John Carter
  • true, but I think the behaviour (PHP clones the array) is correct and reasonable since we don't want the function that accepts the array as value to modify the original array if it is not cloned. can't live without it. maybe what we as programmers can do is to look out on this scenario and avoid it. – Lukman Jun 25 '10 at 13:53
  • It's indeed completely broken . just ran into that bug 5 minutes ago. w/e i'll just byref the strlen as well -- – Morg. Oct 01 '13 at 09:53

Building on the answer already given, you can largely avoid this issue by forcing the copy once before the iterative work (and copying back afterwards if the data has changed).

<?php
function TestCountNon($aArray)
{
    $aArray = range(0, 100000);
    $fStartTime = microtime(true);
    for ($iIter = 0; $iIter < 1000; $iIter++)
    {
        $iCount = count($aArray);
    }
    $fTaken = microtime(true) - $fStartTime;

    print "Non took $fTaken seconds\n<br>";
}

function TestCount(&$aArray)
{
    $aArray = range(0, 100000);
    $fStartTime = microtime(true);
    for ($iIter = 0; $iIter < 1000; $iIter++)
    {
        $iCount = count($aArray);
    }
    $fTaken = microtime(true) - $fStartTime;

    print "took $fTaken seconds\n<br>";
}

function TestCountA(&$aArray)
{
    $aArray = range(0, 100000);
    $fStartTime = microtime(true);
    $bArray = $aArray;
    for ($iIter = 0; $iIter < 1000; $iIter++)
    {
        $iCount = count($bArray);
    }
    $aArray = $bArray;
    $fTaken = microtime(true) - $fStartTime;

    print "A took $fTaken seconds\n<br>";
}

$nonArray = array();
TestCountNon($nonArray);

$aArray = array();
TestCount($aArray);

$bArray = array();
TestCountA($bArray);
?>

Results are:

Non took 0.00090217590332031 seconds 
took 17.676940917969 seconds 
A took 0.04144287109375 seconds 

Not quite as good, but a damn lot better.

Dan McGrath

This is no longer an issue (tested on PHP 7.4.0):

0.083219051 seconds (no ref)
0.090487003 seconds (ref)
0.091565132 seconds (ref+copy)

(Measured with slightly larger arrays and 10000000 iterations.)
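
The script behind these numbers isn't shown, so the following is only a sketch of how the three variants could be compared. The function names are hypothetical, and the array size and iteration count are taken from the question rather than from this test; adjust them as needed:

```php
<?php
// By value: the baseline case.
function countByValue(array $a, $iters)
{
    $c = 0;
    for ($i = 0; $i < $iters; $i++) {
        $c = count($a);
    }
    return $c;
}

// By reference: the slow case on PHP 5.x.
function countByRef(array &$a, $iters)
{
    $c = 0;
    for ($i = 0; $i < $iters; $i++) {
        $c = count($a);
    }
    return $c;
}

// By reference, but working on a local copy inside the loop.
function countRefCopy(array &$a, $iters)
{
    $b = $a;
    $c = 0;
    for ($i = 0; $i < $iters; $i++) {
        $c = count($b);
    }
    $a = $b;
    return $c;
}

$iters = 1000;
foreach (array('countByValue', 'countByRef', 'countRefCopy') as $fn) {
    $data = range(0, 100000);   // fresh array for each variant
    $start = microtime(true);
    $result = $fn($data, $iters);
    printf("%-14s %.6f seconds (count = %d)\n", $fn, microtime(true) - $start, $result);
}
```

Timings will vary by machine and PHP version, so compare the three lines against each other rather than against the figures above.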

theking2