Most Common Mistakes PHP Developers Make
×

Most Common Mistakes PHP Developers Make

1325

Memory Usage Headfakes and Inefficiencies

While fetching many records at once is unquestionably more efficient than running a single query for each row to fetch, such an approach may result in an "out of memory" condition in libmysqlclient when using PHP’s mysql extension.

To validate, let’s take a look at a test box with limited resources (512MB RAM), MySQL, and php-cli.

For instance:

We'll bootstrap a database table like this:

// connect to mysql

$conn  =  new mysqli('localhost', 'username', 'password', 'dbname');

// create table of 400 columns

$qry = 'CREATE TABLE `test`(`id` INT NOT NULL PRIMARY KEY AUTO_INCREMENT';
for ($col = 0; $col < 400; $col++) {
    $qry .= ", `col$col` CHAR(10) NOT NULL";

}

$qry .= ');';
$conn->qry($qry);

// write 2 million rows

for ($row = 0; $row < 2000000; $row++) {
    $query = "INSERT INTO `test` VALUES ($row";
    for ($col = 0; $col < 400; $col++) {
        $qry .= ', ' . mt_rand(1000000000, 9999999999);
    }
    $qry .= ')';
    $conn->qry($qry);
}

OK, now let's check resources usage:

// connect to mysql
$connection = new mysqli('localhost', 'username', 'password', 'database');
echo "Before: " . memory_get_peak_usage() . "\n";

$res = $connection->query('SELECT `x`,`y` FROM `test` LIMIT 1');
echo "Limit 1: " . memory_get_peak_usage() . "\n";

$res = $connection->query('SELECT `x`,`y` FROM `test` LIMIT 10000');
echo "Limit 10000: " . memory_get_peak_usage() . "\n";

Output:

Before: 224704
Limit 1: 224704
Limit 10000: 224704


Wow, it appears that the query is successfully managed internally in terms of resources.

But, wait a minute, let’s not jump to a conclusion so soon. Just to be sure, let’s boost the limit one more time and this time lets let it to 100,000!


Well!
When we do that, we get-


| PHP Warning:  mysqli::query(): (HY000/2013):
|Lost connection to MySQL server during query in /root/test.php on line 11


So, what exactly happened?


Well, the issue is the way PHP’S MYSql module works that means it just act as a proxy for Libmysqlclient.

When a portion of data is selected, it hits the memory directly and as this memory is not managed by PHP’s manager, memory_get_peak_usage() won’t show any increase in resources utilization even when we scale the limits up in our query.


Now this is what exactly leads to the issue we just explained in above paragraph where we are duped into thinking that our memory management is fine However, our memory management is flawed, and we may encounter issues such as those described above.


We can avoid the above headfake (though it will not improve your memory utilisation) by using

the mysqlnd module. As mysqlnd is compiled as a native PHP extension and it does use PHP’s memory manager. Thus when we run the above test using mysqlnd rather than mysql, we get a much more realistic picture of our memory utilization:


Before: 232048
Limit 1: 324952
Limit 10000: 32572912


According to PHP documentation, mysql uses twice as many resources to store data as mysqlnd, so the original script that used mysql used even more memory than shown here (roughly twice as much). To solve such problems, consider limiting the size of your queries and using a loop with small number of iterations.

For instance:

$totalNumberToFetch = 10000;
$portionSize = 100;

for ($i = 0; $i <= ceil($totalNumberToFetch / $portionSize); $i++) {
    $limitFrom = $portionSize * $i;
    $res = $connection->query("SELECT `x`,`y` FROM `test` LIMIT $limitFrom, $portionSize");
}

When we consider both this PHP mistake and Performing queries in a loop, we see that there is a healthy balance that your code should strive for between being too granular and repetitive on the one hand, and having each of your individual queries be too large on the other.



Best WordPress Hosting


Share:


Discount Coupons

Get a .COM for just $6.98

Secure Domain for a Mini Price



Leave a Reply


Comments
    Waiting for your comments