
Huge memory leak when using CommandPool #1699

Closed
html5maker opened this issue Dec 24, 2018 · 5 comments
Labels
duplicate This issue is a duplicate.

Comments

@html5maker

CommandPool is using too much memory (memory leak).
Tested with official Google php docker image (7.0, 7.1, 7.2)

Steps to reproduce

<?php
    if (!shell_exec('which composer')) {
        passthru('curl https://getcomposer.org/composer.phar > /usr/bin/composer && chmod a+x /usr/bin/composer');
        passthru('apt update && apt install -y git');
    }

    passthru('composer require aws/aws-sdk-php');

    require_once 'vendor/autoload.php';

    $mem1 = meminfo();

    $s3_client = new \Aws\S3\S3Client([
        'credentials' => [
            'key' => 'XXXXXXXXXXXXXXXX',
            'secret' => 'XXXXXXXXXXXXXXXX',
        ],
        'region' => 'XXXXXXXXXXXXXXXX',
        'version' => '2006-03-01'
    ]);

    $commands = [];
    foreach (range(1, 100) as $tmp) {
        $commands[] = $s3_client->getCommand('PutObject', [
            'Bucket' => 'XXXXXXXXXXXXXXXX',
            'Key' => "tmp/$tmp.txt", // assumed; the original snippet omitted the Key parameter
            'Body' => str_random(1024),
            'ContentType' => 'text/plain'
        ]);
    }

    $errors = [];
    $pool = new \Aws\CommandPool($s3_client, $commands, [
        'concurrency' => 10,
        'fulfilled' => function (\Aws\ResultInterface $result, $index) {
        },
        'rejected' => function ($error, $index) use (&$errors) {
            $errors[] = ['error' => $error, 'index' => $index];
        }
    ]);
    $pool->promise()->wait();

    $mem2 = meminfo();
    echo sprintf("consumed memory: %s\n", format_bytes($mem1['free'] - $mem2['free']));

    function meminfo()
    {
        preg_match('/^MemFree:\s*(\d+)/m', file_get_contents('/proc/meminfo'), $free);
        preg_match('/^MemAvailable:\s*(\d+)/m', file_get_contents('/proc/meminfo'), $available);
        $free = $free[1]*1024;
        $available = $available[1]*1024;
        return compact('free', 'available');
    }

    function format_bytes($bytes)
    {
        if ($bytes < 1000) {
            return number_format($bytes, 2);
        }
        $kilo = $bytes/1024;
        if ($kilo < 1000) {
            return number_format($kilo, 2) . 'K';
        }
        $mega = $kilo/1024;
        if ($mega < 1000) {
            return number_format($mega, 2) . 'M';
        }
        $giga = $mega/1024;
        if ($giga < 1000) {
            return number_format($giga, 2) . 'G';
        }
        return number_format($giga/1024, 2) . 'T';
    }

--- end ---

$ docker run --rm -v $PWD/index.php:/app/index.php:ro gcr.io/google-appengine/php70 php /app/index.php
[...]
consumed memory: 1.94G

$ docker run --rm -v $PWD/index.php:/app/index.php:ro gcr.io/google-appengine/php71 php /app/index.php
[...]
consumed memory: 1.80G

$ docker run --rm -v $PWD/index.php:/app/index.php:ro gcr.io/google-appengine/php72 php /app/index.php
[...]
consumed memory: 1.93G
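As an aside, the format_bytes() helper in the script above divides by 1024 but switches units at 1000, so values just under a binary unit boundary still print in the smaller unit (e.g. 1,000,000 bytes prints as 976.56K). A quick check of that behavior, with the helper reproduced verbatim from the script:

```php
<?php
// Reproduced from the reproduction script: unit thresholds are 1000,
// but the divisor between units is 1024.
function format_bytes($bytes)
{
    if ($bytes < 1000) {
        return number_format($bytes, 2);
    }
    $kilo = $bytes / 1024;
    if ($kilo < 1000) {
        return number_format($kilo, 2) . 'K';
    }
    $mega = $kilo / 1024;
    if ($mega < 1000) {
        return number_format($mega, 2) . 'M';
    }
    $giga = $mega / 1024;
    if ($giga < 1000) {
        return number_format($giga, 2) . 'G';
    }
    return number_format($giga / 1024, 2) . 'T';
}

echo format_bytes(999), "\n";     // 999.00
echo format_bytes(1024), "\n";    // 1.00K
echo format_bytes(1000000), "\n"; // 976.56K
```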
@diehlaws diehlaws self-assigned this Jan 4, 2019
@diehlaws diehlaws added the guidance Question that needs advice or information. label Jan 5, 2019
@diehlaws
Contributor

diehlaws commented Jan 5, 2019

Hi @html5maker, thanks for reaching out to us. Unfortunately I haven't been able to reproduce the behavior you're describing. I replaced str_random(1024) in your code snippet with random_bytes(1024) to avoid the need for Laravel, and ran that on Docker containers using the images you mention. I'm seeing around 200MB of memory usage for these. I also tested increasing the amount of PutObject commands to 1,000 and 10,000 to check for a significant increase in memory usage as the number of commands being added to the CommandPool increases, and only saw a slight increase that does not appear to correlate to any memory leak.

$ docker run --rm -v $PWD:/app/:ro gcr.io/google-appengine/php70 php /app/index100.php
consumed memory: 203.39M

$ docker run --rm -v $PWD:/app/:ro gcr.io/google-appengine/php70 php /app/index1000.php
consumed memory: 209.38M

$ docker run --rm -v $PWD:/app/:ro gcr.io/google-appengine/php70 php /app/index10000.php
consumed memory: 225.52M

$ docker run --rm -v $PWD:/app/:ro gcr.io/google-appengine/php71 php /app/index100.php
consumed memory: 204.00M

$ docker run --rm -v $PWD:/app/:ro gcr.io/google-appengine/php71 php /app/index1000.php
consumed memory: 210.73M

$ docker run --rm -v $PWD:/app/:ro gcr.io/google-appengine/php71 php /app/index10000.php
consumed memory: 253.21M

$ docker run --rm -v $PWD:/app/:ro gcr.io/google-appengine/php72 php /app/index100.php 
consumed memory: 203.57M

$ docker run --rm -v $PWD:/app/:ro gcr.io/google-appengine/php72 php /app/index1000.php
consumed memory: 207.53M

$ docker run --rm -v $PWD:/app/:ro gcr.io/google-appengine/php72 php /app/index10000.php
consumed memory: 240.33M

In addition to this I configured Laravel on a custom Docker container to use str_random instead of random_bytes for the sake of testing this to completion and saw that it only used 15MB of memory:

# php artisan php1699hundred
consumed memory: 15.53M

# php artisan php1699thousand
consumed memory: 15.39M

# php artisan php1699tenthou
consumed memory: 14.84M

Is it possible that there's something else in your environment that could be affecting the reported memory usage?

@diehlaws diehlaws added the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jan 5, 2019
@vbarbarosh

The problem still exists, but it seems to be in guzzlehttp/guzzle. It happens when using a high concurrency value (e.g. 50), and only in the google-appengine/php* images.

GoogleCloudPlatform/php-docker#468

You can reproduce it in the following way:

$ cat > index.php
<?php

if (!shell_exec('which composer')) {
    passthru('curl https://getcomposer.org/composer.phar > /usr/bin/composer && chmod a+x /usr/bin/composer');
    passthru('apt update && apt install -y git');
}

passthru('composer require aws/aws-sdk-php');

require_once 'vendor/autoload.php';

$mem1 = meminfo();

$s3_client = new \Aws\S3\S3Client([
    'credentials' => [
        'key' => 'xxxxxxxxxx',
        'secret' => 'xxxxxxxxxx',
    ],
    'region' => 'us-east-1',
    'version' => '2006-03-01'
]);

$commands = [];
foreach (range(1, 200) as $tmp) {
    $commands[] = $s3_client->getCommand('PutObject', [
        'Bucket' => 'xxxxxxxxxx',
        'Key' => "tmp/guzzle-issue/$tmp.txt",
        'Body' => $tmp,
        'ContentType' => 'text/plain'
    ]);
}

$pool = new \Aws\CommandPool($s3_client, $commands, ['concurrency' => 100]);
$pool->promise()->wait();

$mem2 = meminfo();
echo sprintf("consumed memory: %s\n", format_bytes($mem1['free'] - $mem2['free']));

function meminfo()
{
    preg_match('/^MemFree:\s*(\d+)/m', file_get_contents('/proc/meminfo'), $free);
    preg_match('/^MemAvailable:\s*(\d+)/m', file_get_contents('/proc/meminfo'), $available);
    $free = $free[1]*1024;
    $available = $available[1]*1024;
    return compact('free', 'available');
}

function format_bytes($bytes)
{
    if ($bytes < 1000) {
        return number_format($bytes, 2);
    }
    $kilo = $bytes/1024;
    if ($kilo < 1000) {
        return number_format($kilo, 2) . 'K';
    }
    $mega = $kilo/1024;
    if ($mega < 1000) {
        return number_format($mega, 2) . 'M';
    }
    $giga = $mega/1024;
    if ($giga < 1000) {
        return number_format($giga, 2) . 'G';
    }
    return number_format($giga/1024, 2) . 'T';
}

---end---

$ docker run --rm -v $PWD/index.php:/app/index.php:ro gcr.io/google-appengine/php70 php /app/index.php
[...]
consumed memory: 1.53G

$ docker run --rm -v $PWD/index.php:/app/index.php:ro gcr.io/google-appengine/php71 php /app/index.php
[...]
consumed memory: 1.83G

$ docker run --rm -v $PWD/index.php:/app/index.php:ro gcr.io/google-appengine/php72 php /app/index.php
[...]
consumed memory: 1.81G

$ docker run --rm -v $PWD/index.php:/app/index.php:ro php:7.0 php /app/index.php
[...]
consumed memory: 107.22M

$ docker run --rm -v $PWD/index.php:/app/index.php:ro php:7.1 php /app/index.php
[...]
consumed memory: 109.87M

$ docker run --rm -v $PWD/index.php:/app/index.php:ro php:7.2 php /app/index.php
[...]
consumed memory: 103.11M

$ docker run --rm -v $PWD/index.php:/app/index.php:ro php:7.3 php /app/index.php
[...]
consumed memory: 108.81M
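One caveat about the numbers above: the meminfo() helper reads MemFree from /proc/meminfo, which is system-wide free memory and is also affected by the kernel page cache and anything else running in the container, so it can overstate what the PHP process itself holds. A minimal sketch of measuring the PHP heap directly (the 5 MB string is just an illustrative payload, not part of the original scripts):

```php
<?php
// Measure the PHP process heap rather than system-wide free memory.
$before = memory_get_usage();                // bytes currently used by PHP
$payload = str_repeat('a', 5 * 1024 * 1024); // illustrative 5 MB allocation
$peak = memory_get_peak_usage();             // high-water mark in bytes

printf("heap grew by at least: %.2fM\n", ($peak - $before) / (1024 * 1024));
```

memory_get_peak_usage() is per-process, so it would distinguish an SDK-level leak from image-level differences such as page-cache behavior.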

@diehlaws diehlaws removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Jan 17, 2019
@diehlaws diehlaws added duplicate This issue is a duplicate. and removed guidance Question that needs advice or information. labels Mar 15, 2019
@diehlaws
Contributor

@html5maker @vbarbarosh We are investigating improvements in memory management of the AWS SDK for PHP and will be tracking progress on this in #1273. Please don't hesitate to contribute to the discussion on that issue with your findings relating to this behavior.

@diehlaws diehlaws removed their assignment Aug 26, 2020
@aknowicki

I experienced similar issues and hit the memory limit. However, when I split the work into chunks of 1,000 commands, it executed smoothly (around 300MB peak memory usage instead of 3GB):

foreach (array_chunk($batchAll, 1000) as $batch) {
    $results = CommandPool::batch($s3, $batch, [
        'concurrency' => 4,
        'before' => function (CommandInterface $cmd, $iterKey) {
            // Force garbage collection before each command is sent
            gc_collect_cycles();
        }
    ]);
}
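The key to this workaround is array_chunk(), which bounds how many commands (and their pending promises and results) are alive at once: only one batch's worth of state survives between loop iterations. A minimal illustration with a hypothetical queue of 2,500 items standing in for the commands:

```php
<?php
// Splitting a large command queue into fixed-size batches caps peak memory:
// each CommandPool::batch() call only ever sees one batch.
$batchAll = range(1, 2500);              // stand-in for 2,500 queued commands
$batches  = array_chunk($batchAll, 1000);

foreach ($batches as $i => $batch) {
    printf("batch %d: %d commands\n", $i, count($batch));
}
// batch 0: 1000 commands
// batch 1: 1000 commands
// batch 2: 500 commands
```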

@kaktusas2598

Can we reopen this one? We have seen memory usage nearly double when fetching S3 objects using CommandPool after upgrading from PHP 8.0 to 8.1. This is really not acceptable in a production system.
