PHP memory limit and garbage collector

Three days of banging my head against a wall.

I developed a PHP script to import big text files and populate a MySQL database. It works perfectly until I reach 2 million records, but I need to import about 10 million rows divided across different files.

My application scans the files in a folder, gets the file extension (I have 4 kinds of import procedures for 4 different extensions) and calls the relative import function.

I have a structure made of these classes:

CLASS SUBJECT1 {
    IMPORT_SUBJECT1() {
        // fopen($file);
        // ob_start();
        // PDO::beginTransaction();
        // WHILE (FILE) {
        //     prepared statement
        //     file reading
        //     get file line
        //     execute INSERT
        // } END WHILE
        // PDO::commit();
        // ob_clean(); or ob_flush();
        // fclose($file);
        // clearstatcache();
    }

    __DESTRUCT() { $this->childObject = null; }
}

CLASS SUBJECT2{ same as SUBJECT1;}

CLASS SUBJECT3{ same as SUBJECT1;}

CLASS SUBJECT4{ same as SUBJECT1;}

and the main class that launches the procedure:

CLASS MAIN {
    switch($ext) {
        case "ext1":
            $SUBJECT1 = new SUBJECT1();
            $SUBJECT1->IMPORT_SUBJECT1();
            unset($SUBJECT1);
            $SUBJECT1 = null;
            break;
        case "ext2": // same as case "ext1" with IMPORT_SUBJECT2();
        case "ext3": // same as case "ext1" with IMPORT_SUBJECT3();
        case "ext4": // same as case "ext1" with IMPORT_SUBJECT4();
    }
}

It works perfectly with some adjustment of the MySQL file buffers (ib_logfile0 and ib_logfile1 are set to 512 MB).

The problem is that every time a procedure terminates, PHP does not free the memory. I'm sure the destructor is called (I put an echo inside the __destruct method) and that the object is no longer accessible (var_dump says it is NULL). I have tried so many ways to free memory, but now I'm at a dead end.

I also checked gc_collect_cycles() at many different points in the code and it always reports 0 cycles, so no objects are referencing each other. I even tried removing the class structure and running all the code sequentially, but I always get this error:

Fatal error: Out of memory (allocated 511180800) (tried to allocate 576 bytes) in C:\php\index.php on line 219 (line 219 is the execute of a prepared statement, on the 13th file).
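
For what it's worth, this is roughly the kind of checkpoint I sprinkled between imports (a minimal sketch, not my exact code):

    // Log real memory usage and forced garbage collection at a checkpoint
    printf("memory: %d bytes\n", memory_get_usage(true));
    printf("collected cycles: %d\n", gc_collect_cycles());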

The memory is used in this way:

  • PHP script start: 52 MB
  • end of first file import: 110 MB
  • after destructor and unset calls: 110 MB
  • new procedure call: 110 MB
  • end of second file import: 250 MB
  • after destructor and unset calls: 250 MB
  • new procedure call: 250 MB

So as you can see, even after unsetting the objects, the memory is not freed.

I tried setting memory_limit in php.ini to 1024M, but memory grows really fast and the script crashes after 20 files.
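
(A minimal sketch of what I mean, assuming memory_limit is the relevant directive; the same limit can also be raised per script:)

    // Equivalent to memory_limit = 1024M in php.ini, for this script only
    ini_set('memory_limit', '1024M');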

Any advice?

Many thanks!

EDIT 1:

Posting the code:

class SUBJECT1 {

    public function __destruct()
    {
        echo 'destroying subject1 <br/>';
    }

    public function import_subject1($file, $par1, $par2)
    {
        global $pdo;

        $aux = new AUX();
        $log = new LOG();

        // ---------------- FILES ----------------
        $input_file = fopen($file, "r");

        // ---------------- PREPARED STATEMENTS ----------------
        $PS_insert_data1 = $pdo->prepare("INSERT INTO table (ID,PAR1,PAR2,PARN) VALUES (?,?,?,?)
            ON DUPLICATE KEY UPDATE ID = VALUES(ID), PAR1 = VALUES(PAR1), PAR2 = VALUES(PAR2), PARN = VALUES(PARN)");

        $PS_insert_data2 = $pdo->prepare("INSERT INTO table (ID,PAR1,PAR2,PARN) VALUES (?,?,?,?)
            ON DUPLICATE KEY UPDATE ID = VALUES(ID), PAR1 = VALUES(PAR1), PAR2 = VALUES(PAR2), PARN = VALUES(PARN)");

        // ---------------- IMPORT ----------------
        if ($input_file) {
            ob_start();
            $pdo->beginTransaction();

            while (($line = fgets($input_file)) !== false) {
                $line = utf8_encode($line);
                $array_line = explode("|", $line);

                // set null values where needed
                $array_line = $aux->null_value($array_line);

                if (sizeof($array_line) > 32) {
                    if (!empty($array_line[25])) {
                        // PDOStatement::execute() takes a single array of parameters
                        $PS_insert_data1->execute([$array_line[0], $array_line[1], $array_line[2], $array_line[5]]);
                    }

                    $PS_insert_data2->execute([$array_line[10], $array_line[11], $array_line[12], $array_line[15]]);
                }
            }

            $pdo->commit();
            flush();
            ob_clean();
            fclose($input_file);
            clearstatcache();
        }
    }
}

I do this iteratively for all the files in my folder; the other procedures follow the same concept. Memory still keeps increasing, and now it crashes with a white page response :-\

Personally, I would go about it slightly differently. These are the steps I would take:

  • Open a PDO connection, set PDO in Exception mode
  • Get a list of files that I want to read
  • Create a class that can utilize PDO and the list of files and perform insertions
  • Prepare the statement ONCE, utilize it many times
  • Chunk PDO transaction commits to 50 (configurable) inserts - this means that every 50th time I call $stmt->execute(), I issue a commit - which utilizes the HDD better thus making it faster
  • Read each file line by line
  • Parse the line and check if it's valid
  • If yes, add to MySQL, if not - report an error

Now, I've created two classes and an example of how I'd go about it. I tested only up to the reading part, since I don't know your DB structure or what AUX() does.

class ImportFiles
{
    protected $pdo;
    protected $stmt;
    protected $trx_flush_count = 50; // Commit the transaction every 50 iterations

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;

        $this->stmt = $this->pdo->prepare("INSERT INTO table 
                                                (ID,PAR1,PAR2,PARN)
                                            VALUES 
                                                (?,?,?,?) 
                                            ON DUPLICATE KEY UPDATE ID = VALUES(ID), PAR1 = VALUES(PAR1), PAR2 = VALUES(PAR2), PARN = VALUES(PARN)");
    }

    public function import($file)
    {
        if($this->isReadable($file))
        {
            $file = new FileParser($file);

            $this->insert($file);
        }
        else
        {
            printf("\nSpecified file is not readable: %s", $file);
        }
    }

    protected function isReadable($file)
    {
        return (is_file($file) && is_readable($file));
    }   

    protected function insert(FileParser $file)
    {
        $this->pdo->beginTransaction();

        while($file->read())
        {
            //printf("\nLine %d, value: %s", $file->getLineCount(), $file->getLine());

            $this->insertRecord($file);

            $this->flush($file);
        }

        $this->flush(null);
    }

    // Untested method - commits the open transaction every $trx_flush_count
    // inserts, and once more at the end of the file (when $file is null)
    protected function flush(FileParser $file = null)
    {
        if(!is_null($file) && ($file->getLineCount() % $this->trx_flush_count === 0))
        {
            if($this->pdo->inTransaction())
            {
                $this->pdo->commit();
            }

            $this->pdo->beginTransaction();
        }
        elseif(is_null($file) && $this->pdo->inTransaction())
        {
            $this->pdo->commit();
        }
    }   

    protected function insertRecord(FileParser $file)
    {
        $check_value = $file->getParsedLine(25);

        if(!empty($check_value))
        {
            $values = [ 
                $file->getParsedLine(0),
                $file->getParsedLine(1),
                $file->getParsedLine(2),
                $file->getParsedLine(5)
            ];
        }
        else
        {
            $values = [ 
                $file->getParsedLine(10),
                $file->getParsedLine(11),
                $file->getParsedLine(12),
                $file->getParsedLine(15)
            ];      
        }

        $this->stmt->execute($values);
    }
}

class FileParser
{
    protected $fh;
    protected $lineCount = 0;
    protected $line = null;
    protected $aux;

    public function __construct($file)
    {
        $this->fh = fopen($file, 'r');
    }

    public function read()
    {
        $this->line = fgets($this->fh);

        if($this->line !== false) $this->lineCount++;

        return $this->line;
    }

    public function getLineCount()
    {
        return $this->lineCount;
    }

    public function getLine()
    {
        return $this->line;
    }

    public function getParsedLine($index = null)
    {
        $line = $this->line;

        if(!is_null($line))
        {
            $line = utf8_encode($line);
            $array_line = explode("|", $line);

            //set null values where needed
            $aux = $this->getAUX();
            $array_line = $aux->null_value($array_line);

            if(sizeof($array_line) > 32)
            {   
                return is_null($index) ? $array_line : (isset($array_line[$index]) ? $array_line[$index] : null);
            }
            else
            {
                throw new \Exception(sprintf("Invalid array size, expected > 32 got: %s", sizeof($array_line)));
            }
        }
        else
        {
            return [];
        }
    }

    protected function getAUX()
    {
        if(is_null($this->aux))
        {
            $this->aux = new AUX();
        }

        return $this->aux;
    }
}

Usage:

$dsn = 'mysql:dbname=testdb;host=127.0.0.1';
$user = 'dbuser';
$password = 'dbpass';

try 
{
    $pdo = new PDO($dsn, $user, $password);

    $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

    $import = new ImportFiles($pdo);

    $files = ['/usr/local/file1.txt', '/usr/local/file2.txt'];

    foreach($files as $file)
    {
        $import->import($file);
    }

} catch (Exception $e) 
{
    printf("\nError: %s", $e->getMessage());
    printf("\nFile: %s", $e->getFile());
    printf("\nLine: %s", $e->getLine());
}

SOLVED:

I took this approach; maybe it's useful for someone who has a similar problem:

I opened the Task Manager and watched the memory usage of the Apache and MySQL processes in these cases:

  • Tried to read and process the files without calling the MySQL procedures (memory usage was OK)
  • Tried to read, process and insert into the DB only the files of one extension at a time (all .ext1, then all .ext2, ...)
  • Debugged the procedure with the big memory increase, isolating functions one by one until I found the problematic one
  • Found the problem and solved it

The problem was that I called a function passing the prepared statement as a parameter. I thought that, once prepared, it was just a "static" object to call. What happens is that if you pass the same prepared statement into a function, the memory grows exponentially.
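
To illustrate, a simplified sketch with made-up names (not my exact code):

    // What leaked for me: handing the prepared statement to a helper on every line
    function insert_row(PDOStatement $stmt, array $row)
    {
        $stmt->execute([$row[0], $row[1], $row[2], $row[5]]);
    }

    // What fixed it: keep the statement in the same scope (or in an object
    // property) and call execute() on it directly inside the loop
    $stmt = $pdo->prepare("INSERT INTO table (ID,PAR1,PAR2,PARN) VALUES (?,?,?,?)");
    while (($line = fgets($input_file)) !== false) {
        $row = explode("|", utf8_encode($line));
        $stmt->execute([$row[0], $row[1], $row[2], $row[5]]);
    }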

Hope this helps someone.

Bye!