Using PHP and MySQL: how do I free memory?

Ete*_*ine 5 php mysql performance memory-leaks database-performance

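Requirement:

We have two similar tables on two servers. Table1, on the first server, has a unique key on columns A, B, C, and we insert its rows into Table2, which has a unique key on columns B, C, D.

Table1 has about 5 million rows. Because the unique key constraints differ, about 3 million rows will end up being inserted into Table2.

The requirement is to fetch every row from Table1 and insert it into Table2 if no matching record exists there. If a record matches, increment the count and update the 'cron_modified_date' column in Table2.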

For this setup, the PHP version is 5.5, the MySQL version is 5.7, and the DB server has 6 GB of RAM.

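When I run the script below, processing becomes very slow after about 2 million records and RAM is not released. After some time the script has consumed all the RAM, and from then on it stops processing altogether.

As you can see, I am resetting variables and closing the DB connections, but this does not free RAM on the DB server. After some reading I learned that PHP garbage collection may need to be invoked manually to release resources, but that did not free the RAM either.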
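One way to check whether PHP itself is holding the memory is to measure the script's own usage around a batch. The following is a minimal sketch, not taken from the original script: `makeBatch` is a hypothetical stand-in for one fetched batch. Note that `memory_get_usage()` reports memory held by the PHP process only, so it says nothing about RAM used by the MySQL server, and `gc_collect_cycles()` only reclaims values caught in reference cycles.

```php
<?php
// Hypothetical stand-in for one fetched batch; the real rows come from TABLE1.
function makeBatch($size)
{
    $rows = array();
    for ($i = 0; $i < $size; $i++) {
        $rows[] = array('Name' => "name-$i", 'Company' => "company-$i");
    }
    return $rows;
}

$before = memory_get_usage();
$batch = makeBatch(20000);
$held = memory_get_usage();

// Dropping the last reference frees the array right away; no GC call is needed.
$batch = null;
$after = memory_get_usage();

// gc_collect_cycles() returns the number of cyclic garbage values it collected.
$collected = gc_collect_cycles();

printf("before=%d held=%d after=%d cycles=%d\n", $before, $held, $after, $collected);
```

If the "after" figure drops back close to the "before" figure on every iteration, the PHP side is probably not the leak, and the growth is more likely on the database server.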

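What am I doing wrong here, and how can I process millions of records with PHP and MySQL?

Is there any other way to free RAM while the script is running, so that it can run to completion?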

/* Fetch the record count for batch processing */

$queryCount = "SELECT COUNT(*) AS totalRecords FROM TABLE1 WHERE created_date >= '2018-02-10'";
$rowsCount = $GLOBALS['db']->execRaw($queryCount)->fetchAll();

$recordsPerIteration = 50000;
$totalCount = $rowsCount[0]['totalRecords'];
$start = 0;

gc_disable();
while ($totalCount > 0) {
    // Read the next batch of rows from TABLE1
    $query = "SELECT * FROM TABLE1
              WHERE created_date >= '2018-02-10'
              ORDER BY suggestion_id DESC
              LIMIT " . $start . "," . $recordsPerIteration;

    print "sql is $query";

    $getAllRows = $GLOBALS['db']->execRaw($query)->fetchAll();
    $GLOBALS['db']->queryString = null;
    $GLOBALS['db']->close();

    // One INSERT per row: this is the slow part of the script
    foreach ($getAllRows as $getRow) {
        $insertRow = "INSERT INTO TABLE2 (
                          Name,
                          Company,
                          ProductName,
                          Status,
                          cron_modified_date)
                      VALUES (
                          " . $GLOBALS['db_ab']->quote($getRow['Name']) . ",
                          " . $GLOBALS['db_ab']->quote($getRow['Company']) . ",
                          " . $GLOBALS['db_ab']->quote($getRow['ProductName']) . ",
                          " . $getRow['Status'] . ",
                          " . $GLOBALS['db_ab']->quote($getRow['created_date']) . "
                      )
                      ON DUPLICATE KEY UPDATE count = (count + 1), cron_modified_date = " . $GLOBALS['db_ab']->quote($getRow['created_date']);

        $GLOBALS['db_ab']->execRaw($insertRow);
        $GLOBALS['db_ab']->queryString = null;
        $getRow = null;
        $insertRow = null;
        $GLOBALS['db_ab']->close();
    }

    gc_enable();
    $totalCount -= $recordsPerIteration;
    $start += $recordsPerIteration;
    $getAllRows = null;
    gc_collect_cycles();
}
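One possible contributor to the slowdown, not confirmed anywhere in this thread, is the `LIMIT offset, count` paging: MySQL still has to read and discard the first `offset` rows, so later batches tend to cost more than earlier ones. A minimal sketch of the alternative, paging by the last seen key, is below. It assumes `suggestion_id` is an indexed, unique integer column, and `buildBatchQuery` is a hypothetical helper name.

```php
<?php
// Builds the SELECT for one batch using a "seek" condition instead of an offset.
// Assumes suggestion_id is an indexed, unique integer column (not confirmed by the post).
function buildBatchQuery($lastSeenId, $limit)
{
    $sql = "SELECT * FROM TABLE1 WHERE created_date >= '2018-02-10'";
    if ($lastSeenId !== null) {
        // Continue strictly below the last id of the previous batch.
        $sql .= " AND suggestion_id < " . (int) $lastSeenId;
    }
    return $sql . " ORDER BY suggestion_id DESC LIMIT " . (int) $limit;
}

// First batch has no lower bound; later batches pass the smallest id seen so far.
echo buildBatchQuery(null, 50000), "\n";
echo buildBatchQuery(4200000, 50000), "\n";
```

After each batch, the smallest `suggestion_id` in that batch would be passed as `$lastSeenId` for the next call, and the loop ends when a batch comes back empty.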


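After the suggestions from @ABelikov and some trial and error, the code below finally works well and frees RAM after every 50K-record insert.

Main findings:

  • After each major operation that involves a large amount of data, release the DB connection variable and reconnect to the DB so that the DB buffers are flushed.
  • Batch the insert statements and run the insert in one go. Do not execute single-record inserts inside a loop.

Thanks, everyone, for your valuable suggestions and help.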

    /* Fetch the record count for batch insert */

    $queryCount = "SELECT COUNT(*) AS totalRecords FROM TABLE1 WHERE created_date >= '2018-02-10'";
    $rowsCount = $GLOBALS['db']->execRaw($queryCount)->fetchAll();

    $recordsPerIteration = 50000;
    $totalCount = $rowsCount[0]['totalRecords'];
    $start = 0;

    while ($totalCount > 0) {
        $query = "SELECT * FROM TABLE1
                  WHERE created_date >= '2018-02-10'
                  ORDER BY suggestion_id DESC
                  LIMIT " . $start . "," . $recordsPerIteration;

        print "sql is $query";

        $getAllRows = $GLOBALS['db']->execRaw($query)->fetchAll();
        $GLOBALS['db']->queryString = null;
        $GLOBALS['db']->close();

        // Build one multi-row INSERT for the whole batch
        $insertRow = "INSERT INTO TABLE2 (
                          Name,
                          Company,
                          ProductName,
                          Status,
                          cron_modified_date)
                      VALUES ";

        foreach ($getAllRows as $getRow) {
            $insertRow .= "(" . $GLOBALS['db_ab']->quote($getRow['Name']) . ",
                          " . $GLOBALS['db_ab']->quote($getRow['Company']) . ",
                          " . $GLOBALS['db_ab']->quote($getRow['ProductName']) . ",
                          " . $getRow['Status'] . ",
                          " . $GLOBALS['db_ab']->quote($getRow['created_date']) . "),";
        }

        $insertRow = rtrim($insertRow, ','); // Remove the trailing ','
        // VALUES(col) takes the date from each incoming row, not from the last row of the loop
        $insertRow .= " ON DUPLICATE KEY UPDATE count = (count + 1), cron_modified_date = VALUES(cron_modified_date)";

        $GLOBALS['db_ab']->execRaw($insertRow);

        // Drop both connections and the batch data to free up RAM.
        // Both connections must be re-established before the next iteration
        // (the reconnect code is not shown in the original post).
        $GLOBALS['db_ab'] = null;
        $GLOBALS['db'] = null;
        $insertRow = null;
        $getAllRows = null;

        $totalCount -= $recordsPerIteration;
        $start += $recordsPerIteration;
        print "\n Records left to process " . $totalCount . "\n";
    }
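The batching idea above can also be written with placeholders instead of string concatenation. This is a minimal sketch under stated assumptions: `buildUpsert` is a hypothetical helper, the column names are taken from the post, and execution through a plain PDO handle is assumed because the `execRaw` wrapper used in the post is not shown.

```php
<?php
// Builds one multi-row upsert with "?" placeholders for a chunk of rows.
// Returns array($sql, $params); $sql is null when the chunk is empty.
function buildUpsert(array $rows)
{
    if (count($rows) === 0) {
        return array(null, array());
    }
    $columns = array('Name', 'Company', 'ProductName', 'Status', 'cron_modified_date');
    $tuples = array();
    $params = array();
    foreach ($rows as $row) {
        $tuples[] = '(?, ?, ?, ?, ?)';
        // cron_modified_date is filled from created_date, as in the post.
        array_push($params, $row['Name'], $row['Company'], $row['ProductName'],
            $row['Status'], $row['created_date']);
    }
    $sql = 'INSERT INTO TABLE2 (' . implode(', ', $columns) . ') VALUES '
        . implode(', ', $tuples)
        . ' ON DUPLICATE KEY UPDATE count = count + 1,'
        . ' cron_modified_date = VALUES(cron_modified_date)';
    return array($sql, $params);
}

// Example: split a fetched batch into chunks of 2 rows (a real run would use a larger size).
$batch = array(
    array('Name' => 'a', 'Company' => 'x', 'ProductName' => 'p1', 'Status' => 1, 'created_date' => '2018-02-11'),
    array('Name' => 'b', 'Company' => 'y', 'ProductName' => 'p2', 'Status' => 0, 'created_date' => '2018-02-12'),
    array('Name' => 'c', 'Company' => 'z', 'ProductName' => 'p3', 'Status' => 1, 'created_date' => '2018-02-13'),
);
foreach (array_chunk($batch, 2) as $chunk) {
    list($sql, $params) = buildUpsert($chunk);
    // With a PDO handle (assumed): $pdo->prepare($sql)->execute($params);
    echo $sql, ' [', count($params), " params]\n";
}
```

Splitting a 50K-row batch into smaller chunks also keeps each statement below the server's `max_allowed_packet` limit, which a single very large INSERT can exceed.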
    

ABe*_*kov 1

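1. Multi-row insert solution

You can speed up the script by using a multi-row INSERT; see https://dev.mysql.com/doc/refman/5.5/en/insert.html

INSERT INTO tbl_name (a,b,c) VALUES(1,2,3),(4,5,6),(7,8,9);

You only need to keep the VALUES part inside the foreach and move everything else out of it.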

 // Build the statement prefix once, outside the loop
 $insertRow = "INSERT INTO TABLE2 (
                   Name,
                   Company,
                   ProductName,
                   Status,
                   cron_modified_date) VALUES ";
 // Append one value tuple per row
 foreach ($getAllRows as $getRow) {
     $insertRow .= "(" . $GLOBALS['db_ab']->quote($getRow['Name']) . ",
                   " . $GLOBALS['db_ab']->quote($getRow['Company']) . ",
                   " . $GLOBALS['db_ab']->quote($getRow['ProductName']) . ",
                   " . $getRow['Status'] . ",
                   " . $GLOBALS['db_ab']->quote($getRow['created_date']) . "),";
 }
 $insertRow = rtrim($insertRow, ','); // Remove the trailing ','
 // VALUES(col) takes the date from each incoming row, not from the last row of the loop
 $insertRow .= " ON DUPLICATE KEY UPDATE count = (count + 1), cron_modified_date = VALUES(cron_modified_date)";
 $GLOBALS['db_ab']->execRaw($insertRow);
 $GLOBALS['db_ab']->queryString = null;
 $getRow = null;
 $insertRow = null;
 $GLOBALS['db_ab']->close();

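This only helps if the foreach body usually runs more than once.

2. MySQL server-side solution

Try using transactions: https://dev.mysql.com/doc/refman/5.7/en/commit.html http://php.net/manual/en/pdo.begintransaction.php

Just begin the transaction at the start of the script and commit at the end. Depending on your server, it can really help. But be careful! The effect depends on your MySQL server configuration, so it needs testing.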
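As a rough illustration of the transaction suggestion, here is a minimal sketch that assumes a plain PDO connection rather than the `execRaw` wrapper from the question. `inTransaction` is a hypothetical helper name, and the DSN and credentials in the usage comment are placeholders.

```php
<?php
// Runs $work inside a transaction: commit on success, roll back on any Exception.
function inTransaction(PDO $pdo, $work)
{
    $pdo->beginTransaction();
    try {
        $result = call_user_func($work, $pdo);
        $pdo->commit();
        return $result;
    } catch (Exception $e) {
        $pdo->rollBack();
        throw $e;
    }
}

// Usage sketch (the DSN and credentials are placeholders, not from the post):
// $pdo = new PDO('mysql:host=localhost;dbname=example', 'user', 'secret');
// $pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
// inTransaction($pdo, function (PDO $pdo) {
//     $pdo->exec("UPDATE TABLE2 SET count = count + 1 WHERE Name = 'example'");
// });
```

The answer wraps the whole script in one transaction. Committing once per batch instead may be a gentler starting point, since a single transaction covering millions of rows can make the server keep a lot of undo data; that variation is a suggestion here, not something the thread tested.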