我有一个800mb文本文件,其中包含18,990,870行(每行是一条记录),我需要选择某些记录,如果有匹配则将它们写入数据库.
它花了一个时间来完成它们,所以我想知道是否有办法更快地完成它?
我的PHP一次读取一行,如下所示:
$fp2 = fopen('download/pricing20100714/application_price','r');
if (!$fp2) {echo 'ERROR: Unable to open file.'; exit;}
while (!feof($fp2)) {
$line = stream_get_line($fp2,128,$eoldelimiter); //use 2048 if very long lines
if ($line[0] === '#') continue; //Skip lines that start with #
$field = explode ($delimiter, $line);
list($export_date, $application_id, $retail_price, $currency_code, $storefront_id ) = explode($delimiter, $line);
if ($currency_code == 'USD' and $storefront_id == '143441'){
// does application_id exist?
$application_id = mysql_real_escape_string($application_id);
$query = "SELECT * FROM jos_mt_links WHERE link_id='$application_id';";
$res = mysql_query($query);
if (mysql_num_rows($res) > 0 ) {
echo $application_id . "application id has price of " . $retail_price . "with currency of " . $currency_code. "\n";
} // end if exists in SQL
} else
{
// no, application_id doesn't exist
} // end check for currency and storefront
} // end while statement
fclose($fp2);
Run Code Online (Sandbox Code Playgroud)
猜测,性能问题是因为它为每个带有USD和店面的application_id发出查询.
如果空间和IO不是问题,您可能只是盲目地将所有19M记录写入新的临时数据库表,添加索引然后与过滤器匹配?
归档时间: |
|
查看次数: |
537 次 |
最近记录: |