Bfc*_*fcm 2 php performance phpexcel
I am using the PHPExcel library to read data from an Excel file. My file is about 5 MB, with 70 columns and 20,000 rows. The code that loads the file is:
$sheetnames = array('Classification');
$excelFile = Yii::app()->basePath . '/categories/'. $region .'.xlsx';
$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objReader->setReadDataOnly(true);
$objReader->setLoadSheetsOnly($sheetnames);
$objPHPExcel = $objReader->load($excelFile);
The Excel file has the following structure:
Title | Id | Path | Attribute 1 | Attribute 2 | ... | Attribute 65
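Loading this file takes about 6 minutes and uses far too much CPU and RAM. All I actually need is the data from the one row that has a given ID. At the moment I iterate over every row and check the ID, which is far too inefficient.
So I have two questions:
First, use a read filter to load only the ID column: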
/** Define a Read Filter class implementing PHPExcel_Reader_IReadFilter */
class SingleColumnFilter implements PHPExcel_Reader_IReadFilter
{
    private $requestedColumn;

    public function __construct($column) {
        $this->requestedColumn = $column;
    }

    public function readCell($column, $row, $worksheetName = '') {
        if ($column == $this->requestedColumn) {
            return true;
        }
        return false;
    }
}
/** Create an Instance of our Read Filter **/
$idColumnFilter = new SingleColumnFilter('B'); // Id is column B
$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objReader->setReadDataOnly(true);
$objReader->setLoadSheetsOnly($sheetnames);
/** Tell the Reader that we want to use the Read Filter **/
$objReader->setReadFilter($idColumnFilter);
/** Load only the column that matches our filter to PHPExcel **/
$objPHPExcel = $objReader->load($inputFileName);
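PHPExcel will then load data only for the cells in column B. You can search through that subset of cells for the value you want (1 column and 22,000 rows is only 22,000 cells, so memory use should be closer to 35MB than to the 2.5GB needed to load the whole file), and then use a similar filter, based on the row number, to load only the single row you have identified.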
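The 35MB and 2.5GB figures can be reproduced with a quick back-of-the-envelope calculation. The sketch below assumes roughly 1.6 KB of memory per loaded cell; that per-cell figure is an assumption chosen because it matches the numbers above, not something stated in the answer, and `estimateMemoryMb` is a hypothetical helper, not part of PHPExcel.

```php
<?php
// Assumption: about 1.6 KB of memory per loaded cell. This is a rough
// rule of thumb, not a measurement; real usage depends on the PHP build
// and on what the cells contain.
const KB_PER_CELL = 1.6;

/** Estimate memory in MB (decimal, 1 MB = 1000 KB) for a rows x columns grid. */
function estimateMemoryMb($rows, $columns, $kbPerCell = KB_PER_CELL) {
    return $rows * $columns * $kbPerCell / 1000;
}

$singleColumn = estimateMemoryMb(22000, 1);  // Id column only
$fullSheet    = estimateMemoryMb(22000, 70); // all 70 columns

printf("Column B only: %.1f MB\n", $singleColumn); // about 35 MB
printf("Whole sheet:   %.1f MB\n", $fullSheet);    // about 2.5 GB
```

Under this assumption the saving is simply the column count: loading 1 column instead of 70 needs about 1/70 of the memory.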
EDIT
The latest 1.8.1 release of PHPExcel also has a columnIterator, which should make it easier to iterate down a column looking for a specific ID value:
$found = false;
foreach ($objPHPExcel->getActiveSheet()->getColumnIterator('B') as $column) {
    $cellIterator = $column->getCellIterator();
    // Skip cells that were never loaded or have no value
    $cellIterator->setIterateOnlyExistingCells(true);
    foreach ($cellIterator as $key => $cell) {
        if ($cell->getValue() == 'ABC') {
            $found = true;
            $rowId = $cell->getRow();
            // Leave both loops as soon as the ID is found
            break 2;
        }
    }
}
EDIT #2
Once you have identified the row you need, you can reload the Excel file with a second filter, this time loading only that one row:
/** Define a Read Filter class implementing PHPExcel_Reader_IReadFilter */
class SingleRowFilter implements PHPExcel_Reader_IReadFilter
{
    private $requestedRow;

    public function __construct($row) {
        $this->requestedRow = $row;
    }

    public function readCell($column, $row, $worksheetName = '') {
        if ($row == $this->requestedRow) {
            return true;
        }
        return false;
    }
}

if ($found) {
    /** Create an Instance of our Read Filter **/
    $rowFilter = new SingleRowFilter($rowId);
    $objReader2 = PHPExcel_IOFactory::createReader('Excel2007');
    $objReader2->setReadDataOnly(true);
    $objReader2->setLoadSheetsOnly($sheetnames);
    /** Tell the Reader that we want to use the Read Filter **/
    $objReader2->setReadFilter($rowFilter);
    /** Load only the single row that matches our filter to PHPExcel **/
    $objPHPExcel2 = $objReader2->load($inputFileName);
}
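If the column titles are wanted alongside the values, one possible variation is a filter that accepts the header row as well as the identified row, so both arrive in a single load. This is only a sketch: `HeaderAndRowFilter`, `labelRow` and `loadLabelledRow` are hypothetical names, the header is assumed to be in row 1, and the stand-in interface at the top exists only so the snippet can be run without PHPExcel installed.

```php
<?php
// Stand-in so the sketch runs without PHPExcel; with the library loaded,
// the real interface is used and this block is skipped.
if (!interface_exists('PHPExcel_Reader_IReadFilter')) {
    interface PHPExcel_Reader_IReadFilter {
        public function readCell($column, $row, $worksheetName = '');
    }
}

/** Hypothetical filter: accept the header row plus one data row. */
class HeaderAndRowFilter implements PHPExcel_Reader_IReadFilter
{
    private $rows;

    public function __construct($headerRow, $dataRow) {
        $this->rows = array((int) $headerRow, (int) $dataRow);
    }

    public function readCell($column, $row, $worksheetName = '') {
        return in_array((int) $row, $this->rows, true);
    }
}

/** Pair each header title with the value in the same column of the data row. */
function labelRow(array $header, array $values) {
    // Note: duplicate titles would overwrite each other in the result.
    return array_combine($header, $values);
}

/** Load the header (assumed to be row 1) and row $rowId; requires PHPExcel. */
function loadLabelledRow($inputFileName, array $sheetnames, $rowId) {
    $reader = PHPExcel_IOFactory::createReader('Excel2007');
    $reader->setReadDataOnly(true);
    $reader->setLoadSheetsOnly($sheetnames);
    $reader->setReadFilter(new HeaderAndRowFilter(1, $rowId));
    $sheet = $reader->load($inputFileName)->getActiveSheet();

    $last = $sheet->getHighestColumn();
    // rangeToArray returns a list of rows; each range here is one row.
    $header = $sheet->rangeToArray('A1:' . $last . '1', null, false, false);
    $values = $sheet->rangeToArray('A' . $rowId . ':' . $last . $rowId, null, false, false);
    return labelRow($header[0], $values[0]);
}
```

Calling `loadLabelledRow($inputFileName, $sheetnames, $rowId)` would then return something like `array('Title' => ..., 'Id' => ..., 'Path' => ...)`, which avoids addressing the 65 attribute columns by letter.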