Mus*_*afa 7 php html-table html-parsing simple-html-dom
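I have this table that I want to load into a multidimensional array. The problem is that the table uses rowspan values, so each row may have a different number of cells. I therefore have to remove the rowspans and add empty values in place of those cells.
This is the table I have (the original file has 5k rows).
I have to pad the table like this in order to get a proper array.
Removing the colspan value in the first row is easy. But removing the rowspans with my current method sometimes produces extra values in the array.
My current PHP file: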
<?php
ini_set('display_errors', true);
ini_set('mbstring.internal_encoding','UTF-8');
ini_set("memory_limit", "1024M");
ini_set('max_execution_time', 300);
include('simple_html_dom.php');
// Create a DOM object
$html = new simple_html_dom();
$html->load_file('stok.html');
$table = array();
$kac = array();
foreach($html->find('tr') as $row) {
    $satir = array();
    $j = 0;
    foreach($row->find('td') as $element) {
        if($kac[$j]['deger']>0){
            $satir[]='';
            $kac[$j]['deger']=$kac[$j]['deger']-1;
            $j++;
            while($kac[$j]['deger']>0){
                $satir[]='';
                $kac[$j]['deger']=$kac[$j]['deger']-1;
                $j++;
            }
        }else{
            $j++;
            if(isset($element->rowspan)){
                $kac[$j]['deger']=($element->rowspan)-1;
            }
            $satir[] = str_replace('&nbsp;', '', strip_tags($element->innertext));
        }
        if(isset($element->colspan)){
            $sayi=($element->colspan)-1;
            for($i=1;$i<=$sayi;$i++){
                $satir[] = '';
            }
        }
    }
    $table[] = $satir;
}
echo '<pre>';
print_r($table);
echo '</pre>';
?>
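Sample of my current output: (note that some of the row arrays end at index 20, some at index 23 and some at index 17. The correct ones have 21 items, i.e. a last index of 20.) - table values were not removed from the sample output -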
Array
(
[0] => Array
(
)
[1] => Array
(
[0] => Envanter (R/B/K) (Filitre Kodu : sa) (Envanter Tarihi :28/11/2012 ) (Depo : 100)
[1] =>
[2] =>
[3] =>
[4] =>
[5] =>
[6] =>
[7] =>
[8] =>
[9] =>
[10] =>
[11] =>
[12] =>
[13] =>
[14] =>
[15] =>
[16] =>
[17] =>
[18] =>
[19] =>
[20] =>
)
[2] => Array
(
[0] => Model
[1] => Stok Ad?
[2] => R
[3] => Renk Ad?
[4] => B
[5] => B
[6] => B
[7] => B
[8] => B
[9] => B
[10] => B
[11] => B
[12] => B
[13] => B
[14] => B
[15] => B
[16] => B
[17] => B
[18] => B
[19] => Toplam
[20] => Resim
)
[3] => Array
(
[0] =>
[1] =>
[2] =>
[3] =>
[4] => 34
[5] => 36
[6] => 38
[7] => 40
[8] => 42
[9] => 44
[10] => 46
[11] => 48
[12] => 50
[13] => 52
[14] => 54
[15] => 56
[16] => 58
[17] => 60
[18] => 62
[19] => Toplam
[20] =>
)
[4] => Array
(
[0] => 1K011621110
[1] => NIHAN 2111 KABAN
[2] => 064
[3] => FES
[4] =>
[5] =>
[6] =>
[7] =>
[8] =>
[9] =>
[10] =>
[11] =>
[12] => 1.00
[13] =>
[14] =>
[15] =>
[16] =>
[17] =>
[18] =>
[19] => 1.00
[20] => Resim
)
[5] => Array
(
[0] =>
[1] =>
[2] => Toplam :
[3] =>
[4] =>
[5] =>
[6] =>
[7] =>
[8] =>
[9] =>
[10] =>
[11] =>
[12] =>
[13] =>
[14] =>
[15] => 1.00
[16] =>
[17] =>
[18] =>
[19] =>
[20] =>
[21] =>
[22] => 1.00
[23] =>
)
[6] => Array
(
[0] =>
[1] => 34
[2] => 36
[3] => 38
[4] => 40
[5] => 42
[6] => 44
[7] => 46
[8] => 48
[9] => 50
[10] => 52
[11] => 54
[12] => 56
[13] => 58
[14] => 60
[15] => 62
[16] => Toplam
[17] =>
)
[7] => Array
(
[0] => 1K011624760
[1] => NIHAN 2476 KABAN
[2] => 001
[3] => SIYAH
[4] =>
[5] =>
[6] =>
[7] =>
[8] =>
[9] => 1.00
[10] =>
[11] => 1.00
[12] =>
[13] =>
[14] =>
[15] =>
[16] =>
[17] =>
[18] =>
[19] => 2.00
[20] => Resim
)
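A quick way to spot the malformed rows in a dump like this is to compare each row's length with the expected width. A minimal sketch, assuming the expected width is the 21 items mentioned above; `findBadRows()` is a hypothetical helper name, not part of simple_html_dom:

```php
<?php
// Hypothetical helper: report rows whose cell count differs from the expected width.
// The default of 21 is taken from the question; adjust it for other tables.
function findBadRows(array $table, int $expected = 21): array
{
    $bad = [];
    foreach ($table as $index => $row) {
        $count = count($row);
        // Empty rows (a <tr> without any <td>) are ignored here.
        if ($count !== 0 && $count !== $expected) {
            $bad[$index] = $count; // row index => actual number of cells
        }
    }
    return $bad;
}
```

Applied to the sample above, this would flag rows 5 and 6, which hold 24 and 18 items.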
Thanks in advance.
Update with a working solution: it currently fills all empty cells with "***".
<?php
ini_set('display_errors', true);
ini_set('mbstring.internal_encoding','UTF-8');
ini_set("memory_limit", "1024M");
ini_set('max_execution_time', 300);
include('simple_html_dom.php');
// Create a DOM object
$html = new simple_html_dom();
$html->load_file('stok.html');
$satir = array();
$rowcount = 0;
foreach ($html->find('tr') as $row) {
    $colcount = 0;
    foreach ($row->find('td') as $element) {
        // Skip cells already filled by a rowspan/colspan placeholder.
        while (isset($satir[$rowcount][$colcount]) && $satir[$rowcount][$colcount] != '') {
            $colcount++;
        }
        $satir[$rowcount][$colcount] = strip_tags(str_replace('&nbsp;', '***', $element->innertext));
        if (isset($element->colspan)) {
            // Reserve the extra columns covered by this cell.
            $sayi = ($element->colspan) - 1;
            for ($i = 1; $i <= $sayi; $i++) {
                $satir[$rowcount][$colcount + $i] = '***';
            }
        }
        if (isset($element->rowspan)) {
            // Reserve the same column in the following rows.
            $sayi = ($element->rowspan) - 1;
            for ($i = 1; $i <= $sayi; $i++) {
                $satir[$rowcount + $i][$colcount] = '***';
            }
        }
        $colcount++;
    }
    $rowcount++;
}
echo '<pre>';
print_r($satir);
echo '</pre>';
?>
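Following @deceze's helpful comment, I used a different approach to solve the problem. The code below does the job, but it fills all blank fields with ***. You may need to go over the whole array again to empty them. (The code for that is further down.)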
<?php
include('simple_html_dom.php');
// Create a DOM object
$html = new simple_html_dom();
$html->load_file('stok.html');
$satir = array();
$rowcount = 0;
foreach ($html->find('tr') as $row) {
    $colcount = 0;
    foreach ($row->find('td') as $element) {
        // Skip cells already filled by a rowspan/colspan placeholder.
        while (isset($satir[$rowcount][$colcount]) && $satir[$rowcount][$colcount] != '') {
            $colcount++;
        }
        $satir[$rowcount][$colcount] = strip_tags(str_replace('&nbsp;', '***', $element->innertext));
        if (isset($element->colspan)) {
            // Reserve the extra columns covered by this cell.
            $sayi = ($element->colspan) - 1;
            for ($i = 1; $i <= $sayi; $i++) {
                $satir[$rowcount][$colcount + $i] = '***';
            }
        }
        if (isset($element->rowspan)) {
            // Reserve the same column in the following rows.
            $sayi = ($element->rowspan) - 1;
            for ($i = 1; $i <= $sayi; $i++) {
                $satir[$rowcount + $i][$colcount] = '***';
            }
        }
        $colcount++;
    }
    $rowcount++;
}
echo '<pre>';
print_r($satir);
echo '</pre>';
?>
The following block clears the asterisk placeholders mentioned above from the array.
// Replace every '***' placeholder with an empty string, whatever the row width.
foreach ($satir as &$cells) {
    foreach ($cells as &$cell) {
        if ($cell === '***') {
            $cell = '';
        }
    }
    unset($cell);
}
unset($cells);
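The `***` placeholder can be avoided altogether by reserving the spanned slots with empty strings and testing them with `isset()`. A minimal sketch of that idea, assuming PHP's bundled DOM extension instead of simple_html_dom and UTF-8 input; `tableToGrid()` is a hypothetical helper name:

```php
<?php
// Sketch: expand rowspan/colspan into a rectangular grid using PHP's DOM extension.
// tableToGrid() is a hypothetical helper, not part of simple_html_dom.
function tableToGrid(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings quietly; real-world HTML is rarely strict.
    libxml_use_internal_errors(true);
    // Assumption: the input is UTF-8; the meta tag tells libxml to read it that way.
    $doc->loadHTML('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">' . $html);
    libxml_clear_errors();

    $grid = [];
    $r = 0;
    foreach ($doc->getElementsByTagName('tr') as $tr) {
        $c = 0;
        foreach ($tr->childNodes as $cell) {
            if (!($cell instanceof DOMElement) || !in_array($cell->nodeName, ['td', 'th'], true)) {
                continue; // skip whitespace text nodes and other non-cell children
            }
            // Skip slots already reserved by a rowspan from an earlier row.
            while (isset($grid[$r][$c])) {
                $c++;
            }
            // A missing attribute yields '', which casts to 0, so clamp to 1.
            $rowspan = max(1, (int) $cell->getAttribute('rowspan'));
            $colspan = max(1, (int) $cell->getAttribute('colspan'));
            for ($dr = 0; $dr < $rowspan; $dr++) {
                for ($dc = 0; $dc < $colspan; $dc++) {
                    $grid[$r + $dr][$c + $dc] = '';
                }
            }
            // The text goes in the top-left slot, without surrounding (non-breaking) spaces.
            $grid[$r][$c] = preg_replace('/^[\s\x{00A0}]+|[\s\x{00A0}]+$/u', '', $cell->textContent);
            $c += $colspan;
        }
        $r++;
    }
    // Reserved slots are written before the cells to their left, so sort the keys.
    foreach ($grid as &$row) {
        ksort($row);
    }
    unset($row);
    ksort($grid);
    return $grid;
}
```

Because reserved slots are real (empty) array entries, no second cleanup pass is needed, and sorting the keys makes `print_r()` show the columns in order. For the file in the question the call would be something like `tableToGrid(file_get_contents('stok.html'))`.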