Using PHP and regex to get labels and data and store them as an associative array

mrp*_*atg 3 php regex

How can I use a regular expression to find this table in a page (it needs to be looked up by name):

<table id="Table Name">
<tr><td class="label">Name:</td>
<td class="data"><div class="datainfo">Stuff</div></td></tr>
<tr><td class="label">Email:</td>
<td class="data"><div class="datainfo">Stuff2</div></td></tr>
<tr><td class="label">Address:</td>
<td class="data"><div class="datainfo">Stuff3</div></td></tr>
</table>
<table id="Table Name 2">
<tr><td class="label">Field1:</td>
<td class="data"><div class="datainfo">MoreStuff</div></td></tr>
<tr><td class="label">Field2:</td>
<td class="data"><div class="datainfo">MoreStuff2</div></td></tr>
<tr><td class="label">Field3:</td>
<td class="data"><div class="datainfo">MoreStuff3</div></td></tr>
</table>

and then grab the "label" and "datainfo" values and store them in an associative array, for example:

$table_name['name']     // Stuff
$table_name['email']    // Stuff2
$table_name['address']  // Stuff3

$table_name2['field1']  // MoreStuff
$table_name2['field2']  // MoreStuff2
$table_name2['field3']  // MoreStuff3

Iva*_*uev 8

A regexp is a poor solution in this case. Use the Simple HTML DOM parser instead.

Update: here is the function:

 // Requires Simple HTML DOM; adjust the path to wherever simple_html_dom.php lives.
 include 'simple_html_dom.php';

 $html = str_get_html($html);
 print_r(get_table_fields($html, 'Table Name'));
 print_r(get_table_fields($html, 'Table Name 2'));

 function get_table_fields($html, $id) {
     $result = array();
     $table = $html->find('table[id='.$id.']', 0);
     foreach ($table->find('tr') as $row) {
         // first cell is the label, second cell is the data
         $key = trim($row->find('td', 0)->plaintext);
         $value = trim($row->find('td', 1)->plaintext);
         // remove the trailing ':' from the label
         $key = preg_replace('/:$/', '', $key);
         $result[$key] = $value;
     }
     return $result;
 }
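
If adding a third-party library is not an option, roughly the same approach can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath). This is a swapped-in technique, not the Simple HTML DOM code from the answer. The helper name get_table_fields_dom is hypothetical, and the sketch assumes the table id contains no double quote and that each row has a td with class "label" and a td with class "data", as in the question's markup. It also lowercases the label so the keys look like the ones the question asks for.

```php
<?php
// Sketch using PHP's bundled DOM extension instead of Simple HTML DOM.
// get_table_fields_dom() is a hypothetical helper name, not part of the answer.
function get_table_fields_dom(string $html, string $id): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally so imperfect markup does not emit notices.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $result = [];
    // Assumption: $id contains no double quote, because it is embedded in the XPath string.
    $rows = $xpath->query('//table[@id="' . $id . '"]//tr');
    foreach ($rows as $row) {
        $label = $xpath->query('td[@class="label"]', $row)->item(0);
        $data  = $xpath->query('td[@class="data"]', $row)->item(0);
        if ($label === null || $data === null) {
            continue; // skip rows that do not have both cells
        }
        // Drop the trailing ':' and lowercase the label to use it as the key.
        $key = strtolower(rtrim(trim($label->textContent), ':'));
        $result[$key] = trim($data->textContent);
    }
    return $result;
}

// Shortened copy of the first table from the question.
$html = <<<HTML
<table id="Table Name">
<tr><td class="label">Name:</td>
<td class="data"><div class="datainfo">Stuff</div></td></tr>
<tr><td class="label">Email:</td>
<td class="data"><div class="datainfo">Stuff2</div></td></tr>
</table>
HTML;

print_r(get_table_fields_dom($html, 'Table Name'));
```

Matching on the class attributes rather than on cell position keeps the sketch from picking up unrelated cells, at the cost of depending on those class names staying as they are in the question.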

  • See http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 for why. (4 upvotes)