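I have to parse externally provided XML that has attributes containing line breaks. With SimpleXML, the line breaks seem to get lost. According to another Stack Overflow question, line breaks should be valid in XML attributes (even if far from ideal!).
Why are they lost? [Edit] How can I preserve them? [/Edit]
Here is a demo script (note that the line breaks are preserved when they are not inside an attribute).
PHP file with embedded XML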
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<Rows>
<data Title='Data Title' Remarks='First line of the row.
Followed by the second line.
Even a third!' />
<data Title='Full Title' Remarks='None really'>First line of the row.
Followed by the second line.
Even a third!</data>
</Rows>
XML;
$xml = new SimpleXMLElement( $xml );
print '<pre>'; print_r($xml); print '</pre>';
Output of print_r
SimpleXMLElement Object
(
[data] => Array
(
[0] => SimpleXMLElement Object
(
[@attributes] => Array
(
[Title] => Data Title
[Remarks] => First line of the row. Followed by the second line. Even a third!
)
)
[1] => First line of the row.
Followed by the second line.
Even a third!
)
)
The entity for a new line is &#10;. I played with your code until I found something that does the trick. It's not very elegant, I warn you:
// First, strip leading indentation from every line
// (removing every single space would also delete the spaces between attributes):
$xml = preg_replace('/^[ \t]+/m', '', $xml);
// Next, unify all line breaks into Unix LF:
$xml = str_replace("\r", "\n", $xml);
$xml = str_replace("\n\n", "\n", $xml);
// Next, replace every line break with its character reference:
$xml = str_replace("\n", "&#10;", $xml);
// Finally, turn any line-break reference between > and < back into a real line break:
$xml = str_replace(">&#10;<", ">\n<", $xml);
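Based on your example, this assumes that any line break inside a node or attribute is followed by more text on the next line, not by a < that opens a new element.
It will of course fail if the next line contains text wrapped in a line-level element.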
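As for why the line breaks disappear: XML parsers apply attribute-value normalization, which turns a literal line break inside an attribute into a space, while a character reference such as &#10; is kept as a real line break. If the blanket string replacements above are too fragile for your input, a narrower variant is to encode line breaks only inside quoted attribute values. The following is a minimal sketch of that idea, not a full XML tokenizer. The helper name encodeAttributeNewlines is hypothetical, and the code assumes ASCII element names and no tag-like text inside comments or CDATA sections.

```php
<?php
// Hypothetical helper, not part of the original answer: encode literal line
// breaks as &#10; only inside quoted attribute values of start tags.
function encodeAttributeNewlines(string $xml): string
{
    // Normalize CRLF and lone CR to LF so only one kind of line break remains.
    $xml = str_replace(["\r\n", "\r"], "\n", $xml);

    // A start or empty-element tag: "<", an ASCII name start character, then any
    // mix of unquoted characters and complete quoted strings, up to the closing ">".
    $tagPattern = '/<[A-Za-z_](?:[^>"\']|"[^"]*"|\'[^\']*\')*>/';

    // A single- or double-quoted attribute value inside such a tag.
    $valuePattern = '/"[^"]*"|\'[^\']*\'/';

    $result = preg_replace_callback(
        $tagPattern,
        function (array $tag) use ($valuePattern) {
            $encoded = preg_replace_callback(
                $valuePattern,
                function (array $value) {
                    // Only line breaks inside the quotes are rewritten.
                    return str_replace("\n", '&#10;', $value[0]);
                },
                $tag[0]
            );
            // Keep the tag unchanged if the inner regex fails.
            return $encoded ?? $tag[0];
        },
        $xml
    );

    // preg_replace_callback() returns null on a regex error; fall back to the input.
    return $result ?? $xml;
}

// Usage with the attribute example from the question.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<Rows>
<data Title='Data Title' Remarks='First line of the row.
Followed by the second line.
Even a third!' />
</Rows>
XML;

$rows = new SimpleXMLElement(encodeAttributeNewlines($xml));
print (string) $rows->data[0]['Remarks'];
```

Because only text inside quotes within a tag is touched, line breaks between elements and inside element content are left alone, so the final "> <" clean-up step is not needed here. Indentation on continuation lines inside an attribute value is kept as-is.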