Replace newlines with BR tags, but only inside PRE tags

Gre*_*ind 3 php regex html-parsing

Using stock PHP5, what is a good preg_replace expression for this transformation:

Replace newlines with <br />, but only inside <pre> blocks

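(Feel free to make simplifying assumptions and ignore corner cases. For example, we can assume that each tag sits on a single line, rather than anything pathological.)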

Input text:

<div><pre class='some class'>1
2
3
</pre>
<pre>line 1
line 2
line 3
</pre>
</div>

Output:

<div><pre>1<br />2<br />3<br /></pre>
<pre>line 1<br />line 2<br />line 3<br /></pre>
</div>

(Motivating context: I'm trying to close bug 20760 in Wiktionary's SyntaxHighlight_GeSHI extension, and finding that my PHP skills (I mostly do Python) aren't up to the job.)

I'm open to solutions other than regexen, but small ones are preferred (e.g., building an HTML parsing mechanism is overkill).

med*_*iev 6

Something like this?

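<?php

$content = "<div><pre class='some class'>1
2
3
</pre>
<pre>line 1
line 2
line 3
</pre>
</div>
";

// Serialize the first element inside <body> (here, the outer <div>),
// leaving out the html/body wrapper that loadHTML() adds.
function getInnerHTML($Node)
{
    $Body = $Node->ownerDocument->documentElement->firstChild->firstChild;
    $Document = new DOMDocument();
    $Document->appendChild($Document->importNode($Body, true));
    return $Document->saveHTML();
}

$dom = new DOMDocument();
$dom->loadHTML( $content );
$preElements = $dom->getElementsByTagName('pre');

// DOMNodeList is not Countable in PHP 5, so check ->length instead of count().
if ( $preElements->length ) {
    foreach ( $preElements as $pre ) {
        // Match \r\n before \n so that a CRLF yields a single <br />.
        $value = preg_replace( '/\r\n|\n/', '<br />', $pre->nodeValue );
        // The tag is stored as text, so it is serialized escaped (&lt;br /&gt;).
        $pre->nodeValue = $value;
    }

    // Decoding entities turns the escaped tags back into real <br /> tags.
    // Note: this also decodes any other entities in the output.
    echo html_entity_decode( getInnerHTML( $dom->documentElement ) );
}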
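
Since the question asks for something small and regex-based, here is a minimal sketch of a regex-only alternative using preg_replace_callback. The function name nl2brInsidePre is hypothetical, the closure assumes PHP 5.3 or later, and the pattern assumes well-formed, non-nested <pre> blocks, in line with the simplifying assumptions the question allows. Unlike the sample output in the question, it leaves the attributes on the opening <pre> tag untouched.

```php
<?php

// Hypothetical helper, not part of the original answer.
function nl2brInsidePre($html)
{
    // Capture each <pre ...>...</pre> block. The "s" flag lets "." match
    // newlines, "i" ignores tag case, and the lazy ".*?" stops at the
    // first closing </pre>.
    return preg_replace_callback(
        '#(<pre\b[^>]*>)(.*?)(</pre>)#si',
        function ($m) {
            // Replace CRLF or LF only in the block's contents.
            return $m[1] . preg_replace('/\r\n|\n/', '<br />', $m[2]) . $m[3];
        },
        $html
    );
}

$content = "<div><pre class='some class'>1\n2\n3\n</pre>\n<pre>line 1\nline 2\nline 3\n</pre>\n</div>\n";

echo nl2brInsidePre($content);
```

Newlines outside the <pre> blocks are left alone, so the line breaks between the two blocks and before </div> survive as in the expected output.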