WordPress: Generate a table of contents from the headings in the content

Question

Cra*_*ray -1 html php regex wordpress replace

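I want to generate a table of contents from the headings in my post.

I have already found a solution that collects every heading from the content and replaces the <h2> tags with <a> tags.

The problem is that I also need to turn the <h3> tags into links and show them in the list as a nested level.

My result should look like this: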

<ul>
    <li><a href="#h2-1">I was a H2 headline</a></li>
    <li>
        <a href="#h2-2">Also a H2 headline</a>
        <ul>
            <li><a href="#h3-1">H3 headline</a></li>
            <li><a href="#h3-2">Another H3 headline</a></li>
        </ul>
    </li>
</ul>

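My problem is that some headings may have a class="" attribute while others do not. For now I strip every class="" attribute with a replace call (preg_replace in the code below).
That is not the best solution, but it works for me, and I know very little about regular expressions.

The following code is my function for getting every heading from the content.

I first get the content of the post and store it in $content.

From there I collect all headings (<h2> to <h6>) and store them in $results with this line: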

preg_match_all('#<h[2-6]*[^>]*>.*?<\/h[2-6]>#',$content,$results);

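At the moment I only handle <h2> headings, because I am not sure how to do this in a smart way; as it stands I would have to repeat the following lines for every heading level: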

$toc = str_replace('<h2','<li><a',$toc);
$toc = str_replace('</h2>','</a></li>',$toc);
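The per-level repetition can probably be avoided with one pattern that covers h2 to h6. The following is only a sketch: the sample value of $toc is made up for illustration, and the result is still a flat list with no nesting.

```php
<?php
// Sketch: one pattern for h2-h6 instead of one str_replace pair per level.
// The sample markup below is an assumption standing in for the imploded headings.
$toc = "<h2 id=\"h2-1\">I was a H2 headline</h2>\n<h3 id=\"h3-1\">H3 headline</h3>";

// $1 = level digit, $2 = attribute string, $3 = inner markup.
// \b keeps "<h2" from matching longer tag names; \1 pairs the closing tag.
$toc = preg_replace('#<h([2-6])\b([^>]*)>(.*?)</h\1>#is', '<li><a$2>$3</a></li>', $toc);

// Same id-to-href swap as in the question.
$toc = str_replace('id="', 'href="#', $toc);

echo $toc, "\n";
```

Because the level digit is captured in $1, it could also be used later to decide how deep each entry belongs.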

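But my biggest problem is nesting the headings.
How can I generate HTML like the example above?

Just as important: how do I handle the different heading formats, for example: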

<h2 class="style" id="name">
<h2 id="name" class="style">
<h2 id="name">
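One way to cope with attributes in any order is to capture the whole attribute string first and then look for the id inside it. The sketch below assumes a hypothetical helper named extract_headings (not part of WordPress or of the code in this question); regex on HTML stays fragile, so treat it as an illustration only.

```php
<?php
// Hypothetical helper: collect level, id and plain text for every h2-h6,
// whatever order the attributes appear in.
function extract_headings(string $content): array
{
    $headings = [];
    // \b after the digit avoids matching longer tag names;
    // \1 makes the closing tag match the opening level.
    $pattern = '#<h([2-6])\b([^>]*)>(.*?)</h\1>#is';
    preg_match_all($pattern, $content, $matches, PREG_SET_ORDER);

    foreach ($matches as $m) {
        // Find id="..." or id='...' anywhere in the attribute string,
        // without being fooled by attributes such as data-id.
        $found = preg_match('/(?<![\w-])id\s*=\s*(["\'])(.*?)\1/i', $m[2], $idMatch);
        $headings[] = [
            'level' => (int) $m[1],
            'id'    => $found ? $idMatch[2] : '',
            'text'  => trim(strip_tags($m[3])), // drops <strong> and similar
        ];
    }
    return $headings;
}

$sample = '<h2 class="style" id="first">One</h2>'
        . '<h2 id="second" class="style"><strong>Two</strong></h2>'
        . '<h3 id="third">Three</h3>';
print_r(extract_headings($sample));
```

With the class attribute ignored rather than deleted, the separate str_replace calls for class and <strong> are no longer needed.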

This is my current code:

$content_postid = get_the_ID();
$content_post   = get_post($content_postid);
$content        = $content_post->post_content;
$content        = apply_filters('the_content', $content);
$content        = str_replace(']]>', ']]&gt;', $content);

preg_match_all('#<h[2-6]*[^>]*>.*?<\/h[2-6]>#',$content,$results);

$toc = implode("\n",$results[0]);

// This part is messy because I don't really understand regex :-(
$toc = preg_replace('/class=".*?"/', '', $toc);
$toc = str_replace('<strong>','',$toc);
$toc = str_replace('</strong>','',$toc);
$toc = str_replace('<h2','<li><a',$toc);
$toc = str_replace('</h2>','</a></li>',$toc);
$toc = str_replace('id="','href="#',$toc);

//plug the results into appropriate HTML tags
$toc = '<div id="toc">
<ul class="list-unstyled">
'.$toc.'
</ul>
</div>';

echo $toc;

This is my current output (as you can see, only the <h2> headings):

<ul class="list-unstyled">
    <li><a href="#h2-1">I was a H2 headline</a></li>
    <li><a href="#h2-2">Also a H2 headline</a></li>
</ul>

Edit: here is some sample HTML that could be in $content:

<p>Lorem ipsum dolor sit amet...</p>
<p>consectetuer adipiscing elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna aliquam erat volutpat</p>
<img src="/path/to/image.jpg" />
<h2 class="style" id="name">
<p>Ut wisi enim ad minim veniam, quis nostrud exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea commodo consequat</p>
<p>Duis autem vel eum iriure dolor in hendrerit in vulputate velit esse molestie consequat</p>
<h3 class="style" id="name">Headline 3</h3>
<p>vel illum dolore eu feugiat nulla facilisis at vero et accumsan et iusto odio dignissim qui</p>
<h3 class="style" id="name">On more Headline 3</h3>
<p>blandit praesent luptatum zzril delenit augue duis dolore te feugait nulla facilisi</p>
<h2 id="name" class="style">Headline 2 with class</h2>
<p>Nam liber tempor cum soluta nobis eleifend option congue nihil imperdiet</p>
<h2 id="name">Another Headline 2 without class</h2>
<p>doming id quod mazim placerat facer possim assum</p>

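Edit 2:

I found a function that looks right (here), but I could not get it to work.

I also found a function here that explicitly uses DOMDocument. I am still testing it; at the moment it outputs the entire content.

This is its code: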

$doc = new DOMDocument();
$doc->loadHTML($code);

// create document fragment
$frag = $doc->createDocumentFragment();
// create initial list
$frag->appendChild($doc->createElement('ol'));
$head = &$frag->firstChild;
$xpath = new DOMXPath($doc);
$last = 1;

// get all H1, H2, …, H6 elements
foreach ($xpath->query('//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6]') as $headline) {
    // get level of current headline
    sscanf($headline->tagName, 'h%u', $curr);

    // move head reference if necessary
    if ($curr < $last) {
        // move upwards
        for ($i=$curr; $i<$last; $i++) {
            $head = &$head->parentNode->parentNode;
        }
    } else if ($curr > $last && $head->lastChild) {
        // move downwards and create new lists
        for ($i=$last; $i<$curr; $i++) {
            $head->lastChild->appendChild($doc->createElement('ol'));
            $head = &$head->lastChild->lastChild;
        }
    }
    $last = $curr;

    // add list item
    $li = $doc->createElement('li');
    $head->appendChild($li);
    $a = $doc->createElement('a', $headline->textContent);
    $head->lastChild->appendChild($a);

    // build ID
    $levels = array();
    $tmp = &$head;
    // walk subtree up to fragment root node of this subtree
    while (!is_null($tmp) && $tmp != $frag) {
        $levels[] = $tmp->childNodes->length;
        $tmp = &$tmp->parentNode->parentNode;
    }
    $id = 'sect'.implode('.', array_reverse($levels));
    // set destination
    $a->setAttribute('href', '#'.$id);
    // add anchor to headline
    $a = $doc->createElement('a');
    $a->setAttribute('name', $id);
    $a->setAttribute('id', $id);
    $headline->insertBefore($a, $headline->firstChild);
}

// append fragment to document
$doc->getElementsByTagName('body')->item(0)->appendChild($frag);

// echo markup
echo $doc->saveHTML();
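A likely reason this function shows the entire content: its last line calls saveHTML() with no argument, which serializes the whole document, list included. Passing a node to saveHTML() serializes only that node. A minimal sketch, where the hand-built list is an assumption standing in for the one the loop above creates:

```php
<?php
libxml_use_internal_errors(true); // silence warnings on imperfect HTML

$doc = new DOMDocument();
$doc->loadHTML('<p>Intro</p><h2 id="a">First</h2><p>Body text</p>');

// Stand-in for the list that the function above builds.
$list = $doc->createElement('ol');
$item = $doc->createElement('li');
$link = $doc->createElement('a', 'First');
$link->setAttribute('href', '#a');
$item->appendChild($link);
$list->appendChild($item);

// saveHTML() with no argument returns the whole document;
// with a node argument it returns only that node and its children.
$tocHtml = $doc->saveHTML($list);
echo $tocHtml, "\n";
```

In the function above, the equivalent change would be to serialize the first child of $frag before the fragment is appended to body, since appending a fragment moves its children into the document.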

Answer 1

Cas*_*yte 5

An approach that uses only the DOM to parse the HTML source and extract the relevant information, then builds the result as a string.

libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

$xp = new DOMXPath($dom);
$nodes = $xp->query('//*[contains("h1 h2 h3 h4 h5 h6", name())]');

$currentLevel = ['level' => 0 /*, 'count' => 0*/ ];
$stack = [];
$format = '<li><a href="#%s">%s</a></li>';
$result = '';

foreach($nodes as $node) {
    $level = (int)$node->tagName[1]; // extract the digit after h
  
    while($level < $currentLevel['level']) {
        $currentLevel = array_pop($stack);
        $result .= '</ul>';
    }
    
    if ($level === $currentLevel['level']) {
        $currentLevel['count']++;
    } else {
        $stack[] = $currentLevel;
        $currentLevel = ['level' => $level, 'count' => 1];
        $result .= '<ul>';
    }

    $result .= sprintf($format, $node->getAttribute('id'), $node->nodeValue);    
}

$result .= str_repeat('</ul>', count($stack));

Demo

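To build the expected tree structure step by step, this code uses a stack (FILO) that stores arrays holding a level (the digit after the h) and the number of nodes already added at that level. When the current node's level is higher than the previous node's, the previous array is pushed onto the stack. When it is lower, items are popped off the stack until the restored level is no longer higher than the current node's. When the current and previous nodes have the same level, the stack stays unchanged and the count item in the array is incremented.

After the main loop, the code counts the items remaining on the stack so that the open ul tags are closed correctly.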
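The same stack idea can be adapted so that each nested ul sits inside its parent li, which is the structure the question asks for. This is a sketch, not the answer's code: build_toc and the level/id/text array shape are assumptions made for illustration.

```php
<?php
// Hypothetical helper: $headings is a list of ['level', 'id', 'text'] arrays
// in document order.
function build_toc(array $headings): string
{
    $html  = '';
    $stack = []; // levels of the <ul> elements that are currently open

    foreach ($headings as $h) {
        $level = $h['level'];

        // Close every list (and its last <li>) deeper than this heading.
        while ($stack && end($stack) > $level) {
            $html .= '</li></ul>';
            array_pop($stack);
        }

        if ($stack && end($stack) === $level) {
            // Same level: close the previous sibling item.
            $html .= '</li>';
        } else {
            // Deeper level, or nothing open yet: open a list inside the open <li>.
            $html .= '<ul>';
            $stack[] = $level;
        }

        // The <li> stays open so a deeper list can be nested inside it.
        $html .= sprintf(
            '<li><a href="#%s">%s</a>',
            htmlspecialchars($h['id']),
            htmlspecialchars($h['text'])
        );
    }

    // Close whatever is still open.
    return $html . str_repeat('</li></ul>', count($stack));
}

$toc = build_toc([
    ['level' => 2, 'id' => 'h2-1', 'text' => 'I was a H2 headline'],
    ['level' => 2, 'id' => 'h2-2', 'text' => 'Also a H2 headline'],
    ['level' => 3, 'id' => 'h3-1', 'text' => 'H3 headline'],
    ['level' => 3, 'id' => 'h3-2', 'text' => 'Another H3 headline'],
]);
echo $toc, "\n";
```

Leaving each li open until the next heading is known is what lets a child list land inside it; the trade-off is that every closing path has to emit the matching `</li>`.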

XPath query details:

 //*        [contains("h1 h2 h3 h4 h5 h6", name())]
|___|      |_______________________________________|
location   predicate
path

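Location path:

// : all locations in the DOM tree from the current location (the root by default)
* : any element node

Predicate:

name() : returns the name of the current element
contains(haystack, needle) : returns true if haystack contains needle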
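To see the predicate at work, the sketch below runs the contains() query next to the explicit self:: form used in the question. The sample markup is made up. One caveat: contains() is a substring test, so it would also match any element whose name happens to be a substring of the list, which the self:: form avoids.

```php
<?php
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML('<h1>A</h1><p>text</p><h3>B</h3><div><h2>C</h2></div>');
$xp = new DOMXPath($dom);

// Substring test: keep elements whose name occurs in the given string.
$names = [];
foreach ($xp->query('//*[contains("h1 h2 h3 h4 h5 h6", name())]') as $node) {
    $names[] = $node->nodeName;
}

// Stricter alternative: test each heading name explicitly.
$strict = [];
foreach ($xp->query('//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 or self::h6]') as $node) {
    $strict[] = $node->nodeName;
}

echo implode(' ', $names), "\n";
echo implode(' ', $strict), "\n";
```

Both queries return nodes in document order, so headings nested inside other elements appear where they occur in the source.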
