Convert a string of HTML tags into an array?

d13*_*d13 -2 html javascript regex arrays string

I'm trying to convert an HTML string into an array of HTML strings. For example, I might have an arbitrary HTML string like this:

"<div>This</div><h1>Is</h1> <p>A</p> <a href="#">Test</a>"

(There may or may not be spaces between the tag elements)

I'm trying to convert it into an array that looks like this:

["<div>This</div>", "<h1>Is</h1>", "<p>A</p>", '<a href="#">Test</a>']

This is just for displaying the tags as text - I'm not going to use them as HTML elements.

I have looked at this example here, and it almost works except it strips the tags from the inner text: /sf/answers/3803844131/

I'm looking for a solution that does not involve DOM parsing - if possible.

Any suggestions welcome!

Rok*_*jan 9

Always use a DOMParser to parse HTML strings.

const string = `<div>This</div><h1>Is</h1> <p>A</p> <a href="#">Test</a>`;

const doc = new DOMParser().parseFromString(string, "text/html");
const HTMLArray = [...doc.body.children].map(el => el.outerHTML);

console.log(HTMLArray)

You should never use RegExp to parse XML/HTML strings. But if you really want to, and you know your string by heart and it looks exactly like the one you provided...

const str = `<div>This</div><h1>Is</h1> <p>A</p> <a href="#">Test</a>`;
const m = str.match(/<[^>]+>[^<]*<\/[^>]+>/g); // Use at your own risk

console.log(m); 

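Note that the above will not work for nested HTML, and it can also break if there's a `<` or `>` character inside an attribute (which is perfectly valid, if uncommon).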
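
To make that caveat concrete, here is a rough sketch that runs the same regex on a nested input and compares it with a simple depth counter that needs no DOM. `splitTopLevel` and `VOID_TAGS` are hypothetical names invented for this illustration, not part of any library, and the tag pattern is an assumption that ignores comments, `<script>` content, and unclosed tags. It is still not a real HTML parser, so treat it as a demonstration rather than a replacement for DOMParser.

```javascript
// Hypothetical helper (not from the answer above): split a string into its
// top-level elements by counting open and close tags instead of using the DOM.
const VOID_TAGS = new Set(["area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr"]);

function splitTopLevel(html) {
  // One tag: optional "/", a tag name, then attributes. Quoted attribute
  // values are consumed whole, so a ">" inside quotes does not end the tag.
  const tagRe = /<(\/?)([a-zA-Z][^\s\/>]*)(?:"[^"]*"|'[^']*'|[^'">])*>/g;
  const result = [];
  let depth = 0;   // how many elements are currently open
  let start = -1;  // index where the current top-level element began
  let m;
  while ((m = tagRe.exec(html)) !== null) {
    const isClosing = m[1] === "/";
    const isVoid = m[0].endsWith("/>") || VOID_TAGS.has(m[2].toLowerCase());
    if (isClosing) {
      if (depth === 0) continue; // stray closing tag, ignore it
      depth -= 1;
      if (depth === 0) result.push(html.slice(start, tagRe.lastIndex));
    } else if (isVoid) {
      if (depth === 0) result.push(m[0]); // e.g. <br> has no closing tag
    } else {
      if (depth === 0) start = m.index;
      depth += 1;
    }
  }
  return result;
}

const nested = `<div><p>A</p></div> <span title="a>b<c">x</span><br>`;

// The regex from the answer: picks up fragments instead of whole elements.
const naive = nested.match(/<[^>]+>[^<]*<\/[^>]+>/g);
console.log(naive);   // [ '<p>A</p>', '<c">x</span>' ]

// The depth counter keeps each top-level element intact.
const tracked = splitTopLevel(nested);
console.log(tracked); // [ '<div><p>A</p></div>', '<span title="a>b<c">x</span>', '<br>' ]
```

On the flat string from the question both approaches give the same four items; the difference only shows up once elements are nested or an attribute value contains angle brackets.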