How do I parse a small subset of Markdown into React components?

Question

Rya*_*hel 10 javascript regex arrays markdown reactjs

I have a very small subset of Markdown, plus some custom HTML, that I want to parse into React components. For example, I want to turn the following string:

hello *asdf* *how* _are_ you !doing! today

into the following array:

[ "hello ", <strong>asdf</strong>, " ", <strong>how</strong>, " ", <em>are</em>, " you ", <MyComponent onClick={this.action}>doing</MyComponent>, " today" ]

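and then return it from a React render function (React renders the array correctly as formatted HTML).

Basically, I want to let users turn their text into styled components (and in some cases my own components!) using a very limited set of Markdown.

Using dangerouslySetInnerHTML is unwise, and I don't want to bring in an external dependency, because they are all very heavy and I only need very basic functionality.

I'm currently doing something like the following, but it is very brittle and doesn't work in all cases. I'd like to know whether there is a better way: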

function matchStrong(result, i) {
  let match = result[i].match(/(^|[^\\])\*(.*)\*/);
  if (match) { result[i] = <strong key={"ms" + i}>{match[2]}</strong>; }
  return match;
}

function matchItalics(result, i) {
  let match = result[i].match(/(^|[^\\])_(.*)_/); // Ignores \_asdf_ but not _asdf_
  if (match) { result[i] = <em key={"mi" + i}>{match[2]}</em>; }
  return match;
}

function matchCode(result, i) {
  let match = result[i].match(/(^|[^\\])```\n?([\s\S]+)\n?```/);
  if (match) { result[i] = <code key={"mc" + i}>{match[2]}</code>; }
  return match;
}

// Very brittle and inefficient
export function convertMarkdownToComponents(message) {
  let result = message.match(/(\\?([!*_`+-]{1,3})([\s\S]+?)\2)|\s|([^\\!*_`+-]+)/g);

  if (result == null) { return message; }

  for (let i = 0; i < result.length; i++) {
    if (matchCode(result, i)) { continue; }
    if (matchStrong(result, i)) { continue; }
    if (matchItalics(result, i)) { continue; }
  }

  return result;
}

This is my previous question, which led to this one.

Answer 1

Ale*_*gin 5

It looks like you are after a small, very basic solution, not a "super monster" like react-markdown-it :)

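I'd like to recommend https://github.com/developit/snarkdown, which looks very lightweight and nice! It is only 1kb and very simple, so you can use it and extend it if you need any other syntax features.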

List of supported tags: https://github.com/developit/snarkdown/blob/master/src/index.js#L1

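Update

I just noticed the part about React components; I missed it at first. This should still work well for you: I believe you can take that library as an example and implement your own custom components on top of the same idea, without resorting to dangerouslySetInnerHTML. The library is small and clean. Have fun! :)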
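As a rough illustration of that idea, here is a minimal regex-based sketch that returns plain node descriptors instead of an HTML string. It is not snarkdown's code or API; the `tokenize` name, the `TOKEN` pattern, and the `{ tag, text }` shape are all assumptions made for this example, and it only handles the three single-character tags from the question (no nesting).

```javascript
// Hypothetical helper (not part of snarkdown): tokenizes a tiny Markdown
// subset into plain descriptors that a render function can map to elements.
// Alternatives, in order: escaped symbol, *strong*, _em_, !action!
const TOKEN = /\\([*_!])|\*([^*]+)\*|_([^_]+)_|!([^!]+)!/g;

function tokenize(source) {
  const nodes = [];
  let text = "";      // plain text accumulated since the last tagged node
  let lastIndex = 0;  // end of the previous match in `source`

  const flushText = () => {
    if (text !== "") nodes.push({ tag: null, text });
    text = "";
  };

  for (const match of source.matchAll(TOKEN)) {
    text += source.slice(lastIndex, match.index);
    lastIndex = match.index + match[0].length;

    const [, escaped, strong, em, action] = match;
    if (escaped !== undefined) {
      text += escaped; // "\*" becomes a literal "*"
      continue;
    }

    flushText();
    if (strong !== undefined) nodes.push({ tag: "strong", text: strong });
    else if (em !== undefined) nodes.push({ tag: "em", text: em });
    else nodes.push({ tag: "action", text: action });
  }

  text += source.slice(lastIndex);
  flushText();
  return nodes;
}
```

In a React render function the descriptors could then be mapped to elements, for example with `React.createElement(node.tag, { key: i }, node.text)` for `strong` and `em`, a custom component for `action`, and the bare string when `tag` is `null`.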

Answer 2

LuD*_*nin 2

How does it work?

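It works by reading the string chunk by chunk (one character at a time), which may not be the best solution for really long strings.

Whenever the parser detects that a critical chunk is being read (i.e. '*' or any other Markdown tag), it starts parsing chunks of that element until it finds the closing tag.

It works on multi-line strings; see the example code.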
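The loop described above can be sketched without React as a small state machine. This is a simplified illustration, not the answer's full code: `parseChunks`, `SYMBOLS`, and the `{ tag, text }` output are names assumed for the sketch, and, as a choice made here, a different symbol inside an open tag is kept as literal text.

```javascript
// Assumed tag table for this sketch; the symbols mirror the question's example.
const SYMBOLS = new Map([
  ["*", "strong"],
  ["_", "em"],
  ["!", "button"],
]);

function parseChunks(source) {
  const nodes = [];
  let openTag = null; // tag currently being parsed, or null for plain text
  let content = "";

  // Emits the accumulated content under the given tag, skipping empty content.
  const push = (tag) => {
    if (content !== "") nodes.push({ tag, text: content });
    content = "";
  };

  for (let i = 0; i < source.length; i++) {
    const chunk = source[i];

    if (chunk === "\\" && i + 1 < source.length) {
      content += source[++i]; // escaped character: keep it literally
      continue;
    }

    const tag = SYMBOLS.get(chunk);
    if (tag !== undefined && openTag === null) {
      push("span");     // flush the plain text seen so far
      openTag = tag;    // start parsing the tagged element
    } else if (tag !== undefined && tag === openTag) {
      push(openTag);    // matching closing symbol: emit the element
      openTag = null;
    } else {
      content += chunk; // ordinary character, or another symbol inside a tag
    }
  }

  push("span"); // leftover text, including the content of an unterminated tag
  return nodes;
}
```

Because the parser only ever looks at the current character and one piece of state, newlines need no special handling, which is why multi-line elements work.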

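Caveats

You didn't specify, or I may have misunderstood your needs, whether text that is both bold and italic (nested tags) has to be parsed; my current solution may not work in that case.

If you do need to handle that case, just leave a comment here and I'll adjust the code.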

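First update: tweaked how Markdown tags are handled

Tags are no longer hardcoded; they now live in a map that you can easily extend to fit your needs.

Fixed the bug you mentioned in the comments, thanks for pointing it out =p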

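Second update: multi-length Markdown tags

The simplest way to achieve this: replace the multi-length tags with a rarely used Unicode character

Although the parseMarkdown method does not yet support multi-length tags, we can easily swap those multi-length tags for a single-character one with string.replace when passing the rawMarkdown prop.

To see this in practice, look at the ReactDOM.render call at the end of the code.

Even if your application supports multiple languages, there are code points that should never appear in real text but that JavaScript can still handle. For example, "\uFFFF" is a Unicode noncharacter (it is never assigned to a real character), yet JS can still compare it ("\uFFFF" === "\uFFFF" is true).

It may seem hacky at first, but depending on your use case, I don't see any major problem with going this route.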
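The preprocessing step can be sketched as a small helper that runs before the parser sees the text. The `collapseMultiLengthTags` name and the `MULTI_LENGTH_TAGS` table are assumptions for illustration; only the mapping from three backticks to "\uFFFF" comes from the answer, and the "**" entry is a hypothetical second tag added to show how the table would grow.

```javascript
// Assumed mapping from multi-character tags to single placeholder characters.
// U+FFFF and U+FFFE are Unicode noncharacters, so they should not occur in
// ordinary user text.
const MULTI_LENGTH_TAGS = new Map([
  ["```", "\uFFFF"], // from the answer: code block marker
  ["**", "\uFFFE"],  // hypothetical extra tag, for illustration only
]);

function collapseMultiLengthTags(source) {
  // Replace longer tags first so a long tag is never consumed piecewise
  // by a shorter tag that shares its characters.
  const entries = [...MULTI_LENGTH_TAGS].sort(
    (a, b) => b[0].length - a[0].length,
  );

  let result = source;
  for (const [tag, placeholder] of entries) {
    // split/join replaces every literal occurrence without regex escaping.
    result = result.split(tag).join(placeholder);
  }
  return result;
}
```

The single-character parser then only needs one extra entry per placeholder in its tag map, such as "\uFFFF" mapped to "pre".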

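Another way to achieve this

Well, we can easily keep track of the last N chunks (where N is the length of the longest multi-length tag).

This requires some tweaks to how the loop inside parseMarkdown behaves: check whether the current chunk is part of a multi-length tag and, if it is, use it as a tag; otherwise, in a case like ``k, we need to mark it as notMultiLength or something similar and push that chunk as content.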
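One possible shape for that adjustment is sketched below. Note that it swaps in look-ahead with `startsWith` at the current position rather than a buffer of the last N chunks, which gives the same longest-match behavior with less bookkeeping. `TAGS`, `tagAt`, and `splitOnTags` are hypothetical names, and the sketch only separates tags from text; it does not build elements.

```javascript
// Assumed tag list for this sketch, including one multi-length tag.
const TAGS = ["```", "*", "_", "!"];
// Try longer tags first so "```" wins over any shorter tag at the same spot.
const BY_LENGTH = [...TAGS].sort((a, b) => b.length - a.length);

// Returns the tag that starts at `index`, or null if none does.
function tagAt(source, index) {
  for (const tag of BY_LENGTH) {
    if (source.startsWith(tag, index)) return tag;
  }
  return null;
}

function splitOnTags(source) {
  const pieces = [];
  let text = "";
  let i = 0;

  while (i < source.length) {
    const tag = tagAt(source, i);
    if (tag === null) {
      text += source[i]; // e.g. the two backticks in "``k" end up as content
      i += 1;
    } else {
      if (text !== "") pieces.push({ type: "text", value: text });
      pieces.push({ type: "tag", value: tag });
      text = "";
      i += tag.length;   // skip the whole multi-length tag at once
    }
  }

  if (text !== "") pieces.push({ type: "text", value: text });
  return pieces;
}
```

With this split in place, the open/close logic from parseMarkdown could operate on `pieces` instead of on single characters.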

Code

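// Assumes a React version that still provides ReactDOM.render (it was removed
// in React 19). In the linked CodePen, React and ReactDOM are globals instead.
import React from "react";
import ReactDOM from "react-dom";

// Instead of creating hardcoded variables, we can make the code more extendable
// by storing all the possible tags we'll work with in a Map. Thus, creating
// more tags will not require additional logic in our code.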
const tags = new Map(Object.entries({
  "*": "strong", // bold
  "!": "button", // action
  "_": "em", // emphasis
  "\uFFFF": "pre", // Just use a very unlikely to happen unicode character,
                   // We'll replace our multi-length symbols with that one.
}));
// Might be useful if we need to discover the symbol of a tag
const tagSymbols = new Map();
tags.forEach((v, k) => { tagSymbols.set(v, k ); })

const rawMarkdown = `
  This must be *bold*,

  This also must be *bo_ld*,

  this _entire block must be
  emphasized even if it's comprised of multiple lines_,

  This is an !action! it should be a button,

  \`\`\`
beep, boop, this is code
  \`\`\`

  This is an asterisk\\*
`;

class App extends React.Component {
  parseMarkdown(source) {
    let currentTag = "";
    let currentContent = "";

    const parsedMarkdown = [];

    // We create this variable to track possible escape characters, eg. "\"
    let before = "";

    const pushContent = (
      content,
      tagValue,
      props,
    ) => {
      let children = undefined;

      // There's the need to parse for empty lines
      if (content.indexOf("\n\n") >= 0) {
        let before = "";
        const contentJSX = [];

        let chunk = "";
        for (let i = 0; i < content.length; i++) {
          if (i !== 0) before = content[i - 1];

          chunk += content[i];

          if (before === "\n" && content[i] === "\n") {
            contentJSX.push(chunk);
            contentJSX.push(<br />);
            chunk = "";
          }

          if (chunk !== "" && i === content.length - 1) {
            contentJSX.push(chunk);
          }
        }

        children = contentJSX;
      } else {
        children = [content];
      }
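      // Spread children as positional arguments and give each top-level
      // element a key, so React does not warn about missing keys.
      parsedMarkdown.push(
        React.createElement(tagValue, { key: parsedMarkdown.length, ...props }, ...children),
      );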
    };

    for (let i = 0; i < source.length; i++) {
      const chunk = source[i];
      if (i !== 0) {
        before = source[i - 1];
      }

      // Does our current chunk need to be treated as an escaped char?
      const escaped = before === "\\";

      // Detect if we need to start/finish parsing our tags

      // We are not parsing anything, however, that could change at current
      // chunk
      if (currentTag === "" && escaped === false) {
        // If our tags array has the chunk, this means a markdown tag has
        // just been found. We'll change our current state to reflect this.
        if (tags.has(chunk)) {
          currentTag = tags.get(chunk);

          // We have simple content to push
          if (currentContent !== "") {
            pushContent(currentContent, "span");
          }

          currentContent = "";
        }
      } else if (currentTag !== "" && escaped === false) {
        // We'll look if we can finish parsing our tag
        if (tags.has(chunk)) {
          const symbolValue = tags.get(chunk);

          // Just because the current chunk is a symbol it doesn't mean we
          // can already finish our currentTag.
          //
          // We'll need to see if the symbol's value corresponds to the
          // value of our currentTag. In case it does, we'll finish parsing it.
          if (symbolValue === currentTag) {
            pushContent(
              currentContent,
              currentTag,
              undefined, // you could pass props here
            );

            currentTag = "";
            currentContent = "";
          }
        }
      }

      // Increment our currentContent
      //
      // Ideally, we don't want our rendered markdown to contain any '\'
      // or undesired '*' or '_' or '!'.
      //
      // Users can still escape '*', '_', '!' by prefixing them with '\'
      if (tags.has(chunk) === false || escaped) {
        if (chunk !== "\\" || escaped) {
          currentContent += chunk;
        }
      }

      // In case an erroneous (i.e. unfinished) tag is present and we've
      // reached the end of our source (rawMarkdown), we want to make sure
      // all our currentContent is pushed as a simple string
      if (currentContent !== "" && i === source.length - 1) {
        pushContent(
          currentContent,
          "span",
          undefined,
        );
      }
    }

    return parsedMarkdown;
  }

  render() {
    return (
      <div className="App">
        <div>{this.parseMarkdown(this.props.rawMarkdown)}</div>
      </div>
    );
  }
}

ReactDOM.render(<App rawMarkdown={rawMarkdown.replace(/```/g, "\uFFFF")} />, document.getElementById('app'));

Code link (TypeScript): https://codepen.io/ludanin/pen/GRgNWPv

Code link (vanilla/babel): https://codepen.io/ludanin/pen/eYmBvXw

Archived:	5 years, 10 months ago
Views:	3,498
Last recorded:	5 years, 10 months ago