Cir*_*e B 5 php latex preg-replace cpu-word adobe-indesign
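[Update]
Here is my task: convert a bunch of custom-built LaTeX files to InDesign. My current approach is to run the .tex files through a PHP script that changes the custom LaTeX code into more generic TeX code, then use TeX2Word to convert them to .doc files, and then place those into InDesign.
What I want preg_replace to do is convert some of the TeX tags so that TeX2Word leaves them untouched. I can then run a script in InDesign that changes the HTML-like tags into InDesign text frames, footnotes, variables, and so on.
[/Update]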
I have some text with LaTeX markup:
$newphrase = "\blockquote{\hspace*{.5em}Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Integer posuere erat a ante venenatis dapibus posuere
velit aliquet. Aenean lacinia bibendum nulla sed consectetur. Aenean
eu leo quam. Pellentesque ornare sem lacinia quam venenatis
vestibulum. Sed posuere consectetur est at lobortis. \note{Integer
posuere erat a ante venenatis dapibus posuere velit aliquet.
\textit{Vivamus} sagittis lacus vel augue laoreet rutrum faucibus
dolor auctor.}}";
What I want to do is remove \blockquote{...} and replace it with <div>...</div>.
I have tried many different versions of this:
$regex = "#(blockquote){(.*)(})#";
$replace = "<div>$2</div>";
$newphrase = preg_replace($regex,$replace,$newphrase);
Here is the output:
\<div>\hspace*{.5em</div>Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Integer posuere erat a ante venenatis dapibus posuere
velit aliquet. Aenean lacinia bibendum nulla sed consectetur. Aenean
eu leo quam. Pellentesque ornare sem lacinia quam venenatis
vestibulum. Sed posuere consectetur est at lobortis. \note{Integer
posuere erat a ante venenatis dapibus posuere velit aliquet.
\textit{Vivamus} sagittis lacus vel augue laoreet rutrum faucibus
dolor auctor.}}";
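The first problem is that it replaces everything from \blockquote{ up to the first }, when I want it to skip the next } if there is a { after the initial \blockquote{.
The next problem is the \: I can't seem to escape it! I've tried \\, /\\/, \\\, /\\\/, [\], [\\]. Nothing works! I'm sure it's because I don't understand how escaping really works.
So in the end, this is the result I want: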
<div>\hspace*{.5em}Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Integer posuere erat a ante venenatis dapibus posuere
velit aliquet. Aenean lacinia bibendum nulla sed consectetur. Aenean
eu leo quam. Pellentesque ornare sem lacinia quam venenatis
vestibulum. Sed posuere consectetur est at lobortis. \note{Integer
posuere erat a ante venenatis dapibus posuere velit aliquet.
\textit{Vivamus} sagittis lacus vel augue laoreet rutrum faucibus
dolor auctor.}</div>";
I intend to turn $regex and $replace into arrays, so that I can also replace things like \textit{Vivamus} with <em>Vivamus</em>.
Any guidance would be welcome and appreciated!
If you still want to do the conversion yourself, you can do it with multiple passes over the string, replacing the inner elements first:
$t = '\blockquote{\hspace*{.5em}Lorem ipsum dolor sit amet, consectetur
adipiscing elit. Integer posuere erat a ante venenatis dapibus posuere
velit aliquet. Aenean lacinia bibendum nulla sed consectetur. Aenean
eu leo quam. Pellentesque ornare sem lacinia quam venenatis
vestibulum. Sed posuere consectetur est at lobortis. \note{Integer
posuere erat a ante venenatis dapibus posuere velit aliquet.
\textit{Vivamus} sagittis lacus vel augue laoreet rutrum faucibus
dolor auctor.}}';
// One callback per TeX command; $m[1] is the text between the braces.
function hspace($m) { return "<br />"; }
function textit($m) { return "<i>" . $m[1] . "</i>"; }
function note($m) { return "<b>" . $m[1] . "</b>"; }
function blockquote($m) { return "<quote>" . $m[1] . "</quote>"; }
// Repeat until a full pass changes nothing. [^{}] only matches arguments
// without braces, so the innermost commands are replaced first.
while (true) {
    $newt = $t;
    $newt = preg_replace_callback("/\\\\hspace\\*\\{([^{}]*?)\\}/", "hspace", $newt);
    $newt = preg_replace_callback("/\\\\textit\\{([^{}]*?)\\}/", "textit", $newt);
    $newt = preg_replace_callback("/\\\\note\\{([^{}]*?)\\}/", "note", $newt);
    $newt = preg_replace_callback("/\\\\blockquote\\{([^{}]*?)\\}/", "blockquote", $newt);
    if ($newt === $t) break;
    $t = $newt;
}
echo $t;
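The four backslashes in the patterns above are there because the backslash is unescaped twice: PHP's string parser turns `\\\\` into `\\`, and the regex engine then reads `\\` as one literal backslash. A minimal sketch of that, using single-quoted strings so that `\t` in `\textit` is not read as a tab (the variable names are only illustrative):

```php
<?php
// PHP source '\\\\'  ->  string \\  ->  PCRE matches one literal backslash.
$pattern = '/\\\\textit\{([^{}]*)\}/';   // regex PCRE sees: /\\textit\{([^{}]*)\}/

// Single quotes keep \t as backslash + t; double quotes would make it a tab.
$input = 'A \textit{Vivamus} example';

$output = preg_replace($pattern, '<em>$1</em>', $input);
echo $output, "\n"; // A <em>Vivamus</em> example

// The same command name in both quote styles, to show the difference.
echo strlen('\textit'), " ", strlen("\textit"), "\n"; // 7 6
```

In double-quoted test data, `\note` and `\textit` start with a newline and a tab, so the input in the question may never have contained the text the pattern was looking for.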
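Of course, this may work for simple examples, but you cannot use this approach to correctly parse the full TeX format. It also becomes very inefficient for longer inputs.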
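As an alternative to repeated passes, PCRE's recursive subpatterns can match a brace-balanced argument in one go, which gives the `<div>...</div>` result asked for in the question without touching the inner commands. This is only a sketch under assumptions: `replaceTag` and the `$map` array are hypothetical names, escaped braces such as `\{` are not handled, and a command nested inside an argument of the same command is not converted.

```php
<?php
// Hypothetical helper: replaces \tag{...} with $open ... $close, where the
// argument may contain nested {...} groups to any depth.
function replaceTag(string $text, string $tag, string $open, string $close): string
{
    // Group 1 is the argument body: runs of non-brace characters, or a
    // nested {...} whose contents are matched by group 1 again via (?1).
    $pattern = '/\\\\' . preg_quote($tag, '/') . '\{((?:[^{}]++|\{(?1)\})*)\}/';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($open, $close): string {
            return $open . $m[1] . $close;
        },
        $text
    );
}

$tex = '\blockquote{\hspace*{.5em}Lorem \note{ipsum \textit{dolor} sit} amet.}';

// Assumed mapping, following the examples given in the question.
$map = [
    'blockquote' => ['<div>', '</div>'],
    'textit'     => ['<em>', '</em>'],
];

foreach ($map as $tag => [$open, $close]) {
    $tex = replaceTag($tex, $tag, $open, $close);
}

echo $tex, "\n";
// <div>\hspace*{.5em}Lorem \note{ipsum <em>dolor</em> sit} amet.</div>
```

Each command is handled in a single pass over the string, and commands missing from the map (here `\hspace*` and `\note`) are left as they are for a later step.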