You can read the text file directly in XSLT 2.0 code with the standard XSLT 2.0 function unparsed-text().
Then use:
replace(concat(normalize-space($text),' '),
        '(.{0,60}) ',
        '$1&#xA;')
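Explanation:
This first normalizes the whitespace: it deletes leading and trailing sequences of whitespace-only characters and replaces every inner such sequence with a single space. concat() then appends one space, so that the last word also ends with a space and can be matched.
The normalized result is then used as the first argument of the standard XPath 2.0 function replace().
The match pattern is any (longest possible) sequence of at most 61 characters that ends with a space.
The replacement argument specifies that each such sequence is replaced by the string before its ending space, concatenated with an NL (newline) character.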
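As a small worked illustration of the steps just described, the sketch below applies the same expression to a short hard-coded string, with a width of 20 instead of 60 so that the breaks are easy to check by hand. The variable name vSample and the sample sentence are assumptions made for this illustration only; they are not part of the original solution.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical sample input; the irregular spacing is deliberate -->
  <xsl:variable name="vSample"
      select="'  The quick   brown fox jumps over the lazy dog  '"/>

  <xsl:template match="/">
    <!-- 1. normalize-space() trims the string and collapses runs of spaces
         2. concat() appends a space so the last word also ends with one
         3. replace() turns each run of at most 20 characters followed by
            a space into that run followed by a newline (&#xA;) -->
    <xsl:sequence select=
      "replace(concat(normalize-space($vSample), ' '),
               '(.{0,20}) ',
               '$1&#xA;')"/>
  </xsl:template>
</xsl:stylesheet>
```

Run against any XML document with an XSLT 2.0 processor, this should produce three lines: "The quick brown fox", "jumps over the lazy" and "dog". Each break falls at the last space that keeps the line within 20 characters. Note that the newline is written as the character reference &#xA;, because a literal line break inside an attribute value is normalized to a space by the XML parser.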
Here is a complete solution that reads this text from the file C:\temp\delete\text.txt and formats it:
Dec. 13 — As always for a presidential inaugural, security and surveillance were
extremely tight in Washington, DC, last January. But as George W. Bush prepared to
take the oath of office, security planners installed an extra layer of protection: a
prototype software system to detect a biological attack. The U.S. Department of
Defense, together with regional health and emergency-planning agencies, distributed
a special patient-query sheet to military clinics, civilian hospitals and even aid
stations along the parade route and at the inaugural balls. Software quickly
analyzed complaints of seven key symptoms — from rashes to sore throats — for
patterns that might indicate the early stages of a bio-attack. There was a brief
scare: the system noticed a surge in flulike symptoms at military clinics.
Thankfully, tests confirmed it was just that — the flu.
The XSLT code:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>

  <!-- Read the whole file as a single string -->
  <xsl:variable name="vText" select=
    "unparsed-text('file:///c:/temp/delete/text.txt')"/>

  <xsl:template match="/">
    <!-- &#xA; is the newline; a literal line break in an attribute
         value would be normalized to a space by the XML parser -->
    <xsl:sequence select=
      "replace(concat(normalize-space($vText),' '),
               '(.{0,60}) ',
               '$1&#xA;')
      "/>
  </xsl:template>
</xsl:stylesheet>
The result is a set of lines, none of which is longer than the fixed limit of 60 characters:
Dec. 13 — As always for a presidential inaugural, security
and surveillance were extremely tight in Washington, DC,
last January. But as George W. Bush prepared to take the
oath of office, security planners installed an extra layer
of protection: a prototype software system to detect a
biological attack. The U.S. Department of Defense, together
with regional health and emergency-planning agencies,
distributed a special patient-query sheet to military
clinics, civilian hospitals and even aid stations along the
parade route and at the inaugural balls. Software quickly
analyzed complaints of seven key symptoms — from rashes to
sore throats — for patterns that might indicate the early
stages of a bio-attack. There was a brief scare: the system
noticed a surge in flulike symptoms at military clinics.
Thankfully, tests confirmed it was just that — the flu.
Update:
If the text comes from an XML file, this can be done with a minimal change to the solution above:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- "text" selects the top element of the source document -->
    <xsl:sequence select=
      "replace(concat(normalize-space(text),' '),
               '(.{0,60}) ',
               '$1&#xA;')
      "/>
  </xsl:template>
</xsl:stylesheet>
Here I assume that all of the text is in the only text-node child of the top element (named text) of the XML document:
<text>
Dec. 13 — As always for a presidential inaugural, security and surveillance were
extremely tight in Washington, DC, last January. But as George W. Bush prepared to
take the oath of office, security planners installed an extra layer of protection: a
prototype software system to detect a biological attack. The U.S. Department of
Defense, together with regional health and emergency-planning agencies, distributed
a special patient-query sheet to military clinics, civilian hospitals and even aid
stations along the parade route and at the inaugural balls. Software quickly
analyzed complaints of seven key symptoms — from rashes to sore throats — for
patterns that might indicate the early stages of a bio-attack. There was a brief
scare: the system noticed a surge in flulike symptoms at military clinics.
Thankfully, tests confirmed it was just that — the flu.
</text>
When this transformation is applied to the XML document above, it produces the same result as the first solution.
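If the same wrapping is needed with different widths, one possible refinement is to move the expression into a stylesheet function that builds the pattern from a width argument. This is only a sketch: the function name my:wrap, its namespace URI urn:example:wrap and the parameter names pText and pWidth are hypothetical and do not appear in the original answer.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:wrap"
    exclude-result-prefixes="xs my">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: wraps pText into lines of at most pWidth
       characters, breaking only at spaces -->
  <xsl:function name="my:wrap" as="xs:string">
    <xsl:param name="pText" as="xs:string?"/>
    <xsl:param name="pWidth" as="xs:integer"/>
    <!-- The pattern is assembled at run time, e.g. '(.{0,60}) ' -->
    <xsl:sequence select=
      "replace(concat(normalize-space($pText), ' '),
               concat('(.{0,', $pWidth, '}) '),
               '$1&#xA;')"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Same input assumption as above: a top element named text -->
    <xsl:sequence select="my:wrap(string(text), 60)"/>
  </xsl:template>
</xsl:stylesheet>
```

Called with a width of 60, this evaluates the same expression as the stylesheet above. One limitation carries over unchanged: a single word longer than the width contains no space to break at, so the line holding it will exceed the limit.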