Pav*_*n R 1 openxml tableofcontents page-numbering openxml-sdk
For a Paragraph object, how can I determine which page it is on using Open XML SDK 2.5?
I've retrieved all the child elements of the document and used this to get the inner text:
foreach (var i in mainPart.Document.ChildElements.FirstOrDefault().ChildElements)
{
    ParagraphElements.Add(i); // list of OpenXmlElement
}
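I want to get the page number of the corresponding paragraph. For example, I mark "This is Heading 1" with the Heading 1 style, and it is then picked up in the table of contents. So I need to pass in the page number.
Thanks in advance.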
Pages do not exist in the OpenXML format until they are rendered by a word processor.
The metadata needed to calculate which page a given paragraph should appear on is available, but the calculation is far from straightforward.
To verify that page numbers do not exist in the raw OpenXML markup, unzip the .docx file (it is a ZIP package) and open word/document.xml.
This file contains the XML content behind your mainPart.Document call. The "document.xml" file has a single root node, <document>...</document>, which in turn has a single child node, <body>...</body>, which holds the content you're interested in.
When working with OpenXML documents, I find that the abstractions in the OpenXML SDK can sometimes be distracting. Thankfully, it's simple to explore the raw markup with LINQ-to-XML. For example, your call to:
var childrenFromOpenXmlSdk = mainPart.Document.ChildElements.Single().ChildElements;
is equivalent to the following in LINQ-to-XML:
IEnumerable<XElement> childrenFromLinqToXml =
    XElement.Load("[path]/[file]/word/document.xml")
            .Elements()
            .Single()
            .Elements();
If you inspect the elements in childrenFromLinqToXml, you'll find no page number information.
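As a rough sketch of the same inspection without unzipping by hand, the body children can be read straight out of the package with System.IO.Compression and LINQ-to-XML. The names DocxMarkup and LoadBodyChildren are made up for illustration, and the sketch assumes the main part sits at the conventional word/document.xml path (strictly speaking, its location is defined by the package relationships).

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the Open XML SDK.
public static class DocxMarkup
{
    // Main WordprocessingML namespace (the "w:" prefix in the markup).
    public static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Reads word/document.xml from a .docx stream and returns the children of <w:body>.
    public static List<XElement> LoadBodyChildren(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            // Assumption: the main document part lives at the conventional path.
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found in the package.");

            using (Stream part = entry.Open())
            {
                XElement document = XElement.Load(part);      // <w:document>
                XElement body = document.Element(W + "body"); // <w:body>
                if (body == null)
                    throw new InvalidDataException("<w:body> not found in word/document.xml.");

                return body.Elements().ToList();
            }
        }
    }
}
```

Calling DocxMarkup.LoadBodyChildren(File.OpenRead(path)) and printing each element's Name.LocalName gives a quick overview of the body: paragraphs, tables, content controls and section properties.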
You may see cached page numbers in the raw markup of the TOC itself, but these will be artifacts of the previous rendering, defined by content tags or form fields.
If you need to build up the TOC programmatically, have a look at the following sites:
OfficeOpenXML.com's reference article for TOCs
Eric White's screencast "Exploring Tables-of-Contents in Open XML WordprocessingML Documents"
ericwhite.com/blog is well worth a look when you find yourself at the intersection of XML markup and on-screen rendering.
--- Following up on Sai's comments ---
Hi Austin Drenski, I've created the TOC and added all the headings programmatically. All I need is the page numbers. Is there any alternative way to get the page number of a particular paragraph? I've gone through all the screencasts, but I'm looking for the page number alone.
<w:r>
  <w:fldChar w:fldCharType="begin" />
</w:r>
<w:r>
  <w:instrText xml:space="preserve"> PAGEREF _Toc481680509 \h </w:instrText>
</w:r>
<w:r>
  <w:fldChar w:fldCharType="separate" />
</w:r>
<w:r>
  <w:t>2</w:t>
</w:r>
<w:r>
  <w:fldChar w:fldCharType="end" />
</w:r>
In that sample XML, "2" acts as the page number. It is hardcoded.
My TOC now works perfectly without page numbers. I also analysed the default MS Word behaviour: the first time, the page numbers are written literally, as above.
You can programmatically place a content control <w:sdt> in the document, as a child of the <w:body> element.
For a simple TOC with two entries:
<w:sdt>
  <w:sdtPr>
    <w:id w:val="429708664"/>
    <w:docPartObj>
      <w:docPartGallery w:val="Table of Contents"/>
      <w:docPartUnique/>
    </w:docPartObj>
  </w:sdtPr>
  <w:sdtContent>
    <w:p>
      <w:pPr>
        <w:pStyle w:val="TOCHeading"/>
      </w:pPr>
      <w:r>
        <w:t>Contents</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:pPr>
        <w:pStyle w:val="TOC1"/>
        <w:tabs>
          <w:tab w:val="right" w:leader="dot" w:pos="9350"/>
        </w:tabs>
      </w:pPr>
      <w:r>
        <w:fldChar w:fldCharType="begin"/>
      </w:r>
      <w:r>
        <w:instrText xml:space="preserve"> TOC \o "1-3" \h \z \u </w:instrText>
      </w:r>
      <w:r>
        <w:fldChar w:fldCharType="separate"/>
      </w:r>
      <w:hyperlink w:anchor="_Toc481654079" w:history="1">
        <w:r>
          <w:rPr>
            <w:rStyle w:val="Hyperlink"/>
          </w:rPr>
          <w:t>Testing 1</w:t>
        </w:r>
        <w:r>
          <w:tab/>
        </w:r>
        <w:r>
          <w:fldChar w:fldCharType="begin"/>
        </w:r>
        <w:r>
          <w:instrText xml:space="preserve"> PAGEREF _Toc481654079 \h </w:instrText>
        </w:r>
        <w:r>
        </w:r>
        <w:r>
          <w:fldChar w:fldCharType="separate"/>
        </w:r>
        <w:r>
          <w:t>0</w:t>
        </w:r>
        <w:r>
          <w:fldChar w:fldCharType="end"/>
        </w:r>
      </w:hyperlink>
    </w:p>
    <w:p>
      <w:pPr>
        <w:pStyle w:val="TOC1"/>
        <w:tabs>
          <w:tab w:val="right" w:leader="dot" w:pos="9350"/>
        </w:tabs>
      </w:pPr>
      <w:hyperlink w:anchor="_Toc481654080" w:history="1">
        <w:r>
          <w:rPr>
            <w:rStyle w:val="Hyperlink"/>
          </w:rPr>
          <w:t>Testing 2</w:t>
        </w:r>
        <w:r>
          <w:tab/>
        </w:r>
        <w:r>
          <w:fldChar w:fldCharType="begin"/>
        </w:r>
        <w:r>
          <w:instrText xml:space="preserve"> PAGEREF _Toc481654080 \h </w:instrText>
        </w:r>
        <w:r>
          <w:fldChar w:fldCharType="separate"/>
        </w:r>
        <w:r>
          <w:t>0</w:t>
        </w:r>
        <w:r>
          <w:fldChar w:fldCharType="end"/>
        </w:r>
      </w:hyperlink>
    </w:p>
    <w:p>
      <w:r>
        <w:fldChar w:fldCharType="end"/>
      </w:r>
    </w:p>
  </w:sdtContent>
</w:sdt>
Note the use of PAGEREF field codes pointing at bookmarks. Also note the subsequent markup <w:t>0</w:t>. When the document is opened and the field codes are updated, this zero will be replaced by the page number on which the bookmark is currently rendered.
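For illustration, here is a minimal LINQ-to-XML sketch of how one such TOC entry and its target bookmark might be generated. TocMarkup, BuildTocEntry and AddBookmark are hypothetical names; the style name, tab position and placeholder "0" are copied from the sample markup above. The sketch builds only the per-entry paragraph, not the enclosing TOC field or the content control.

```csharp
using System.Xml.Linq;

// Hypothetical helper for illustration; element names mirror the sample markup.
public static class TocMarkup
{
    public static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Wraps the given content in a <w:r> run.
    private static XElement Run(params object[] content)
    {
        return new XElement(W + "r", content);
    }

    // Builds a run holding a single <w:fldChar> of the given type.
    private static XElement FieldChar(string type)
    {
        return Run(new XElement(W + "fldChar", new XAttribute(W + "fldCharType", type)));
    }

    // Builds one TOC1 paragraph whose PAGEREF field points at bookmarkName.
    // The cached result is the placeholder "0" until a word processor updates the field.
    public static XElement BuildTocEntry(string title, string bookmarkName)
    {
        return new XElement(W + "p",
            new XElement(W + "pPr",
                new XElement(W + "pStyle", new XAttribute(W + "val", "TOC1")),
                new XElement(W + "tabs",
                    new XElement(W + "tab",
                        new XAttribute(W + "val", "right"),
                        new XAttribute(W + "leader", "dot"),
                        new XAttribute(W + "pos", "9350")))),
            new XElement(W + "hyperlink",
                new XAttribute(W + "anchor", bookmarkName),
                new XAttribute(W + "history", "1"),
                Run(new XElement(W + "rPr",
                        new XElement(W + "rStyle", new XAttribute(W + "val", "Hyperlink"))),
                    new XElement(W + "t", title)),
                Run(new XElement(W + "tab")),
                FieldChar("begin"),
                Run(new XElement(W + "instrText",
                    new XAttribute(XNamespace.Xml + "space", "preserve"),
                    " PAGEREF " + bookmarkName + " \\h ")),
                FieldChar("separate"),
                Run(new XElement(W + "t", "0")),
                FieldChar("end")));
    }

    // Surrounds the content of a heading paragraph with a bookmark the PAGEREF can target.
    public static void AddBookmark(XElement paragraph, int id, string name)
    {
        var start = new XElement(W + "bookmarkStart",
            new XAttribute(W + "id", id), new XAttribute(W + "name", name));
        var end = new XElement(W + "bookmarkEnd", new XAttribute(W + "id", id));

        XElement pPr = paragraph.Element(W + "pPr");
        if (pPr != null) pPr.AddAfterSelf(start); // keep paragraph properties first
        else paragraph.AddFirst(start);
        paragraph.Add(end);
    }
}
```

The bookmark id must be unique within the document, and the name passed to AddBookmark has to match the one used in BuildTocEntry, otherwise the field has nothing to resolve when it is updated.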
Each time the document is paginated, the exact placement of a bookmark could change.
Once the zeros are replaced with actual page numbers, you will see those numbers in the markup. However, they are simply the last rendered values for those field codes.
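If you only want to read those last rendered values back out of the markup, something like the following sketch could do it. CachedPageRefs and Read are hypothetical names. The sketch assumes complex fields written with w:fldChar runs, as in the sample above, and ignores w:fldSimple fields. Whatever it returns is only as fresh as the last time a word processor updated the fields.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper: collects the cached results of PAGEREF fields.
public static class CachedPageRefs
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // State for one open field: its instruction text and its cached result text.
    private sealed class Field
    {
        public readonly StringBuilder Instruction = new StringBuilder();
        public readonly StringBuilder Result = new StringBuilder();
        public bool InResult; // true once the "separate" fldChar has been seen
    }

    // Returns bookmark name -> cached page text for every PAGEREF field under root.
    public static Dictionary<string, string> Read(XElement root)
    {
        var cached = new Dictionary<string, string>();
        var open = new Stack<Field>(); // fields nest (PAGEREF inside TOC), hence a stack

        foreach (XElement e in root.Descendants()) // document order
        {
            if (e.Name == W + "fldChar")
            {
                string type = (string)e.Attribute(W + "fldCharType");
                if (type == "begin") open.Push(new Field());
                else if (open.Count == 0) continue; // unbalanced markup, skip
                else if (type == "separate") open.Peek().InResult = true;
                else if (type == "end") Record(open.Pop(), cached);
            }
            else if (open.Count > 0 && e.Name == W + "instrText" && !open.Peek().InResult)
            {
                open.Peek().Instruction.Append(e.Value);
            }
            else if (open.Count > 0 && e.Name == W + "t" && open.Peek().InResult)
            {
                open.Peek().Result.Append(e.Value);
            }
        }
        return cached;
    }

    // Stores the result if the instruction has the form "PAGEREF <bookmark> [switches]".
    private static void Record(Field field, Dictionary<string, string> cached)
    {
        string[] tokens = field.Instruction.ToString()
            .Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        if (tokens.Length >= 2 && tokens[0] == "PAGEREF")
            cached[tokens[1]] = field.Result.ToString();
    }
}
```

Run against the TOC content control above, before any field update, this would map both bookmarks to "0".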
In the document settings, you can prompt the user to update field codes upon opening, so that the TOC numbers will accurately reflect the current on-screen rendering. To do so, your settings file should resemble:
<w:settings ...namespaces omitted...>
  <w:updateFields w:val="true"/>
  ...other settings omitted...
</w:settings>
In the end, you still need to render the OpenXML document with a word processor, but you avoid the complexity of calculating page positions.
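The settings change described above can be sketched with LINQ-to-XML as follows. FieldSettings and RequestFieldUpdateOnOpen are hypothetical names. The sketch assumes you have already loaded the root <w:settings> element of word/settings.xml. It inserts the new element as the first child, which is an assumption about what the consuming word processor tolerates; the schema defines a specific child order, so a strict validator may expect it elsewhere.

```csharp
using System.Xml.Linq;

// Hypothetical helper for illustration.
public static class FieldSettings
{
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Adds <w:updateFields w:val="true"/> to a <w:settings> element,
    // or switches an existing <w:updateFields> element to "true".
    public static void RequestFieldUpdateOnOpen(XElement settings)
    {
        XElement update = settings.Element(W + "updateFields");
        if (update == null)
        {
            update = new XElement(W + "updateFields");
            // Assumption: placing the element first is acceptable to the reader.
            settings.AddFirst(update);
        }
        update.SetAttributeValue(W + "val", "true");
    }
}
```

The method is idempotent, so calling it on a settings part that already requests the update leaves a single <w:updateFields> element in place.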