How do I merge Word documents with different headers using Open XML?

Bur*_*ort 0 ms-word openxml openxml-sdk

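I am trying to merge multiple documents into a single one by following the example posted in another post. I am using AltChunk altChunk = new AltChunk(). When the documents are merged, the result does not retain the separate headers of each document. The merged document contains only the headers of the first document that was merged. If the first document has no headers, the rest of the merged document has no headers either, and vice versa.

My question is: how can I preserve the different headers of the documents being merged?

Merge multiple Word documents into one Open Xml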
    using System;
    using System.IO;
    using System.Linq;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    namespace WordMergeProject
    {
        public class Program
        {
            private static void Main(string[] args)
            {
                byte[] word1 = File.ReadAllBytes(@"..\..\word1.docx");
                byte[] word2 = File.ReadAllBytes(@"..\..\word2.docx");

                byte[] result = Merge(word1, word2);

                File.WriteAllBytes(@"..\..\word3.docx", result);
            }

            private static byte[] Merge(byte[] dest, byte[] src)
            {
                string altChunkId = "AltChunkId" + DateTime.Now.Ticks.ToString();

                // Copy the destination into an expandable stream so it can be edited in place.
                var memoryStreamDest = new MemoryStream();
                memoryStreamDest.Write(dest, 0, dest.Length);
                memoryStreamDest.Seek(0, SeekOrigin.Begin);
                var memoryStreamSrc = new MemoryStream(src);

                using (WordprocessingDocument doc = WordprocessingDocument.Open(memoryStreamDest, true))
                {
                    MainDocumentPart mainPart = doc.MainDocumentPart;
                    AlternativeFormatImportPart altPart =
                        mainPart.AddAlternativeFormatImportPart(AlternativeFormatImportPartType.WordprocessingML, altChunkId);
                    altPart.FeedData(memoryStreamSrc);
                    var altChunk = new AltChunk();
                    altChunk.Id = altChunkId;

                    // Append after the last chunk if there is one, otherwise after the last paragraph.
                    OpenXmlElement lastElem = mainPart.Document.Body.Elements<AltChunk>().LastOrDefault();
                    if (lastElem == null)
                    {
                        lastElem = mainPart.Document.Body.Elements<Paragraph>().Last();
                    }

                    // Insert a page break
                    Paragraph pageBreakP = new Paragraph();
                    Run pageBreakR = new Run();
                    Break pageBreakBr = new Break() { Type = BreakValues.Page };

                    pageBreakP.Append(pageBreakR);
                    pageBreakR.Append(pageBreakBr);

                    // Place the page break, then the chunk, and persist the changes.
                    lastElem.InsertAfterSelf(pageBreakP);
                    pageBreakP.InsertAfterSelf(altChunk);
                    mainPart.Document.Save();
                }

                return memoryStreamDest.ToArray();
            }
        }
    }

Cin*_*ter 5

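I ran into this problem a few years ago and spent quite some time on it; I eventually wrote a blog article that links to a sample file. Integrating files that have headers and footers using AltChunk is not trivial. I'll try to cover the essentials here. Depending on what kinds of content the headers and footers contain (and assuming Microsoft hasn't addressed any of the problems I originally ran into), it may not be possible to rely solely on AltChunk.

(Note also that there may be tools/APIs that can handle this. I don't know of any, and asking for one on this site would be off-topic.)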

Background

Before attacking the problem, it helps to understand how Word handles different headers and footers. To get a feel for it, start Word...

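Section breaks / Unlinking headers/footers

  • Type some text on the page and insert a header
  • Move the focus to the end of the page and go to the Page Layout tab in the Ribbon
  • Page Setup / Breaks / Next Page section break
  • Go into the header area for this page and note the information in the blue "tags": you'll see a section identifier on the left and "Same as Previous" on the right. "Same as Previous" is the default; to create a different header, click the "Link to Previous" button in the header tools to turn the link off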

So, the rule is:

A section break with unlinked headers (and/or footers) is required in order to have different header/footer content within a document.
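In the file format, the same rule shows up in each section's w:sectPr element: a section that carries no w:headerReference of a given type inherits that header from the previous section, so "unlinking" amounts to giving the section its own reference. The following is a minimal sketch of that idea, assuming the Open XML SDK is referenced; the class and method names (HeaderSketch, GiveSectionOwnHeader) and the header text are hypothetical and not part of the original answer.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class HeaderSketch
{
    // Hypothetical helper: gives one section its own default header.
    // A sectPr without a headerReference of a type inherits that header
    // from the previous section ("Same as Previous" in the Word UI).
    public static void GiveSectionOwnHeader(
        MainDocumentPart mainPart, SectionProperties sectPr, string text)
    {
        // Create a header part that holds a single paragraph of text.
        HeaderPart headerPart = mainPart.AddNewPart<HeaderPart>();
        headerPart.Header = new Header(new Paragraph(new Run(new Text(text))));
        headerPart.Header.Save();

        // Remove any existing default-header reference on this section.
        foreach (HeaderReference oldRef in sectPr.Elements<HeaderReference>()
                     .Where(r => r.Type != null && r.Type.Value == HeaderFooterValues.Default)
                     .ToList())
        {
            oldRef.Remove();
        }

        // Header/footer references are the first children of sectPr.
        sectPr.PrependChild(new HeaderReference
        {
            Type = HeaderFooterValues.Default,
            Id = mainPart.GetIdOfPart(headerPart)
        });
    }
}
```

The helper only covers the default header; first-page and even-page headers, and footers, would need the same treatment with their own reference types.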

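Master/Sub-documents

Word has an (in)famous feature called "Master Document" that links outside ("sub") documents into a "master" document. Doing so automatically adds the necessary section breaks and unlinks the headers/footers so that the originals are retained.

  • Go to Word's Outline view
  • Click "Show Document"
  • Use "Insert" to insert other files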

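Notice that two section breaks are inserted, one of type "Next Page" and the other "Continuous". The first is inserted in the incoming file; the second in the "master" file.

Two section breaks are necessary when inserting a file because the last paragraph mark (which contains the section break for the end of the document) is not carried over to the target document. The section break in the target document carries the information needed to unlink the incoming header from the headers already in the target document.

When the master document is saved, closed and re-opened, the sub-documents are in a "collapsed" state (file names shown as hyperlinks instead of the content). They can be expanded by going back to Outline view and clicking the "Expand" button. To fully incorporate a sub-document into the document, click the icon at the top left next to the sub-document, then click "Unlink".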

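Merging Word Open XML files

This, then, is the kind of environment the Open XML SDK needs to create when merging files whose headers and footers must be retained. In theory, either approach should work. In practice, I had problems when using only section breaks; I've never tested using the Master Document feature in Word Open XML.

Inserting section breaks

Here's the basic code for inserting a section break and unlinking headers before bringing in a file using AltChunk. Looking at my old posts and articles, it works as long as no complex page numbering is involved: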
    // Namespaces used by this handler: System, System.Collections.Generic, System.IO, System.Linq,
    // DocumentFormat.OpenXml.Packaging, DocumentFormat.OpenXml.Wordprocessing.
    private void btnMergeWordDocs_Click(object sender, EventArgs e)
    {
        string sourceFolder = @"C:\Test\MergeDocs\";
        string targetFolder = @"C:\Test\";

        string altChunkIdBase = "acID";
        int altChunkCounter = 1;
        string altChunkId = altChunkIdBase + altChunkCounter.ToString();

        MainDocumentPart wdDocTargetMainPart = null;
        Document docTarget = null;
        AlternativeFormatImportPartType afType;
        AlternativeFormatImportPart chunk = null;
        AltChunk ac = null;
        using (WordprocessingDocument wdPkgTarget = WordprocessingDocument.Create(targetFolder + "mergedDoc.docx", DocumentFormat.OpenXml.WordprocessingDocumentType.Document, true))
        {
            //Will create document in 2007 Compatibility Mode.
            //In order to make it 2010 a Settings part must be created and a CompatMode element for the Office version set.
            wdDocTargetMainPart = wdPkgTarget.MainDocumentPart;
            if (wdDocTargetMainPart == null)
            {
                wdDocTargetMainPart = wdPkgTarget.AddMainDocumentPart();
                Document wdDoc = new Document(
                    new Body(
                        new Paragraph(
                            new Run(new Text() { Text = "First Para" })),
                        new Paragraph(new Run(new Text() { Text = "Second para" })),
                        new SectionProperties(
                            new SectionType() { Val = SectionMarkValues.NextPage },
                            new PageSize() { Code = 9 },
                            new PageMargin() { Gutter = 0, Bottom = 1134, Top = 1134, Left = 1318, Right = 1318, Footer = 709, Header = 709 },
                            new Columns() { Space = "708" },
                            new TitlePage())));
                wdDocTargetMainPart.Document = wdDoc;
            }
            docTarget = wdDocTargetMainPart.Document;
            SectionProperties secPropLast = docTarget.Body.Descendants<SectionProperties>().Last();
            SectionProperties secPropNew = (SectionProperties)secPropLast.CloneNode(true);
            //A section break must be in a ParagraphProperty
            Paragraph lastParaTarget = docTarget.Body.Descendants<Paragraph>().Last();
            ParagraphProperties paraPropTarget = lastParaTarget.ParagraphProperties;
            if (paraPropTarget == null)
            {
                //Paragraph properties must be the first child of the paragraph.
                paraPropTarget = new ParagraphProperties();
                lastParaTarget.PrependChild(paraPropTarget);
            }
            paraPropTarget.Append(secPropNew);

            //Process the individual files in the source folder.
            //Note that this process will permanently change the files by adding a section break.
            DirectoryInfo di = new DirectoryInfo(sourceFolder);
            IEnumerable<FileInfo> docFiles = di.EnumerateFiles();
            foreach (FileInfo fi in docFiles)
            {
                using (WordprocessingDocument pkgSourceDoc = WordprocessingDocument.Open(fi.FullName, true))
                {
                    IEnumerable<HeaderPart> partsHeader = pkgSourceDoc.MainDocumentPart.GetPartsOfType<HeaderPart>();
                    IEnumerable<FooterPart> partsFooter = pkgSourceDoc.MainDocumentPart.GetPartsOfType<FooterPart>();
                    //If the source document has headers or footers we want to retain them.
                    //This requires inserting a section break at the end of the document.
                    if (partsHeader.Any() || partsFooter.Any())
                    {
                        Body sourceBody = pkgSourceDoc.MainDocumentPart.Document.Body;
                        SectionProperties docSectionBreak = sourceBody.Descendants<SectionProperties>().Last();
                        //Make a copy of the document section break as this won't be imported into the target document.
                        //It needs to be appended to the last paragraph of the document
                        SectionProperties copySectionBreak = (SectionProperties)docSectionBreak.CloneNode(true);
                        Paragraph lastpara = sourceBody.Descendants<Paragraph>().Last();
                        ParagraphProperties paraProps = lastpara.ParagraphProperties;
                        if (paraProps == null)
                        {
                            paraProps = new ParagraphProperties();
                            lastpara.PrependChild(paraProps);
                        }
                        paraProps.Append(copySectionBreak);
                    }
                    pkgSourceDoc.MainDocumentPart.Document.Save();
                }
                //Insert the source file into the target file using AltChunk
                afType = AlternativeFormatImportPartType.WordprocessingML;
                chunk = wdDocTargetMainPart.AddAlternativeFormatImportPart(afType, altChunkId);
                using (FileStream fsSourceDocument = new FileStream(fi.FullName, FileMode.Open))
                {
                    chunk.FeedData(fsSourceDocument);
                }
                //Create the chunk
                ac = new AltChunk();
                //Link it to the part
                ac.Id = altChunkId;
                //Insert after the previous chunk (or after the last paragraph for the first file)
                //so that the files keep their order in the merged document.
                DocumentFormat.OpenXml.OpenXmlElement insertAfter =
                    (DocumentFormat.OpenXml.OpenXmlElement)docTarget.Body.Elements<AltChunk>().LastOrDefault()
                    ?? docTarget.Body.Elements<Paragraph>().Last();
                docTarget.Body.InsertAfter(ac, insertAfter);
                docTarget.Save();
                altChunkCounter += 1;
                altChunkId = altChunkIdBase + altChunkCounter.ToString();
                chunk = null;
                ac = null;
            }
        }
    }

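If there is complex page numbering (quoting from my blog article):

Unfortunately, there's a bug in the Word application when integrating Word document "chunks" into the main document. The process has the nasty habit of not retaining a number of SectionProperties, among them the one that sets whether a section has a Different First Page and the one that restarts page numbering in a section. As long as your documents don't need to manage these kinds of headers and footers, you can probably use the "altChunk" approach.

But if you do need to handle complex headers and footers, the only method currently available is to copy in each document in its entirety, part by part. This is a non-trivial undertaking, as there are numerous possible types of parts that can be associated not only with the main document body, but also with each header and footer part.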
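One way to check whether a merged file was affected is to list, for each section, whether the Different First Page setting (w:titlePg) and a page-number restart (the start attribute of w:pgNumType) are present. The sketch below is a rough diagnostic under that assumption, not part of the quoted article; the names SectionReport and Describe are hypothetical. Since Word converts altChunk content into regular sections only when it opens and re-saves the file, it is meant to be run on a copy that Word has already saved.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class SectionReport
{
    // Hypothetical diagnostic: prints the header-related settings of every section.
    public static void Describe(string path)
    {
        using (WordprocessingDocument doc = WordprocessingDocument.Open(path, false))
        {
            Body body = doc.MainDocumentPart.Document.Body;
            int index = 1;

            // Section breaks live in paragraph properties; the final section's
            // properties are a direct child of the body. Descendants finds both.
            foreach (SectionProperties sectPr in body.Descendants<SectionProperties>())
            {
                TitlePage titlePage = sectPr.Elements<TitlePage>().FirstOrDefault();
                bool differentFirstPage =
                    titlePage != null && (titlePage.Val == null || titlePage.Val.Value);

                PageNumberType pageNumbers = sectPr.Elements<PageNumberType>().FirstOrDefault();
                bool restartsNumbering = pageNumbers != null && pageNumbers.Start != null;

                int headerRefs = sectPr.Elements<HeaderReference>().Count();

                Console.WriteLine(
                    $"Section {index}: headerRefs={headerRefs}, " +
                    $"differentFirstPage={differentFirstPage}, restartsNumbering={restartsNumbering}");
                index++;
            }
        }
    }
}
```

Comparing this report for each source file against the report for the merged file shows which sections lost their settings during the import.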

...or try the Master/Sub-document approach.

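Master/Sub-documents

This approach will certainly retain all the information. The result opens as a master document, however, and the Word API (either the user or automation code) is needed to "unlink" the sub-documents and turn it into a single, integrated document.

Opening a master document file in the Open XML SDK Productivity Tool shows that inserting sub-documents into the master document is a fairly straightforward procedure:

The underlying Word Open XML for a document with one sub-document: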
    <w:body xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
      <w:p>
        <w:pPr>
          <w:pStyle w:val="Heading1" />
        </w:pPr>
        <w:subDoc r:id="rId6" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" />
      </w:p>
      <w:sectPr>
        <w:headerReference w:type="default" r:id="rId7" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" />
        <w:type w:val="continuous" />
        <w:pgSz w:w="11906" w:h="16838" />
        <w:pgMar w:top="1417" w:right="1417" w:bottom="1134" w:left="1417" w:header="708" w:footer="708" w:gutter="0" />
        <w:cols w:space="708" />
        <w:docGrid w:linePitch="360" />
      </w:sectPr>
    </w:body>

And the code:

    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Wordprocessing;

    public class GeneratedClass
    {
        // Creates a Body instance and adds its children.
        public Body GenerateBody()
        {
            Body body1 = new Body();

            Paragraph paragraph1 = new Paragraph();

            ParagraphProperties paragraphProperties1 = new ParagraphProperties();
            ParagraphStyleId paragraphStyleId1 = new ParagraphStyleId(){ Val = "Heading1" };

            paragraphProperties1.Append(paragraphStyleId1);
            SubDocumentReference subDocumentReference1 = new SubDocumentReference(){ Id = "rId6" };

            paragraph1.Append(paragraphProperties1);
            paragraph1.Append(subDocumentReference1);

            SectionProperties sectionProperties1 = new SectionProperties();
            HeaderReference headerReference1 = new HeaderReference(){ Type = HeaderFooterValues.Default, Id = "rId7" };
            SectionType sectionType1 = new SectionType(){ Val = SectionMarkValues.Continuous };
            PageSize pageSize1 = new PageSize(){ Width = (UInt32Value)11906U, Height = (UInt32Value)16838U };
            PageMargin pageMargin1 = new PageMargin(){ Top = 1417, Right = (UInt32Value)1417U, Bottom = 1134, Left = (UInt32Value)1417U, Header = (UInt32Value)708U, Footer = (UInt32Value)708U, Gutter = (UInt32Value)0U };
            Columns columns1 = new Columns(){ Space = "708" };
            DocGrid docGrid1 = new DocGrid(){ LinePitch = 360 };

            sectionProperties1.Append(headerReference1);
            sectionProperties1.Append(sectionType1);
            sectionProperties1.Append(pageSize1);
            sectionProperties1.Append(pageMargin1);
            sectionProperties1.Append(columns1);
            sectionProperties1.Append(docGrid1);

            body1.Append(paragraph1);
            body1.Append(sectionProperties1);
            return body1;
        }
    }
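The generated code refers to the sub-document through the relationship id "rId6" but does not show where that relationship comes from. A possible way to create it is sketched below, under two assumptions that should be checked against a master document saved by Word: that the sub-document is stored as an external relationship of the main document part, and that its relationship type is the subDocument URI shown in the code. The names SubDocSketch and AppendSubDocument are hypothetical, and, like the approach itself, this has not been tested against Word.

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class SubDocSketch
{
    // Assumed relationship type for sub-documents; verify against a
    // master document produced by Word before relying on it.
    private const string SubDocRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/subDocument";

    // Hypothetical helper: links an external file as a sub-document.
    public static void AppendSubDocument(MainDocumentPart mainPart, string subDocPath)
    {
        // Register the external file and let the SDK assign the relationship id.
        ExternalRelationship rel = mainPart.AddExternalRelationship(
            SubDocRelType, new Uri(subDocPath, UriKind.RelativeOrAbsolute));

        // The w:subDoc element sits inside a paragraph and points at that id.
        Paragraph holder = new Paragraph(new SubDocumentReference { Id = rel.Id });

        // Keep the body-level section properties as the last child of the body.
        Body body = mainPart.Document.Body;
        SectionProperties finalSectPr = body.Elements<SectionProperties>().LastOrDefault();
        if (finalSectPr != null)
        {
            body.InsertBefore(holder, finalSectPr);
        }
        else
        {
            body.Append(holder);
        }

        mainPart.Document.Save();
    }
}
```

The sketch leaves out the two section breaks described earlier; those would still need to be added around each sub-document for the headers to stay separate.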