Replace text in a Word document using Open XML

Jul*_*ary 14 templates openxml c#-4.0

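I have created a .docx file from a Word template. Now I am opening the copied .docx file and want to replace certain text in it with other data.

I can't figure out how to access the text from the document's main part.

Any help would be appreciated.

Here is my code so far.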

private void CreateSampleWordDocument()
    {
        //string sourceFile = Path.Combine("D:\\GeneralLetter.dot");
        //string destinationFile = Path.Combine("D:\\New.doc");
        string sourceFile = Path.Combine("D:\\GeneralWelcomeLetter.docx");
        string destinationFile = Path.Combine("D:\\New.docx");
        try
        {
            // Create a copy of the template file and open the copy
            File.Copy(sourceFile, destinationFile, true);
            using (WordprocessingDocument document = WordprocessingDocument.Open(destinationFile, true))
            {
                // Change the document type to Document
                document.ChangeDocumentType(DocumentFormat.OpenXml.WordprocessingDocumentType.Document);
                //Get the Main Part of the document
                MainDocumentPart mainPart = document.MainDocumentPart;
                mainPart.Document.Save();
            }
        }
        catch
        {
        }
    }

Now, how do I find a certain piece of text and replace it? I can't follow links, so some code hints would be appreciated.

Flo*_*ing 17

Just to give you an idea of how to do it, try:

using (WordprocessingDocument doc =
           WordprocessingDocument.Open(@"yourpath\testdocument.docx", true))
{
    var body = doc.MainDocumentPart.Document.Body;
    var paras = body.Elements<Paragraph>();

    // Walk paragraph -> run -> text and replace inside each text element.
    foreach (var para in paras)
    {
        foreach (var run in para.Elements<Run>())
        {
            foreach (var text in run.Elements<Text>())
            {
                if (text.Text.Contains("text-to-replace"))
                {
                    text.Text = text.Text.Replace("text-to-replace", "replaced-text");
                }
            }
        }
    }
}

Note that the text match is case-sensitive. The replacement does not change the text formatting. Hope this helps.
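
If the match needs to ignore case, one option is to swap `string.Replace` for a regex-based replace. This is only a sketch: `CaseInsensitiveReplace` is a hypothetical helper name, not part of the Open XML SDK, and it works on plain strings.

```csharp
using System.Text.RegularExpressions;

public static class CaseInsensitiveReplace
{
    // Replaces every occurrence of `search` in `input`, ignoring case.
    // Regex.Escape treats `search` as literal text; the lambda keeps
    // `replacement` literal too (no "$1"-style substitution).
    public static string Replace(string input, string search, string replacement)
    {
        return Regex.Replace(
            input,
            Regex.Escape(search),
            _ => replacement,
            RegexOptions.IgnoreCase);
    }
}
```

Inside the loop it would be used as `text.Text = CaseInsensitiveReplace.Replace(text.Text, "text-to-replace", "replaced-text");`.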

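  • This only replaces text that lies within a single run. However, the text may be split across several runs, which must be concatenated first before replacing. (8 upvotes)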
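
A rough sketch of the "concatenate first" idea from the comment above, shown on plain strings so it stays self-contained. `RunJoiner` and `ReplaceAcrossSegments` are hypothetical names, and `segments` stands in for the `Text` values of the runs in one paragraph. It assumes you can live with the result landing in the first segment, so later runs lose their text and the replaced paragraph takes on the first run's formatting.

```csharp
using System.Collections.Generic;

public static class RunJoiner
{
    // Joins all segments, replaces `search` in the joined string, then writes
    // the result into the first segment and empties the rest.
    // Returns true if a replacement was made; otherwise segments are untouched.
    public static bool ReplaceAcrossSegments(
        IList<string> segments, string search, string replacement)
    {
        if (segments.Count == 0 || string.IsNullOrEmpty(search))
        {
            return false;
        }

        string joined = string.Concat(segments);
        if (!joined.Contains(search))
        {
            return false;
        }

        segments[0] = joined.Replace(search, replacement);
        for (int i = 1; i < segments.Count; i++)
        {
            segments[i] = string.Empty;
        }
        return true;
    }
}
```

With the SDK you would collect the `Text` elements of a paragraph, pass their values through something like this, and assign the results back to each element's `Text` property.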

ser*_*dat 9

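In addition to Flowerking's answer:

When your Word file contains text boxes, that approach will not work. A text box has a TextBoxContent element, so its text is never reached by the nested foreach loops.

But if you write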

using ( WordprocessingDocument doc =
                    WordprocessingDocument.Open(@"yourpath\testdocument.docx", true))
{
    var document = doc.MainDocumentPart.Document;

    foreach (var text in document.Descendants<Text>()) // <<< Here
    {
        if (text.Text.Contains("text-to-replace"))
        {
            text.Text = text.Text.Replace("text-to-replace", "replaced-text");
        }
    } 
}

it will loop over all the text in the document (whether or not it is inside a text box), so the text gets replaced.
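
To see why the two loops differ, here is an analogy using LINQ to XML rather than the Open XML SDK itself. The element names (`body`, `p`, `r`, `t`, `textbox`) are a hypothetical, heavily simplified stand-in for real WordprocessingML markup; the point is only that `Elements` looks at direct children while `Descendants` looks at every depth.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class ElementsVsDescendants
{
    // One paragraph with ordinary text, and one whose text sits inside a text box.
    public const string SampleXml =
        "<body>" +
          "<p><r><t>plain</t></r></p>" +
          "<p><r><textbox><p><r><t>boxed</t></r></p></textbox></r></p>" +
        "</body>";

    // Mirrors the nested foreach loops: body -> p -> r -> t, direct children only.
    public static string[] ViaElements(XElement body) =>
        body.Elements("p").Elements("r").Elements("t")
            .Select(t => t.Value).ToArray();

    // Mirrors document.Descendants<Text>(): every t element at any depth.
    public static string[] ViaDescendants(XElement body) =>
        body.Descendants("t").Select(t => t.Value).ToArray();
}
```

On the sample, `ViaElements` finds only the ordinary text, while `ViaDescendants` also reaches the text nested inside the text box.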

  • Note: you need `using DocumentFormat.OpenXml.Wordprocessing` (my IntelliSense suggested a bunch of other things). (3 upvotes)

Dmi*_*nin 7

My class for replacing long phrases in a Word document, where Word has split the phrase across different text blocks:

The class itself:

using System.Collections.Generic;
using System.Text;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

namespace WebBackLibrary.Service
{
    public class WordDocumentService
    {
        private class WordMatchedPhrase
        {
            public int charStartInFirstPar { get; set; }
            public int charEndInLastPar { get; set; }

            public int firstCharParOccurance { get; set; }
            public int lastCharParOccurance { get; set; }
        }

        public WordprocessingDocument ReplaceStringInWordDocumennt(WordprocessingDocument wordprocessingDocument, string replaceWhat, string replaceFor)
        {
            List<WordMatchedPhrase> matchedPhrases = FindWordMatchedPhrases(wordprocessingDocument, replaceWhat);

            Document document = wordprocessingDocument.MainDocumentPart.Document;
            int i = 0;
            bool isInPhrase = false;
            bool isInEndOfPhrase = false;
            foreach (Text text in document.Descendants<Text>()) // <<< Here
            {
                char[] textChars = text.Text.ToCharArray();
                List<WordMatchedPhrase> curParPhrases = matchedPhrases.FindAll(a => (a.firstCharParOccurance.Equals(i) || a.lastCharParOccurance.Equals(i)));
                StringBuilder outStringBuilder = new StringBuilder();
                
                for (int c = 0; c < textChars.Length; c++)
                {
                    if (isInEndOfPhrase)
                    {
                        isInPhrase = false;
                        isInEndOfPhrase = false;
                    }

                    foreach (var parPhrase in curParPhrases)
                    {
                        if (c == parPhrase.charStartInFirstPar && i == parPhrase.firstCharParOccurance)
                        {
                            outStringBuilder.Append(replaceFor);
                            isInPhrase = true;
                        }
                        if (c == parPhrase.charEndInLastPar && i == parPhrase.lastCharParOccurance)
                        {
                            isInEndOfPhrase = true;
                        }

                    }
                    if (isInPhrase == false && isInEndOfPhrase == false)
                    {
                        outStringBuilder.Append(textChars[c]);
                    }
                }
                text.Text = outStringBuilder.ToString();
                i = i + 1;
            }

            return wordprocessingDocument;
        }

        private List<WordMatchedPhrase> FindWordMatchedPhrases(WordprocessingDocument wordprocessingDocument, string replaceWhat)
        {
            char[] replaceWhatChars = replaceWhat.ToCharArray();
            int overlapsRequired = replaceWhatChars.Length;
            int overlapsFound = 0;
            int currentChar = 0;
            int firstCharParOccurance = 0;
            int lastCharParOccurance = 0;
            int startChar = 0;
            int endChar = 0;
            List<WordMatchedPhrase> wordMatchedPhrases = new List<WordMatchedPhrase>();
            //
            Document document = wordprocessingDocument.MainDocumentPart.Document;
            int i = 0;
            foreach (Text text in document.Descendants<Text>()) // <<< Here
            {
                char[] textChars = text.Text.ToCharArray();
                for (int c = 0; c < textChars.Length; c++)
                {
                    char compareToChar = replaceWhatChars[currentChar];
                    if (textChars[c] == compareToChar)
                    {
                        currentChar = currentChar + 1;
                        if (currentChar == 1)
                        {
                            startChar = c;
                            firstCharParOccurance = i;
                        }
                        if (currentChar == overlapsRequired)
                        {
                            endChar = c;
                            lastCharParOccurance = i;
                            WordMatchedPhrase matchedPhrase = new WordMatchedPhrase
                            {
                                firstCharParOccurance = firstCharParOccurance,
                                lastCharParOccurance = lastCharParOccurance,
                                charEndInLastPar = endChar,
                                charStartInFirstPar = startChar
                            };
                            wordMatchedPhrases.Add(matchedPhrase);
                            currentChar = 0;
                        }
                    }
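                    else
                    {
                        // Mismatch: reset, but let the current character start a new
                        // match if it equals the first search character.
                        currentChar = 0;
                        if (textChars[c] == replaceWhatChars[0])
                        {
                            currentChar = 1;
                            startChar = c;
                            firstCharParOccurance = i;
                        }
                    }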
                }
                i = i + 1;
            }

            return wordMatchedPhrases;

        }

    }
}

And a simple usage example:

public void EditWordDocument(UserContents userContents)
        {
            string filePath = Path.Combine(userContents.PathOnDisk, userContents.FileName);
            WordDocumentService wordDocumentService = new WordDocumentService();
            if (userContents.ContentType.Contains("word") && File.Exists(filePath))
            {
                string saveAs = "modifiedTechWord.docx";
                //
                using (WordprocessingDocument doc = WordprocessingDocument.Open(filePath, true)) //open source word file
                {
                    Document document = doc.MainDocumentPart.Document;
                    OpenXmlPackage res = doc.SaveAs(Path.Combine(userContents.PathOnDisk, saveAs)); // copy it
                    res.Close();
                }
                using (WordprocessingDocument doc = WordprocessingDocument.Open(Path.Combine(userContents.PathOnDisk, saveAs), true)) // open copy
                {
                    string replaceWhat = "{transform:CandidateFio}";
                    string replaceFor = "ReplaceToFio";
                    var result = wordDocumentService.ReplaceStringInWordDocumennt(doc, replaceWhat, replaceFor); //replace words in copy
                }
            }
        }

Win*_*jam 6

The simplest and most accurate approach I have found so far is to use Open-Xml-PowerTools. Personally, I use dotnet core, so I use this NuGet package:

using OpenXmlPowerTools;
// ...

protected byte[] SearchAndReplace(byte[] file, IDictionary<string, string> translations)
{
    WmlDocument doc = new WmlDocument(file.Length.ToString(), file);

    foreach (var translation in translations)
        doc = doc.SearchAndReplace(translation.Key, translation.Value, true);

    return doc.DocumentByteArray;
}

Usage example:

var templateDoc = File.ReadAllBytes("templateDoc.docx");
var generatedDoc = SearchAndReplace(templateDoc, new Dictionary<string, string>(){
    {"text-to-replace-1", "replaced-text-1"},
    {"text-to-replace-2", "replaced-text-2"},
});
File.WriteAllBytes("generatedDoc.docx", generatedDoc);

For more information, see "Search and Replace Text in an Open XML WordprocessingML Document".