Replacing text in a docx and saving the changed file with python-docx

Question

use*_*248 10 python ms-word docx python-docx

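I'm trying to use the python-docx module to replace a word in a file and save the result as a new file. The caveat is that the new file must have exactly the same formatting as the old one, with only the word replaced. How can I do that?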

The docx module has a savedocx function that takes 7 inputs:

document
coreprops
appprops
contenttypes
websettings
wordrelationships
output

How do I keep everything from the original file unchanged, apart from the replaced word?

Answer 1

小智 9

This worked for me:

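import zipfile

def docx_replace(old_file, new_file, rep):
    """Copy old_file to new_file, applying the replacements in rep (a dict) to word/document.xml."""
    with zipfile.ZipFile(old_file, 'r') as zin, zipfile.ZipFile(new_file, 'w') as zout:
        for item in zin.infolist():
            buffer = zin.read(item.filename)
            # Only the main document part is rewritten; every other part is copied unchanged.
            if item.filename == 'word/document.xml':
                res = buffer.decode('utf-8')
                for old, new in rep.items():
                    res = res.replace(old, new)
                buffer = res.encode('utf-8')
            zout.writestr(item, buffer)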
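
One way to check that this approach leaves the rest of the package alone is to run it on a small archive and compare the members afterwards. The sketch below repeats the function so that it runs on its own. The helper make_sample_docx, the file names, and the sample XML are assumptions made up for illustration, and the sample archive only mimics the layout of a .docx; it is not a file Word would open. The sketch also escapes the new text with xml.sax.saxutils.escape, since a replacement value containing & or < would otherwise produce invalid XML.

```python
import os
import tempfile
import zipfile
from xml.sax.saxutils import escape

W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'


def docx_replace(old_file, new_file, rep):
    # Same logic as the answer above: only word/document.xml is rewritten.
    with zipfile.ZipFile(old_file, 'r') as zin, zipfile.ZipFile(new_file, 'w') as zout:
        for item in zin.infolist():
            buffer = zin.read(item.filename)
            if item.filename == 'word/document.xml':
                res = buffer.decode('utf-8')
                for old, new in rep.items():
                    # Escape the new text so that &, < and > stay valid XML.
                    res = res.replace(old, escape(new))
                buffer = res.encode('utf-8')
            zout.writestr(item, buffer)


def make_sample_docx(path):
    # Hypothetical stand-in for a real .docx: same layout, minimal content.
    with zipfile.ZipFile(path, 'w') as z:
        z.writestr('word/document.xml',
                   '<w:document xmlns:w="%s"><w:body><w:p><w:r>'
                   '<w:t>Hello NAME</w:t></w:r></w:p></w:body></w:document>' % W_NS)
        z.writestr('word/styles.xml', '<w:styles xmlns:w="%s"/>' % W_NS)


workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'in.docx')
dst = os.path.join(workdir, 'out.docx')
make_sample_docx(src)
docx_replace(src, dst, {'NAME': 'Tom & Jerry'})

with zipfile.ZipFile(src) as a, zipfile.ZipFile(dst) as b:
    same_members = a.namelist() == b.namelist()
    styles_unchanged = a.read('word/styles.xml') == b.read('word/styles.xml')
    new_body = b.read('word/document.xml').decode('utf-8')

print(same_members, styles_unchanged)
print(new_body)
```

Because the replacement works on the raw XML string, a word that Word has split across several runs will not be found, and a search string that happens to match a tag or attribute name would be replaced as well.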

Answer 2

edi*_*999 5

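It seems that docx for Python is not meant to store a complete docx with images, headers, and so on, but only the inner content of the document. So there is no simple way to do this.

However, here is how you could do it:

First, have a look at the docx tag wiki:

It explains how a docx file can be unzipped. Here is what a typical file looks like: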

+--docProps
|  +  app.xml
|  \  core.xml
+  res.log
+--word //this folder contains most of the files that control the content of the document
|  +  document.xml //Is the actual content of the document
|  +  endnotes.xml
|  +  fontTable.xml
|  +  footer1.xml //Contains the elements in the footer of the document
|  +  footnotes.xml
|  +--media //This folder contains all images embedded in the word
|  |  \  image1.jpeg
|  +  settings.xml
|  +  styles.xml
|  +  stylesWithEffects.xml
|  +--theme
|  |  \  theme1.xml
|  +  webSettings.xml
|  \--_rels
|     \  document.xml.rels //this document tells word where the images are situated
+  [Content_Types].xml
\--_rels
   \  .rels
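
Since a .docx is an ordinary zip archive, a layout like the one above can be inspected with the standard zipfile module. This is a minimal sketch: list_parts is a hypothetical helper name, and the in-memory archive only mimics a few entries of the tree rather than being a real Word file.

```python
import io
import zipfile


def list_parts(docx_file):
    # Return the names of all parts stored in the archive, in archive order.
    with zipfile.ZipFile(docx_file) as z:
        return z.namelist()


# Build a small in-memory archive that mimics part of the layout shown above.
sample = io.BytesIO()
with zipfile.ZipFile(sample, 'w') as z:
    for name in ('[Content_Types].xml', '_rels/.rels',
                 'word/document.xml', 'word/_rels/document.xml.rels'):
        z.writestr(name, '')

parts = list_parts(sample)
for name in parts:
    print(name)
```

With a real document, passing the path of the .docx file to list_parts works the same way.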

The docx module only reads one part of the document in its opendocx method:

def opendocx(file):
    '''Open a docx file, return a document XML tree'''
    mydoc = zipfile.ZipFile(file)
    xmlcontent = mydoc.read('word/document.xml')
    document = etree.fromstring(xmlcontent)
    return document

It only reads the document.xml file.

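What I suggest you do is:

Get the content of the document with opendocx

Replace the text in document.xml with the advReplace method

Open the docx as a zip, and replace the content of document.xml with the new XML content.

Close and output the zipped file (renaming it to output.docx)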
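
The steps above can be sketched with the standard library alone. This is a hedged sketch, not the docx module's own code: xml.etree.ElementTree stands in for opendocx, a plain loop over the w:t elements stands in for advReplace, and replace_in_docx and the sample archive are hypothetical names made up for the example (Python 3.8 or later is assumed for the xml_declaration argument). Two limitations to be aware of: a search string that Word has split across several runs will not match, and ElementTree may rename namespace prefixes that have not been registered, so a real document should be opened in Word afterwards to check the result.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
ET.register_namespace('w', W_NS)  # keep the usual "w:" prefix on output


def replace_in_docx(src, dst, search, replace):
    with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, 'w') as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == 'word/document.xml':
                # Step 1: parse the document part into an XML tree.
                root = ET.fromstring(data)
                # Step 2: replace text inside every w:t element.
                for t in root.iter('{%s}t' % W_NS):
                    if t.text and search in t.text:
                        t.text = t.text.replace(search, replace)
                # Step 3: serialize the tree back into the document part.
                data = ET.tostring(root, encoding='UTF-8', xml_declaration=True)
            # Step 4: write every part, changed or not, to the output archive.
            zout.writestr(item, data)


# Hypothetical sample archive that mimics the layout of a .docx.
sample = io.BytesIO()
with zipfile.ZipFile(sample, 'w') as z:
    z.writestr('word/document.xml',
               '<w:document xmlns:w="%s"><w:body><w:p><w:r>'
               '<w:t>Dear NAME</w:t></w:r></w:p></w:body></w:document>' % W_NS)
    z.writestr('word/styles.xml', '<w:styles xmlns:w="%s"/>' % W_NS)

output = io.BytesIO()
replace_in_docx(sample, output, 'NAME', 'Alice')

with zipfile.ZipFile(output) as z:
    names = z.namelist()
    result = z.read('word/document.xml').decode('utf-8')
print(result)
```

Working on the parsed tree means only text nodes are touched, so tag and attribute names cannot be altered by accident, and ElementTree escapes special characters in the new text when it serializes.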

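If you have node.js installed, note that I have been working on DocxGenJS, a templating engine for docx documents. The library is in active development and will be released as a node module soon.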

Archived:	12 years, 4 months ago
Viewed:	11132 times
Last activity:	6 years, 7 months ago