Extracting images from a presentation file

Aur*_*uro 2 python python-2.7 python-pptx

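I'm working with the python-pptx package. For my code, I need to extract all the images present in a presentation file. Can anyone help me?

Thanks in advance for your help.

My code looks like this: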

import pptx

prs = pptx.Presentation(filename)

for slide in prs.slides:
    for shape in slide.shapes:
        print(shape.shape_type)

With shape_type, the pictures in the presentation are reported as PICTURE (13). But I want to extract the pictures into the folder that contains the code.

sca*_*nny 5

A Picture (shape) object in python-pptx provides access to the image it displays:

from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE

def iter_picture_shapes(prs):
    for slide in prs.slides:
        for shape in slide.shapes:
            if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
                yield shape

for picture in iter_picture_shapes(Presentation(filename)):
    image = picture.image
    # ---get image "file" contents---
    image_bytes = image.blob
    # ---make up a name for the file, e.g. 'image.jpg'---
    image_filename = 'image.%s' % image.ext
    with open(image_filename, 'wb') as f:
        f.write(image_bytes)

Generating unique file names is left to you as an exercise. All the other bits you need are here.
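One possible way to approach the unique-name exercise is to name each file after a hash of its contents, so the same image reused on several slides is written only once. This is a minimal sketch, not part of python-pptx: the helper name `save_image` and the hash-based naming scheme are assumptions made for illustration, and it assumes Python 3 (for `pathlib`).

```python
import hashlib
from pathlib import Path


def save_image(blob, ext, out_dir):
    """Write image bytes to out_dir under a content-derived file name.

    Hypothetical helper: `blob` is the raw image data (e.g. image.blob)
    and `ext` is the extension without a dot (e.g. image.ext).
    Returns the path of the file that holds the image.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Identical bytes always produce the same digest, hence the same name
    digest = hashlib.sha1(blob).hexdigest()
    path = out_dir / "{}.{}".format(digest, ext)
    if not path.exists():
        path.write_bytes(blob)
    return path
```

In the loop above, a call such as `save_image(picture.image.blob, picture.image.ext, out_dir)` could take the place of the `open`/`write` pair. Passing `os.path.dirname(os.path.abspath(__file__))` as `out_dir` is one way to target the folder that contains the script.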

More details on the Image object are available in the documentation here:
https://python-pptx.readthedocs.io/en/latest/api/image.html#image-objects


小智 5

scanny's solution didn't work for me because my picture elements are inside group elements. This worked for me:

from pptx import Presentation
from pptx.enum.shapes import MSO_SHAPE_TYPE

n = 0  # running counter used to build unique file names
def write_image(shape):
    global n
    image = shape.image
    # ---get image "file" contents---
    image_bytes = image.blob
    # ---make up a name for the file, e.g. 'image.jpg'---
    image_filename = 'image{:03d}.{}'.format(n, image.ext)
    n += 1
    print(image_filename)
    with open(image_filename, 'wb') as f:
        f.write(image_bytes)

def visitor(shape):
    if shape.shape_type == MSO_SHAPE_TYPE.GROUP:
        for s in shape.shapes:
            visitor(s)
    if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
        write_image(shape)

def iter_picture_shapes(prs):
    for slide in prs.slides:
        for shape in slide.shapes:
            visitor(shape)

iter_picture_shapes(Presentation(filename))
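The same group-aware traversal could also be written as a generator, which avoids the global counter and leaves the decision about what to do with each picture to the caller. This is only a sketch under a few assumptions: the names `iter_leaf_shapes` and `iter_pictures` are made up for illustration, Python 3 is assumed (for `yield from`), and the python-pptx imports are deferred into `iter_pictures` so that the traversal helper itself has no third-party dependency.

```python
def iter_leaf_shapes(shapes, is_group):
    """Yield every non-group shape, descending into groups recursively."""
    for shape in shapes:
        if is_group(shape):
            # A group shape exposes its members through its own .shapes
            yield from iter_leaf_shapes(shape.shapes, is_group)
        else:
            yield shape


def iter_pictures(pptx_path):
    """Yield (slide_number, picture_shape) pairs for a .pptx file."""
    # Imported here so iter_leaf_shapes stays usable without python-pptx
    from pptx import Presentation
    from pptx.enum.shapes import MSO_SHAPE_TYPE

    def is_group(shape):
        return shape.shape_type == MSO_SHAPE_TYPE.GROUP

    prs = Presentation(pptx_path)
    for slide_number, slide in enumerate(prs.slides, start=1):
        for shape in iter_leaf_shapes(slide.shapes, is_group):
            if shape.shape_type == MSO_SHAPE_TYPE.PICTURE:
                yield slide_number, shape
```

A caller would then loop with `for slide_number, picture in iter_pictures(filename):` and read `picture.image.blob` and `picture.image.ext` as in the code above, building the file name from `slide_number` or a counter of its own.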