What is the fastest way to compute the sum of absolute differences between two images in Python?

Vic*_*gos 2 python arrays numpy image-processing python-imaging-library

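I am trying to compare images in a Python 3 application that uses Pillow and, optionally, NumPy. For compatibility reasons I don't intend to use any other external package that is not pure Python. I found this Pillow-based algorithm on Rosetta Code; it may suit my purpose, but it takes a while: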

from PIL import Image

def compare_images(img1, img2):
    """Compute percentage of difference between 2 JPEG images of same size
    (using the sum of absolute differences). Alternatively, compare two bitmaps
    as defined in basic bitmap storage. Useful for comparing two JPEG images
    saved with different compression ratios.

    Adapted from:
    http://rosettacode.org/wiki/Percentage_difference_between_images#Python

    :param img1: an Image object
    :param img2: an Image object
    :return: A float with the percentage of difference, or None if images are
    not directly comparable.
    """

    # Don't compare if images are of different modes or different sizes.
    if (img1.mode != img2.mode) \
            or (img1.size != img2.size) \
            or (img1.getbands() != img2.getbands()):
        return None

    pairs = zip(img1.getdata(), img2.getdata())
    if len(img1.getbands()) == 1:
        # for gray-scale jpegs
        dif = sum(abs(p1 - p2) for p1, p2 in pairs)
    else:
        dif = sum(abs(c1 - c2) for p1, p2 in pairs for c1, c2 in zip(p1, p2))

    ncomponents = img1.size[0] * img1.size[1] * len(img1.getbands())
    return (dif / 255.0 * 100) / ncomponents  # Difference (percentage)

While looking for alternatives, I found that the function can be rewritten using NumPy:

import numpy as np
from PIL import Image

def compare_images_np(img1, img2):
    if (img1.mode != img2.mode) \
            or (img1.size != img2.size) \
            or (img1.getbands() != img2.getbands()):
        return None

    dif = 0
    for band_index, band in enumerate(img1.getbands()):
        m1 = np.array([p[band_index] for p in img1.getdata()]).reshape(*img1.size)
        m2 = np.array([p[band_index] for p in img2.getdata()]).reshape(*img2.size)
        dif += np.sum(np.abs(m1-m2))

    ncomponents = img1.size[0] * img1.size[1] * len(img1.getbands())
    return (dif / 255.0 * 100) / ncomponents  # Difference (percentage)

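I expected processing to get faster, but it actually takes longer. I have no experience with NumPy beyond the basics, so I wonder whether there is any way to make it faster, for example an algorithm that does not involve a for loop. Any ideas?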

Mar*_*ell 5

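I think I understand what you are trying to do. I don't know the relative performance of our two machines, so perhaps you can benchmark it yourself.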

from PIL import Image
import numpy as np

# Load images, convert to RGB, then to numpy arrays and ravel into long, flat things
a = np.array(Image.open('a.png').convert('RGB')).ravel()
b = np.array(Image.open('b.png').convert('RGB')).ravel()

# Calculate the sum of the absolute differences divided by number of elements
# (np.float was removed in NumPy 1.24; np.float64 is the same type)
MAE = np.sum(np.abs(np.subtract(a, b, dtype=np.float64))) / a.shape[0]

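The only "tricky" thing in there is forcing the result type of np.subtract() to a float, so that negative numbers can be stored. It may be worth trying dtype=np.int16 on your hardware to see whether it is faster.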
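To see why the dtype matters, here is a small sketch with made-up sample values (not taken from any real image). Image arrays loaded from Pillow are normally uint8, and a plain uint8 subtraction wraps around instead of going negative:

```python
import numpy as np

# Two tiny uint8 "images"; the values are arbitrary illustrations.
a = np.array([10, 200], dtype=np.uint8)
b = np.array([20, 100], dtype=np.uint8)

# Plain subtraction stays in uint8, so 10 - 20 wraps around to 246.
wrapped = a - b

# Asking np.subtract for a signed result type keeps the true difference (-10).
signed = np.subtract(a, b, dtype=np.int16)

print(wrapped)         # [246 100]
print(np.abs(signed))  # [ 10 100]
```

Any signed type wide enough for the range -255 to 255 avoids the wraparound, which is why both the float and the int16 variants give the same answer.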

A quick way to benchmark it is as follows. Start ipython and enter the following:

from PIL import Image
import numpy as np

a = np.array(Image.open('a.png').convert('RGB')).ravel()
b = np.array(Image.open('b.png').convert('RGB')).ravel()

Now you can time my code with:

%timeit np.sum(np.abs(np.subtract(a,b,dtype=np.float64))) / a.shape[0]
6.72 µs ± 21.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Or you can try the int16 version like this:

%timeit np.sum(np.abs(np.subtract(a,b,dtype=np.int16))) / a.shape[0]
6.43 µs ± 30.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
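If you are not working in IPython, roughly the same comparison can be sketched with the standard-library timeit module. This is only an illustration: the arrays are random stand-ins for two flattened 640x480 RGB images (an assumed size), and the helper name mae is made up here, so the absolute numbers will differ from the timings above.

```python
import timeit

import numpy as np

# Synthetic stand-ins for the flattened images (assumption: 640x480 RGB).
rng = np.random.default_rng(0)
a = rng.integers(0, 256, size=640 * 480 * 3, dtype=np.uint8)
b = rng.integers(0, 256, size=640 * 480 * 3, dtype=np.uint8)


def mae(dtype):
    """Mean absolute difference, subtracting in the given result dtype."""
    return np.sum(np.abs(np.subtract(a, b, dtype=dtype))) / a.shape[0]


for dtype in (np.float64, np.int16):
    # Best of 5 repeats of 20 calls each, reported per call.
    best = min(timeit.repeat(lambda: mae(dtype), number=20, repeat=5)) / 20
    print(f"{dtype.__name__}: {best * 1e3:.3f} ms per call")
```

Taking the minimum over several repeats reduces noise from other processes on the machine.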

If you want to time your own code, paste in your functions and then use:

%timeit compare_images_pil(img1, img2)
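To get back the percentage that the question's function returns, the same idea can be wrapped in a function. This is a minimal sketch rather than a drop-in replacement: the name compare_images_fast is made up for illustration, it assumes 8-bit samples (modes such as L or RGB), and it divides by the array's total number of components (a.size).

```python
import numpy as np


def compare_images_fast(img1, img2):
    """Percentage difference (sum of absolute differences), or None.

    Accepts Pillow Image objects or NumPy arrays holding 8-bit samples.
    """
    # Pillow images carry a mode ("L", "RGB", ...); plain arrays do not,
    # in which case both sides compare as None and the check passes.
    if getattr(img1, "mode", None) != getattr(img2, "mode", None):
        return None

    # int16 holds every difference in [-255, 255] without wraparound.
    a = np.asarray(img1).astype(np.int16)
    b = np.asarray(img2).astype(np.int16)

    # Differing shapes mean differing sizes or band counts.
    if a.shape != b.shape:
        return None

    # Accumulate in int64 so large images cannot overflow the sum.
    dif = np.abs(a - b).sum(dtype=np.int64)
    return float(dif) / 255.0 * 100 / a.size
```

With Pillow images, call it as compare_images_fast(img1, img2). np.asarray() reads the pixel data in one step, so no Python-level loop over getdata() is involved.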