Determine whether an image exists within a larger image, and if so, find it, using Python

Question


clj*_*clj 6 python graphics opencv numpy image-processing

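I'm working on a Python program that needs to take a small image, determine whether it exists inside a larger image, and if so, report its location. If not, it should report that. (In my case, the large image will be a screenshot, and the small image is one that may or may not be on the screen, in an HTML5 canvas.) Looking online, I found out about template matching in OpenCV, which has excellent Python bindings. Based on very similar code I found online, I tried the following, using numpy: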

import cv2
import numpy as np
image = cv2.imread("screenshot.png")
template = cv2.imread("button.png")
result = cv2.matchTemplate(image,template,cv2.TM_CCOEFF_NORMED)
StartButtonLocation = np.unravel_index(result.argmax(),result.shape)


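This doesn't do what I need, because it always returns a point in the larger image: the point where the match is closest, no matter how bad that match is. I want something that finds an exact, pixel-for-pixel match of the smaller image within the larger image and, if none exists, raises an exception, returns False, or something like that. It also needs to be fairly fast. Does anyone have a good idea of how to do this?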

Answer 1

Ima*_*ngo 17

If you are looking for an exact match in both size and image values, I'd propose an answer that is fast and gives exact results.

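The idea is to do a brute-force search for the wanted h x w template inside the larger H x W image. The brute-force approach consists of looking at every possible h x w window over the image and checking for pixel-by-pixel correspondence with the template. This is computationally very expensive, but it can be sped up.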
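For reference, the brute-force search described above can be sketched roughly as follows. This is only an illustration, not code from the answer: the name `find_image_bruteforce` and the random test image are made up here, and the sketch returns None instead of raising when nothing matches.

```python
import numpy as np

def find_image_bruteforce(im, tpl):
    """Return (row, col) of the first exact match of tpl in im, or None."""
    im = np.atleast_3d(im)
    tpl = np.atleast_3d(tpl)
    H, W = im.shape[:2]
    h, w = tpl.shape[:2]
    # Try every possible top-left corner, in row-major order
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            # Pixel-by-pixel comparison of the window against the template
            if np.array_equal(im[y:y+h, x:x+w], tpl):
                return (y, x)
    return None

# Hypothetical test data: a random color image and a patch cut out of it
rng = np.random.default_rng(0)
im = rng.integers(0, 256, size=(40, 60, 3), dtype=np.uint8)
tpl = im[5:15, 20:32].copy()
print(find_image_bruteforce(im, tpl))
```

Every window costs a full h x w comparison here, which is the cost the integral-image trick tries to avoid.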

im = np.atleast_3d(im)
H, W, D = im.shape[:3]
h, w = tpl.shape[:2]


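By using integral images, the sum inside an h x w window starting at every pixel can be computed very quickly. An integral image is a summed-area table (a cumulative sum array), which numpy can compute very fast: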

sat = im.cumsum(1).cumsum(0)


It has very nice properties, such as computing the sum of all values inside a window using only 4 arithmetic operations:

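(Figure from Wikipedia: with A, B, C and D the summed-area table values at the top-left, top-right, bottom-left and bottom-right corners of a rectangle, the sum inside it is D - B - C + A.)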
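To make the four-lookup trick concrete, here is a small sketch on a made-up 4x5 array (the array and the window coordinates are assumptions chosen for illustration). It checks that D - B - C + A, read from the summed-area table, equals the window sum computed directly. The window is chosen so that it does not touch row 0 or column 0, which keeps the indices `top - 1` and `left - 1` valid.

```python
import numpy as np

im = np.arange(20).reshape(4, 5)   # hypothetical 4x5 single-channel image
sat = im.cumsum(1).cumsum(0)       # summed-area table

# Window covering rows 1..2 and columns 2..3 (inclusive)
top, left, bottom, right = 1, 2, 2, 3
A = sat[top - 1, left - 1]   # just above and to the left of the window
B = sat[top - 1, right]      # just above the window's right edge
C = sat[bottom, left - 1]    # just left of the window's bottom edge
D = sat[bottom, right]       # bottom-right pixel of the window
window_sum = D - B - C + A

# Compare against the sum computed directly from the pixels
direct_sum = im[top:bottom + 1, left:right + 1].sum()
print(window_sum, direct_sum)  # 40 40
```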

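So, by computing the sum of the template and matching it against the sums of the h x w windows taken from the integral image, it is easy to find a list of "possible windows" whose inner values add up to the same total as the template's values (a quick approximation).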

iA, iB, iC, iD = sat[:-h, :-w], sat[:-h, w:], sat[h:, :-w], sat[h:, w:]
lookup = iD - iB - iC + iA


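The above is a numpy vectorization of the operation shown in the figure, applied to all the possible h x w rectangles over the image (and therefore really fast).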
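As a sanity check on what the vectorized expression computes, the sketch below compares every entry of `lookup` with a directly computed window sum. The random image and the sizes are made up for illustration. It also makes the indexing explicit: `lookup[y, x]` is the sum of the window whose top-left pixel is (y+1, x+1), so windows touching row 0 or column 0 are not among the candidates.

```python
import numpy as np

rng = np.random.default_rng(1)
im = rng.integers(0, 256, size=(6, 7, 1))   # hypothetical single-channel image
h, w = 2, 3                                 # hypothetical template size

sat = im.cumsum(1).cumsum(0)
iA, iB, iC, iD = sat[:-h, :-w], sat[:-h, w:], sat[h:, :-w], sat[h:, w:]
lookup = iD - iB - iC + iA

print(lookup.shape)  # (4, 4, 1)

# Each entry equals the sum of the h x w window starting one row and
# one column after the index (y, x)
for y in range(lookup.shape[0]):
    for x in range(lookup.shape[1]):
        assert lookup[y, x, 0] == im[y+1:y+h+1, x+1:x+w+1, 0].sum()
```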

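This reduces the number of possible windows a lot (down to 2 in one of my tests). The last step is to check for an exact match with the template: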

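# Fragment of the find_image body; tplsum holds the per-channel sums of the template
possible_match = np.where(np.logical_and.reduce([lookup[..., i] == tplsum[i] for i in range(D)]))
for y, x in zip(*possible_match):
    # lookup[y, x] describes the window whose top-left pixel is (y+1, x+1)
    if np.all(im[y+1:y+h+1, x+1:x+w+1] == tpl):
        return (y+1, x+1)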


Note that here the y and x coordinates correspond to point A in the figure, which lies one row and one column before the template.

Putting it all together:

import numpy as np

def find_image(im, tpl):
    # Treat grayscale (2-D) inputs as images with a single channel
    im = np.atleast_3d(im)
    tpl = np.atleast_3d(tpl)
    H, W, D = im.shape[:3]
    h, w = tpl.shape[:2]

    # Integral image and template sum per channel
    sat = im.cumsum(1).cumsum(0)
    tplsum = np.array([tpl[:, :, i].sum() for i in range(D)])

    # Calculate lookup table for all the possible windows.
    # lookup[y, x] is the sum of the window whose top-left pixel is (y+1, x+1),
    # so windows touching row 0 or column 0 are not covered.
    iA, iB, iC, iD = sat[:-h, :-w], sat[:-h, w:], sat[h:, :-w], sat[h:, w:]
    lookup = iD - iB - iC + iA
    # Possible matches: the window sum equals the template sum in every channel
    possible_match = np.where(np.logical_and.reduce([lookup[..., i] == tplsum[i] for i in range(D)]))

    # Find exact match
    for y, x in zip(*possible_match):
        if np.all(im[y+1:y+h+1, x+1:x+w+1] == tpl):
            return (y+1, x+1)

    raise Exception("Image not found")


It works with both grayscale and color images, and runs in 7 ms for a 303x384 color image with a 50x50 template.
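One thing to be aware of: because the candidate windows start at (y+1, x+1), a template located in the first row or first column of the image is never tested. A possible variant, sketched below under my own assumptions, pads the integral image with a leading row and column of zeros so that every window is covered. The name `find_image_padded`, the choice to return None instead of raising, and the random test image are all made up for this sketch; it also assumes integer-valued images (such as uint8) and a non-empty template.

```python
import numpy as np

def find_image_padded(im, tpl):
    """Return (row, col) of the first exact match of tpl in im, or None."""
    im = np.atleast_3d(im)
    tpl = np.atleast_3d(tpl)
    H, W, D = im.shape[:3]
    h, w = tpl.shape[:2]
    if h > H or w > W or tpl.shape[2] != D:
        return None

    # Integral image with an extra leading row and column of zeros,
    # so that sat[i, j] is the per-channel sum of im[:i, :j]
    sat = np.zeros((H + 1, W + 1, D), dtype=np.int64)
    sat[1:, 1:] = im.cumsum(1, dtype=np.int64).cumsum(0)
    tplsum = tpl.sum(axis=(0, 1), dtype=np.int64)

    # lookup[y, x] is the per-channel sum of im[y:y+h, x:x+w]
    lookup = sat[h:, w:] - sat[:-h, w:] - sat[h:, :-w] + sat[:-h, :-w]
    candidates = np.where(np.all(lookup == tplsum, axis=-1))

    # Confirm each candidate with an exact pixel comparison
    for y, x in zip(*candidates):
        if np.array_equal(im[y:y+h, x:x+w], tpl):
            return (int(y), int(x))
    return None

# Hypothetical test data
rng = np.random.default_rng(0)
im = rng.integers(0, 256, size=(40, 60, 3), dtype=np.uint8)
print(find_image_padded(im, im[:10, :12]))      # template in the top-left corner
print(find_image_padded(im, im[5:15, 20:32]))   # template in the interior
```

With the padding in place the returned coordinates are the top-left pixel of the match directly, with no +1 offset.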

A practical example:

>>> from skimage import data
>>> from skimage.color import gray2rgb
>>> im = gray2rgb(data.coins())
>>> tpl = im[170:220, 75:130].copy()

>>> y, x = find_image(im, tpl)
>>> y, x
(170, 75)


And to illustrate the result:

[image]

Left: the original image; right: the template. And here is the exact match:

>>> import matplotlib.pyplot as plt
>>> from matplotlib.patches import Rectangle
>>> fig, ax = plt.subplots()
>>> ax.imshow(im)
>>> rect = Rectangle((x, y), tpl.shape[1], tpl.shape[0], edgecolor='r', facecolor='none')
>>> ax.add_patch(rect)


[image]

Finally, just an example of the possible matches for the test:

[image]

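The sums over the two windows in the image are the same, but the last step of the function filters out the one that does not exactly match the template.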

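`np.logical_and(*[...])` - were you expecting `np.logical_and` to AND together an arbitrary number of things? NumPy ufuncs don't work like that. You are actually specifying the third list element as the array to put the output in, not as another array to AND. The ufunc [`reduce`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ufunc.reduce.html) method may help: `np.logical_and.reduce([...])` (without the `*`) instead of `np.logical_and(*[...])`. (2 upvotes)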
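A small sketch of the difference the comment describes, using made-up boolean arrays: `np.logical_and.reduce` ANDs all the arrays in the list, whereas passing a third positional argument to `np.logical_and` makes it the output array.

```python
import numpy as np

# Hypothetical per-channel masks
masks = [
    np.array([True, True, False, True]),
    np.array([True, False, False, True]),
    np.array([False, True, True, True]),
]

# AND across all arrays in the list at once
combined = np.logical_and.reduce(masks)
print(combined)  # [False False False  True]

# Pitfall: the third positional argument is treated as the output array,
# so it is overwritten with masks[0] AND masks[1] instead of being ANDed in
out = masks[2].copy()
np.logical_and(masks[0], masks[1], out)
print(out)  # [ True False False  True]
```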

Archived:	10 years, 11 months ago
Views:	7332
Last recorded:	8 years, 5 months ago