How to get the coordinates of the bounding box in YOLO object detection?

Question

Shr*_*ram 7 object-detection computer-vision deep-learning

I need to get the coordinates of the bounding boxes generated in the image above using YOLO object detection.

Answer 1

A quick solution is to modify the image.c file to print out the bounding box information:

...
if(bot > im.h-1) bot = im.h-1;

// Print bounding box values 
printf("Bounding Box: Left=%d, Top=%d, Right=%d, Bottom=%d\n", left, top, right, bot); 
draw_box_width(im, left, top, right, bot, width, red, green, blue);
...

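Seriously, thank you so much for suggesting image.c. It helped me solve a completely different problem: when running YOLO in Python (via OpenCV-DNN), the detections are given in float format. In fact, every article I have seen gets the math wrong when converting the YOLO floats (center X/Y and width/height) into pixel coordinates. But the official image.c has the math! Right here! https://github.com/pjreddie/darknet/blob/810d7f797bdb2f021dbe65d2524c2ff6b8ab5c8b/src/image.c#L283-L291 - I just had to port it to Python. :-) (3 upvotes)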
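
The port the comment mentions might look roughly like the sketch below. It assumes the detection gives the box center and size as fractions of the image width and height, and it mirrors the clamping visible in the snippet above. The function name `yolo_to_pixels` and its parameter names are made up for illustration and do not come from darknet or OpenCV.

```python
def yolo_to_pixels(x, y, w, h, img_w, img_h):
    """Convert a normalized YOLO box (center x/y, width, height in 0..1)
    to clamped integer pixel corners (left, top, right, bottom)."""
    # int() truncates toward zero, like a float-to-int conversion in C
    left = int((x - w / 2.0) * img_w)
    right = int((x + w / 2.0) * img_w)
    top = int((y - h / 2.0) * img_h)
    bottom = int((y + h / 2.0) * img_h)

    # Clamp to the image, in the spirit of `if(bot > im.h-1) bot = im.h-1;`
    left = max(left, 0)
    top = max(top, 0)
    right = min(right, img_w - 1)
    bottom = min(bottom, img_h - 1)
    return left, top, right, bottom


# A box centered in a 100x200 image, covering half of each dimension
print(yolo_to_pixels(0.5, 0.5, 0.5, 0.5, 100, 200))  # (25, 50, 75, 150)
```

Boxes that extend past the image edge are pulled back to the last valid pixel, so the result can be used directly for drawing or cropping.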

Answer 2

Wah*_*ram 5

For Python users on Windows:

First, do some setup work:

Set the Python path to your darknet folder in the environment variables:

PYTHONPATH = 'YOUR DARKNET FOLDER'
Add PYTHONPATH to the Path value by adding:

%PYTHONPATH%
Edit the coco.data file in the cfg folder, changing the names variable to point to your coco.names file; in my case:

names = D:/core/darknetAB/data/coco.names

With this setup, you can call darknet.py (from the AlexeyAB/darknet repository) as a Python module from any folder.
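
If editing environment variables is inconvenient, a similar effect can probably be had at runtime. The sketch below is not part of the setup described here: it simply puts a folder on `sys.path` before the import. The helper name `add_module_folder` is hypothetical, and the path is the one used in this answer.

```python
import os
import sys


def add_module_folder(folder):
    """Put `folder` on sys.path (once) so modules inside it can be imported."""
    folder = os.path.abspath(folder)
    if folder not in sys.path:
        sys.path.insert(0, folder)
    return folder


# Path from this answer; replace it with your own darknet folder.
add_module_folder('D:/core/darknetAB')
# from darknet import performDetect  # needs darknet.py (and its libraries) in that folder
```

This only affects the running interpreter, so it has to be repeated in every script that imports darknet.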

Start writing the script:

from darknet import performDetect as scan  # 'performDetect' comes from darknet.py

def detect(picpath):
    '''Use this function if you only want to get the coordinates.'''
    cfg = 'D:/core/darknetAB/cfg/yolov3.cfg'  # change this if you want to use a different config
    coco = 'D:/core/darknetAB/cfg/coco.data'  # you can change this too
    weights = 'D:/core/darknetAB/yolov3.weights'  # and this
    # Default call; I prefer showImage=False so no image is produced, which performs better.
    test = scan(imagePath=picpath, thresh=0.25, configPath=cfg, weightPath=weights, metaPath=coco, showImage=False, makeImageOnly=False, initOnly=False)

    # At this point you have the data in AlexeyAB's default format, as explained in the module.
    # Try help(scan); the result format is:
    # [(item_name, confidence_rate, (x_center, y_center, box_width, box_height))]
    # To convert it to the corner form generally used by PIL/OpenCV:

    newdata = []
    for item, confidence_rate, imagedata in test:
        x1, y1, w_size, h_size = imagedata
        x_start = round(x1 - (w_size/2))
        y_start = round(y1 - (h_size/2))
        x_end = round(x_start + w_size)
        y_end = round(y_start + h_size)
        newdata.append((item, confidence_rate, (x_start, y_start, x_end, y_end), w_size, h_size))

    # False when nothing was detected, otherwise a list with one tuple per detection
    return newdata if newdata else False
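
The conversion step inside `detect` can be tried without darknet installed. The following is a minimal sketch that isolates it, assuming the box values are already in pixels, as the code above treats them. The helper name `center_to_corners` and the sample detection are made up for illustration.

```python
def center_to_corners(x_center, y_center, width, height):
    """Convert a (center x, center y, width, height) box to
    rounded (x_start, y_start, x_end, y_end) corners."""
    x_start = round(x_center - width / 2)
    y_start = round(y_center - height / 2)
    x_end = round(x_start + width)
    y_end = round(y_start + height)
    return x_start, y_start, x_end, y_end


# Hypothetical detection in the (label, confidence, (x, y, w, h)) format
detection = ('dog', 0.98, (100.0, 50.0, 40.0, 20.0))
label, confidence, box = detection
print(label, center_to_corners(*box))  # dog (80, 40, 120, 60)
```

Note that Python 3's `round` rounds exact halves to the nearest even integer, so a corner may differ by one pixel from a C-style truncation.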

How to use it:

table = 'D:/test/image/test1.jpg'
checking = detect(table)

Get the coordinates:

If there is only 1 result:

x1, y1, x2, y2 = checking[0][2]

If there are many results:

for x in checking:
    item = x[0]
    x1, y1, x2, y2 = x[2]
    print(item)
    print(x1, y1, x2, y2)
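
Because `detect` returns `False` when nothing is found, looping over the result directly would raise a `TypeError` in that case. Below is a small sketch of one way to guard against that; the helper name `describe` and the sample data are hypothetical, and the tuples are assumed to follow the layout built in `detect` above.

```python
def describe(results):
    """Return one text line per detection; an empty list if nothing was found."""
    lines = []
    # `results or []` also covers the False returned when there are no detections
    for item, confidence, (x1, y1, x2, y2), w_size, h_size in results or []:
        lines.append(f'{item}: ({x1}, {y1}) -> ({x2}, {y2})')
    return lines


sample = [('dog', 0.98, (80, 40, 120, 60), 40.0, 20.0)]
print(describe(sample))  # ['dog: (80, 40) -> (120, 60)']
print(describe(False))   # []
```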

Archived:	8 years, 5 months ago
Views:	11802
Last activity:	6 years, 7 months ago