ala*_*ris 3

The keypoint format is described here:

https://cocodataset.org/#format-data

In particular, this part:

annotation{
    "keypoints" : [x1,y1,v1,...],
    ...
}

It says that keypoints is a flat array x1,y1,v1,..., with one x, one y and one visibility flag v per keypoint.
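
To make the flat layout concrete, here is a minimal sketch that groups such an array into (x, y, v) triples. The helper name `split_keypoints` and the sample values are made up for illustration. In COCO, v=0 means not labeled, v=1 labeled but not visible, and v=2 labeled and visible.

```python
from typing import List, Tuple


def split_keypoints(flat: List[float]) -> List[Tuple[float, float, int]]:
    """Group a flat COCO keypoints list [x1, y1, v1, x2, ...] into (x, y, v) triples."""
    if len(flat) % 3 != 0:
        raise ValueError("keypoints length must be a multiple of 3")
    return [(flat[i], flat[i + 1], int(flat[i + 2])) for i in range(0, len(flat), 3)]


# Hypothetical annotation with three keypoints; the second one is unlabeled.
triples = split_keypoints([120, 85, 2, 0, 0, 0, 130, 90, 1])
for x, y, v in triples:
    print(x, y, v)
```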

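The official yolov7-pose GitHub repository, https://github.com/WongKinYiu/yolov7/tree/pose, links to a prepared COCO dataset download, [Keypoints Labels of MS COCO 2017]. Download it, unpack it, and go to the directory labels\train2017. Open any of the txt files and you will see lines like this: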

0 0.671279 0.617945 0.645759 0.726859 0.519751 0.381250 2.000000 0.550936 0.348438 2.000000 0.488565 0.367188 2.000000 0.642412 0.354687 2.000000 0.488565 0.395313 2.000000 0.738046 0.526563 2.000000 0.446985 0.534375 2.000000 0.846154 0.771875 2.000000 0.442827 0.812500 2.000000 0.925156 0.964063 2.000000 0.507277 0.698438 2.000000 0.702703 0.942187 2.000000 0.555094 0.950000 2.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000

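Each line has the following format. The box is given as normalized center x, center y, width and height, which is the input that xywhn2xyxy in the loading code converts from:

class x_center y_center width height kpt1_x kpt1_y kpt1_v kpt2_x kpt2_y kpt2_v ...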
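
As an illustration of that layout, the following sketch splits one label line into its class, box and keypoint triples. The function name `parse_pose_label` and the shortened sample line are hypothetical. It assumes the four box values are normalized center x, center y, width and height, as the xywhn2xyxy conversion implies.

```python
def parse_pose_label(line: str):
    """Split one label line into (class, box, keypoints).

    Assumed layout: class, 4 box values, then (x, y, v) per keypoint.
    """
    values = line.split()
    cls = int(float(values[0]))
    nums = [float(v) for v in values[1:]]
    box = tuple(nums[:4])  # x_center, y_center, width, height (normalized)
    kpt_values = nums[4:]
    if len(kpt_values) % 3 != 0:
        raise ValueError("keypoint values must come in (x, y, v) triples")
    keypoints = [tuple(kpt_values[i:i + 3]) for i in range(0, len(kpt_values), 3)]
    return cls, box, keypoints


# Shortened, made-up line with three keypoints instead of COCO's 17.
sample = "0 0.5 0.5 0.2 0.4 0.45 0.35 2 0.55 0.35 2 0 0 0"
cls, box, kpts = parse_pose_label(sample)
print(cls, box, kpts)
```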

This is the code responsible for loading it (from general.py):


def xywhn2xyxy(x, w=640, h=640, padw=0, padh=0, kpt_label=False):
    # Convert nx4 boxes from [x, y, w, h] normalized to [x1, y1, x2, y2] where xy1=top-left, xy2=bottom-right
    # it does the same operation as above for the key-points
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = w * (x[:, 0] - x[:, 2] / 2) + padw  # top left x
    y[:, 1] = h * (x[:, 1] - x[:, 3] / 2) + padh  # top left y
    y[:, 2] = w * (x[:, 0] + x[:, 2] / 2) + padw  # bottom right x
    y[:, 3] = h * (x[:, 1] + x[:, 3] / 2) + padh  # bottom right y
    if kpt_label:
        num_kpts = (x.shape[1]-4)//2
        for kpt in range(num_kpts):
            for kpt_instance in range(y.shape[0]):
                if y[kpt_instance, 2 * kpt + 4]!=0:
                    y[kpt_instance, 2*kpt+4] = w * y[kpt_instance, 2*kpt+4] + padw
                if y[kpt_instance, 2 * kpt + 1 + 4] !=0:
                    y[kpt_instance, 2*kpt+1+4] = h * y[kpt_instance, 2*kpt+1+4] + padh
    return y
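
To see what the conversion does to a single label, here is a small standalone sketch of the same arithmetic in plain Python. It is not the repository's code: the helper names `box_xywhn_to_xyxy` and `kpt_to_pixels` are made up, and it handles one box and one keypoint at a time instead of a whole array.

```python
def box_xywhn_to_xyxy(xc, yc, bw, bh, img_w=640, img_h=640, padw=0, padh=0):
    """Normalized (center x, center y, width, height) -> pixel corner coordinates."""
    x1 = img_w * (xc - bw / 2) + padw  # top left x
    y1 = img_h * (yc - bh / 2) + padh  # top left y
    x2 = img_w * (xc + bw / 2) + padw  # bottom right x
    y2 = img_h * (yc + bh / 2) + padh  # bottom right y
    return x1, y1, x2, y2


def kpt_to_pixels(kx, ky, img_w=640, img_h=640, padw=0, padh=0):
    """Scale one normalized keypoint to pixels.

    A coordinate that is exactly zero is left at zero, mirroring the
    `!= 0` checks in xywhn2xyxy.
    """
    px = img_w * kx + padw if kx != 0 else kx
    py = img_h * ky + padh if ky != 0 else ky
    return px, py


print(box_xywhn_to_xyxy(0.5, 0.5, 0.25, 0.5))     # (240.0, 160.0, 400.0, 480.0)
print(kpt_to_pixels(0.25, 0.75))                  # (160.0, 480.0)
print(kpt_to_pixels(0.0, 0.0, padw=10, padh=10))  # (0.0, 0.0)
```

Because zero coordinates are skipped, a missing keypoint stored as 0 0 0 stays at the origin even when padding is applied.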

It is called as follows:

labels[:, 1:] = xywhn2xyxy(labels[:, 1:], ratio[0] * w, ratio[1] * h, padw=pad[0], padh=pad[1], kpt_label=self.kpt_label)

Note the offset of 1 in labels[:, 1:], which skips the class label. The label coordinates must be normalized, as these assertions require:

assert (l[:, 5::3] <= 1).all(), 'non-normalized or out of bounds coordinate labels'
assert (l[:, 6::3] <= 1).all(), 'non-normalized or out of bounds coordinate labels'
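
Since COCO stores boxes and keypoints in pixels, a custom dataset has to be divided by the image size before it passes these checks. The following is a minimal sketch of that conversion, not code from the repository. The function name `coco_to_pose_line` is hypothetical, the class id defaults to 0 as an assumption, and the output box is assumed to be center x, center y, width, height. COCO's bbox is [top-left x, top-left y, width, height].

```python
def coco_to_pose_line(ann: dict, img_w: int, img_h: int, cls: int = 0) -> str:
    """Turn one COCO-style annotation into a normalized pose label line."""
    x, y, w, h = ann["bbox"]  # COCO: top-left x, top-left y, width, height (pixels)
    box = [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]
    fields = [f"{v:.6f}" for v in box]

    kpts = ann["keypoints"]  # flat list x1, y1, v1, ...
    for i in range(0, len(kpts), 3):
        kx, ky, v = kpts[i:i + 3]
        # Normalize x and y; the visibility flag is written unchanged.
        fields += [f"{kx / img_w:.6f}", f"{ky / img_h:.6f}", f"{float(v):.6f}"]

    return " ".join([str(cls)] + fields)


# Made-up annotation with two keypoints on a 400x400 image.
ann = {"bbox": [100, 50, 200, 300], "keypoints": [150, 100, 2, 0, 0, 0]}
print(coco_to_pose_line(ann, img_w=400, img_h=400))
```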

Getting the label format right is the only tricky part. The rest is storing the images in the right directories. The structure is:

images/
    train/
        file_name1.jpg
        ...
    test/
    val/
labels/
    train/
        file_name1.txt
        ...
    test/
    val/
train.txt
test.txt
val.txt

where train.txt contains the paths to the images. Its contents look like this:

./images/train/file_name1.jpg
...
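
A file list like this can be generated rather than written by hand. Here is a minimal sketch, assuming the directory layout shown above and .jpg images. The helper name `write_image_list` is made up, and the demo runs in a temporary directory so it does not touch a real dataset.

```python
import tempfile
from pathlib import Path


def write_image_list(root: Path, split: str) -> Path:
    """Write <root>/<split>.txt with one ./images/<split>/<name>.jpg path per line."""
    image_dir = root / "images" / split
    lines = [f"./images/{split}/{p.name}" for p in sorted(image_dir.glob("*.jpg"))]
    out = root / f"{split}.txt"
    out.write_text("\n".join(lines) + "\n")
    return out


# Demo on a throwaway directory with two empty placeholder images.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images" / "train").mkdir(parents=True)
    for name in ("file_name2.jpg", "file_name1.jpg"):
        (root / "images" / "train" / name).touch()
    print(write_image_list(root, "train").read_text())
```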