Tom*_*Tom 5 python opencv conv-neural-network
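I want to test a pretrained model, downloaded from here, on an OCR task. Download link; the file is named CRNN_VGG_BiLSTM_CTC.onnx. The model was extracted from here. Sample-image.png can be downloaded from here (see the code below).
When I run the forward pass of the network (OCR) on the blob, I get the following error:
error: OpenCV(4.4.0) /tmp/pip-req-build-xgme2194/opencv/modules/dnn/src/layers/convolution_layer.cpp:348: error: (-215:Assertion failed) ngroups > 0 && inpCn % ngroups == 0 && outCn % ngroups == 0 in function 'getMemoryShapes'
Feel free to read the code below. I have tried many things, and the error is strange because this model does not need a predetermined input shape. If you know another way to read this model and run the forward pass, that would also help, but I would rather solve this with OpenCV.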
import os
import cv2 as cv
# The model is downloaded from here https://drive.google.com/drive/folders/1cTbQ3nuZG-EKWak6emD_s8_hHXWz7lAr
# model path
MODELS_PATH = './'  # assumed: directory that contains the .onnx file
modelRecognition = os.path.join(MODELS_PATH, 'CRNN_VGG_BiLSTM_CTC.onnx')
# read net
recognizer = cv.dnn.readNetFromONNX(modelRecognition)
# Download sample_image.png from https://i.ibb.co/fMmCB7J/sample-image.png (image host website)
sample_image = cv.imread('sample-image.png')
# Height , Width and number of channels of the image
H, W, C = sample_image.shape
# Create a 4D blob from cropped image
blob = cv.dnn.blobFromImage(sample_image, size = (H, W))
recognizer.setInput(blob)
# This is where I get the error mentioned above
result = recognizer.forward()
Thank you very much in advance.
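Your problem is that the input data you feed to the model does not match the shape of the data the model was trained on.
I used this answer to inspect your ONNX model, and it appears to expect an input of shape (1, 1, 32, 100). I modified your code to reshape the image to 1 x 32 x 100 pixels, and inference now runs without error.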
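I have added some code to interpret the inference result. We now display the image and the inferred OCR text. This does not seem to work yet, but according to the tutorial on OpenCV there should be two models: one for text detection and one for text recognition, the latter used together with the alphabet_36.txt file that is provided with the pretrained models. It is not clear to me which network to use for text detection. I hope the edited code below helps you develop your application further.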
import cv2 as cv
import os
import numpy as np
import matplotlib.pyplot as plt
# The model is downloaded from here https://drive.google.com/drive/folders/1cTbQ3nuZG-EKWak6emD_s8_hHXWz7lAr
# model path
MODELS_PATH = './'
modelRecognition = os.path.join(MODELS_PATH,'CRNN_VGG_BiLSTM_CTC.onnx')
# read net
recognizer = cv.dnn.readNetFromONNX(modelRecognition)
# Download sample_image.png from https://i.ibb.co/fMmCB7J/sample-image.png (image host website)
sample_image = cv.imread('sample-image.png', cv.IMREAD_GRAYSCALE)
sample_image = cv.resize(sample_image, (100, 32))
sample_image = sample_image[:,::-1].transpose()
# Height and Width of the image
H,W = sample_image.shape
# Create a 4D blob from image
blob = cv.dnn.blobFromImage(sample_image, size=(H,W))
recognizer.setInput(blob)
# network inference
result = recognizer.forward()
# load alphabet
with open('alphabet_36.txt') as f:
    alphabet = f.readlines()
alphabet = [line.strip() for line in alphabet]
# interpret inference results
res = []
for i in range(result.shape[0]):
    ind = np.argmax(result[i,0])
    res.append(alphabet[ind])
ocrtxt = ''.join(res)
# show image and detected OCR characters
plt.imshow(sample_image)
plt.title(ocrtxt)
plt.show()
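The per-step argmax in the code above maps class indices straight to alphabet entries. CRNN models trained with CTC usually reserve one class as a blank and emit repeated labels, so the raw argmax string tends to need collapsing, which may be one reason the output looks wrong. The sketch below is a greedy CTC decoder under two assumptions: class 0 is the blank, and classes 1 to 36 follow the order of alphabet_36.txt (taken here to be digits followed by lowercase letters), which is how OpenCV's text recognition sample treats its CRNN. The function name `ctc_greedy_decode` and the synthetic scores are illustrative only.

```python
import numpy as np

# Assumption: alphabet_36.txt lists these 36 characters in this order.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"
BLANK = 0  # Assumption: class index 0 is the CTC blank.


def ctc_greedy_decode(scores, alphabet=ALPHABET):
    """Greedy CTC decoding for scores shaped (T, 1, num_classes)."""
    best = np.argmax(scores[:, 0, :], axis=1)
    chars = []
    prev = BLANK
    for idx in best:
        # Keep a label only if it is not blank and differs from the previous step.
        if idx != BLANK and idx != prev:
            chars.append(alphabet[idx - 1])
        prev = idx
    return "".join(chars)


# Synthetic scores for the step sequence: blank, c, c, blank, a, a, t
steps = [0, 13, 13, 0, 11, 11, 30]
scores = np.zeros((len(steps), 1, 37), dtype=np.float32)
scores[np.arange(len(steps)), 0, steps] = 1.0
print(ctc_greedy_decode(scores))  # cat
```

With the real model, `result` from `recognizer.forward()` would take the place of `scores`, provided its layout is (T, 1, num_classes) as the loop over `result.shape[0]` above suggests.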
Hope this helps. Cheers