OpenCV does not work in Google Colaboratory

sha*_*des 0 python opencv face-recognition google-colaboratory

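I am practicing OpenCV on Google Colaboratory because I don't know how to use OpenCV on a GPU, and running OpenCV on my own hardware uses a lot of CPU, so I moved to Google Colaboratory. The link to my notebook is here.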

If you don't want to open it, here is the code:

import cv2

face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
cap = cv2.VideoCapture(0)

while True:
    _, img = cap.read()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)

    cv2.imshow('img', img)

    k = cv2.waitKey(30) & 0xff
    if k==27:
        break
    
cap.release()

The same code works fine on my PC but not on Google Colaboratory. The error is:

---------------------------------------------------------------------------
error                                     Traceback (most recent call last)
<ipython-input-5-0d9472926d8c> in <module>()
      6 while True:
      7         _, img = cap.read()
----> 8         gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
      9         faces = face_cascade.detectMultiScale(gray, 1.1, 4)
     10         for (x, y, w, h) in faces:

error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'

PS: I have the haarcascade file in the same directory as my notebook in Google Colaboratory.

How can I deal with this? If it cannot be done, is there any concrete solution for running OpenCV on a CUDA-enabled GPU instead of the CPU? Thanks in advance!

fur*_*ras 5

_src.empty() means there was a problem getting a frame from the camera, so img is None; when the code then calls cvtColor(None, ...), it fails with the !_src.empty() assertion.

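You should check if img is not None: because cv2 does not raise an error when it cannot get a frame from the camera or read an image from a file. Sometimes a camera needs time to "warm up" and may return a few empty frames (None).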
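The check described above can be sketched as a small helper. This is only an illustration: the name read_valid_frame and the max_attempts parameter are hypothetical, and the helper assumes nothing beyond an object with a read() method that returns an (ok, frame) pair, as cv2.VideoCapture does.

```python
def read_valid_frame(cap, max_attempts=10):
    """Return the first non-empty frame from `cap`, or None if every attempt fails.

    `cap` is any object whose read() returns (ok, frame), e.g. cv2.VideoCapture.
    `max_attempts` is a hypothetical knob to tolerate a few "warm-up" frames.
    """
    for _ in range(max_attempts):
        ok, frame = cap.read()
        if ok and frame is not None:
            return frame
    return None


# Possible usage with OpenCV (not executed here):
#   cap = cv2.VideoCapture(0)
#   img = read_valid_frame(cap)
#   if img is None:
#       raise RuntimeError('no frame received; is a camera attached to this machine?')
#   gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
```

Returning None instead of raising keeps the decision with the caller, who can either stop or keep polling.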


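VideoCapture(0) reads frames from a camera directly connected to the computer that runs the code. When you run the code on the Google Colaboratory server, that means a camera connected to that server (not your local camera), but the server has no camera, so VideoCapture(0) cannot work on Google Colaboratory.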

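When cv2 runs on the server, it cannot get images from your local camera. Your web browser may have access to your camera, but it needs JavaScript to grab a frame and send it to the server, and the server needs code to receive that frame.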


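I searched Google to check whether Google Colaboratory can access a local webcam, and it seems they created a script for this, Camera Capture. The first cell contains the function take_photo(), which uses JavaScript to access the camera and display it in the browser; the second cell uses this function to display the image from the local camera and take a snapshot.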

You should use this function instead of VideoCapture(0) to work with your local camera from the server.


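BTW: below take_photo() there is also information about cv2.imshow(), because it too works only with a monitor directly connected to the computer that runs the code (and that computer has to run a GUI, such as Windows on Windows or X11 on Linux). When you run it on a server, it tries to display on a monitor directly connected to the server, but a server usually works without a monitor (and without a GUI).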

Google Colaboratory has a special replacement that displays the image in the web browser:

from google.colab.patches import cv2_imshow

BTW: if you have a problem loading the haarcascades .xml file, you may need to add the folder to the filename. cv2 has a special variable for this, cv2.data.haarcascades:

import os
import cv2

path = os.path.join(cv2.data.haarcascades, 'haarcascade_frontalface_default.xml')

face_cascade = cv2.CascadeClassifier(path)

You can also see what is in this folder:

import os

filenames = os.listdir(cv2.data.haarcascades)
filenames = sorted(filenames)
print('\n'.join(filenames))
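Because cv2.CascadeClassifier does not raise when given a wrong path (it just creates an empty classifier), it can help to confirm the file exists first. The following is a minimal sketch under that assumption; find_cascade and its search_dirs parameter are hypothetical names introduced only for illustration.

```python
import os


def find_cascade(filename, search_dirs):
    """Return the first existing path to `filename` among `search_dirs`.

    Raises FileNotFoundError so a missing cascade is noticed before the
    (silently empty) classifier is used for detection.
    """
    for folder in search_dirs:
        candidate = os.path.join(folder, filename)
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(
        '{} not found in: {}'.format(filename, ', '.join(search_dirs)))


# Possible usage with OpenCV (not executed here):
#   path = find_cascade('haarcascade_frontalface_default.xml',
#                       ['.', cv2.data.haarcascades])
#   face_cascade = cv2.CascadeClassifier(path)
```

Searching the notebook's own directory first and cv2.data.haarcascades second covers both the asker's setup and the bundled files.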

EDIT:

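I created code that can get frames one by one from the local webcam without using a button and without saving to a file. The problem is that it is slow, because every frame still has to be sent from the local web browser to the Google Colab server and then back to the local web browser.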

Python code with the JavaScript functions:

#
# based on: https://colab.research.google.com/notebooks/snippets/advanced_outputs.ipynb#scrollTo=2viqYx97hPMi
#

from IPython.display import display, Javascript
from google.colab.output import eval_js
from base64 import b64decode, b64encode
import numpy as np
import cv2  # needed by take_frame() and show_frame() below

def init_camera():
  """Create objects and functions in HTML/JavaScript to access local web camera"""

  js = Javascript('''

    // global variables to use in both functions
    var div = null;
    var video = null;   // <video> to display stream from local webcam
    var stream = null;  // stream from local webcam
    var canvas = null;  // <canvas> for single frame from <video> and convert frame to JPG
    var img = null;     // <img> to display JPG after processing with `cv2`

    async function initCamera() {
      // place for video (and eventually buttons)
      div = document.createElement('div');
      document.body.appendChild(div);

      // <video> to display video
      video = document.createElement('video');
      video.style.display = 'block';
      div.appendChild(video);

      // get webcam stream and assign to <video>
      stream = await navigator.mediaDevices.getUserMedia({video: true});
      video.srcObject = stream;

      // start playing stream from webcam in <video>
      await video.play();

      // Resize the output to fit the video element.
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);

      // <canvas> for frame from <video>
      canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      //div.appendChild(canvas); // there is no need to display it to get the image (but you can display it for testing)

      // <img> for image after processing with `cv2`
      img = document.createElement('img');
      img.width = video.videoWidth;
      img.height = video.videoHeight;
      div.appendChild(img);
    }

    async function takeImage(quality) {
      // draw frame from <video> on <canvas>
      canvas.getContext('2d').drawImage(video, 0, 0);

      // stop webcam stream
      //stream.getVideoTracks()[0].stop();

      // get data from <canvas> as JPG image encoded as base64, with header "data:image/jpeg;base64,"
      return canvas.toDataURL('image/jpeg', quality);
      //return canvas.toDataURL('image/png', quality);
    }

    async function showImage(image) {
      // it needs string "-DATA-ENCODED-BASE64"
      // it will replace previous image in `<img src="">`
      img.src = image;
      // TODO: create <img> if it doesn't exist,
      // TODO: use `id` to use different `<img>` for different image - like `name` in `cv2.imshow(name, image)`
    }

  ''')

  display(js)
  eval_js('initCamera()')

def take_frame(quality=0.8):
  """Get frame from web camera"""

  data = eval_js('takeImage({})'.format(quality))  # run JavaScript code to get image (JPG as string base64) from <canvas>

  header, data = data.split(',')  # split header ("data:image/jpg;base64,") and base64 data (JPG)
  data = b64decode(data)  # decode base64
  data = np.frombuffer(data, dtype=np.uint8)  # create numpy array with JPG data

  img = cv2.imdecode(data, cv2.IMREAD_UNCHANGED)  # uncompress JPG data to array of pixels

  return img

def show_frame(img, quality=0.8):
  """Put frame as <img src="data:image/jpg;base64,...."> """

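  # compress array of pixels to JPG data; `quality` (0..1) is mapped to OpenCV's 0..100 scale
  ret, data = cv2.imencode('.jpg', img, [cv2.IMWRITE_JPEG_QUALITY, int(quality * 100)])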

  data = b64encode(data)  # encode base64
  data = data.decode()  # convert bytes to string
  data = 'data:image/jpg;base64,' + data  # join header ("data:image/jpg;base64,") and base64 data (JPG)

  eval_js('showImage("{}")'.format(data))  # run JavaScript code to put image (JPG as string base64) in <img>
                                           # argument in `showImage` needs `" "` 
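The data-URL handling inside take_frame() and show_frame() can be looked at in isolation, without a camera or OpenCV. This is a minimal sketch using only the standard library; the helper names data_url_to_bytes and bytes_to_data_url are hypothetical and not part of the code above.

```python
from base64 import b64decode, b64encode


def data_url_to_bytes(data_url):
    """Strip the "data:<mime>;base64," header and decode the payload to raw bytes."""
    header, payload = data_url.split(',', 1)
    return b64decode(payload)


def bytes_to_data_url(raw, mime='image/jpeg'):
    """Encode raw bytes as base64 and prepend the data-URL header."""
    return 'data:{};base64,{}'.format(mime, b64encode(raw).decode('ascii'))


# Round trip with a few arbitrary bytes standing in for JPG data.
sample = b'\xff\xd8\xff\xe0'
assert data_url_to_bytes(bytes_to_data_url(sample)) == sample
```

In the real functions, the decoded bytes go to np.frombuffer and cv2.imdecode, and the bytes to encode come from cv2.imencode.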

And the code that uses it in a loop:

# 
# based on: https://colab.research.google.com/notebooks/snippets/advanced_outputs.ipynb#scrollTo=zo9YYDL4SYZr
#

#from google.colab.patches import cv2_imshow  # I don't use it but own function `show_frame()`

import cv2
import os

face_cascade = cv2.CascadeClassifier(os.path.join(cv2.data.haarcascades, 'haarcascade_frontalface_default.xml'))

# init JavaScript code
init_camera()

while True:
    try:
        img = take_frame()

        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        #cv2_imshow(gray)  # it creates a new image for every frame (it doesn't replace the previous image) so it is useless here
        #show_frame(gray)  # it replaces the previous image

        faces = face_cascade.detectMultiScale(gray, 1.1, 4)

        for (x, y, w, h) in faces:
            cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
        
        #cv2_imshow(img)  # it creates a new image for every frame (it doesn't replace the previous image) so it is useless here
        show_frame(img)  # it replaces the previous image
        
    except Exception as err:
        print('Exception:', err)

I don't use from google.colab.patches import cv2_imshow because it always adds a new image to the page instead of replacing the existing one.
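To see how slow the browser-to-server round trip actually is, the loop can be timed. The following is a rough sketch only; measure_fps and its parameters are hypothetical, and the three callables stand in for take_frame(), the detection step, and show_frame().

```python
import time


def measure_fps(take, process, show, n_frames=30):
    """Run take -> process -> show for n_frames and return frames per second."""
    start = time.perf_counter()
    for _ in range(n_frames):
        frame = take()
        show(process(frame))
    elapsed = time.perf_counter() - start
    return n_frames / elapsed if elapsed > 0 else float('inf')


# Possible usage in the notebook (not executed here):
#   fps = measure_fps(take_frame, lambda img: img, show_frame)
#   print('frames per second:', fps)
```

Passing an identity function as the processing step gives the cost of the transfer alone, which can then be compared with a run that includes face detection.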


The same code as a notebook on Google Colab:

https://colab.research.google.com/drive/1j7HTapCLx7BQUBp3USiQPZkA0zBKgLM0?usp=sharing