如何使用 OCR 工具从屏幕区域即时提取文本？

Question

如何使用 OCR 工具从屏幕区域即时提取文本？

Erl*_*ing 31 software-recommendation screenshot ocr 12.10

在 Ubuntu 12.10 中，如果我输入

gnome-screenshot -a | tesseract output

Run Code Online (Sandbox Code Playgroud)

它返回：

** Message: Unable to use GNOME Shell's builtin screenshot interface, resorting to fallback X11.

Run Code Online (Sandbox Code Playgroud)

如何从屏幕中选择文本并将其转换为文本（剪贴板或文档）？

谢谢！

Answer 1

Sal*_*lem 44

也许已经有一些工具可以做到这一点，但是您也可以在尝试使用时使用一些屏幕截图工具和 tesseract 创建一个简单的脚本。

以此脚本为例（在我的系统中，我将其另存为/usr/local/bin/screen_ts）：

#!/bin/bash
# Dependencies: tesseract-ocr imagemagick scrot

select tesseract_lang in eng rus equ ;do break;done
# Quick language menu, add more if you need other languages.

SCR_IMG=`mktemp`
trap "rm $SCR_IMG*" EXIT

scrot -s $SCR_IMG.png -q 100 
# increase quality with option -q from default 75 to 100
# Typo "$SCR_IMG.png000" does not continue with same name.


mogrify -modulate 100,0 -resize 400% $SCR_IMG.png 
#should increase detection rate

tesseract $SCR_IMG.png $SCR_IMG &> /dev/null
cat $SCR_IMG.txt
exit

Run Code Online (Sandbox Code Playgroud)

并具有剪贴板支持：

#!/bin/bash 
# Dependencies: tesseract-ocr imagemagick scrot xsel

select tesseract_lang in eng rus equ ;do break;done
# quick language menu, add more if you need other languages.

SCR_IMG=`mktemp`
trap "rm $SCR_IMG*" EXIT

scrot -s $SCR_IMG.png -q 100    
# increase image quality with option -q from default 75 to 100

mogrify -modulate 100,0 -resize 400% $SCR_IMG.png 
#should increase detection rate

tesseract $SCR_IMG.png $SCR_IMG &> /dev/null
cat $SCR_IMG.txt | xsel -bi

exit

Run Code Online (Sandbox Code Playgroud)

它用于scrot获取屏幕，tesseract识别文本并cat显示结果。剪贴板版本还xsel用于将输出通过管道传输到剪贴板。

示例用法

注：scrot，xsel，imagemagick和tesseract-ocr是不是默认安装的，但可从默认的存储库。

您也许可以替换scrot为gnome-screenshot，但这可能需要大量工作。关于输出，您可以使用任何可以读取文本文件的内容（使用文本编辑器打开，将识别的文本显示为通知等）。

脚本的 GUI 版本

这是 OCR 脚本的简单图形版本，包括语言选择对话框：

#!/bin/bash
# DEPENDENCIES: tesseract-ocr imagemagick scrot yad
# AUTHOR:       Glutanimate 2013 (http://askubuntu.com/users/81372/)
# NAME:         ScreenOCR
# LICENSE:      GNU GPLv3
#
# BASED ON:     OCR script by Salem (http://askubuntu.com/a/280713/81372)

TITLE=ScreenOCR # set yad variables
ICON=gnome-screenshot

# - tesseract won't work if LC_ALL is unset so we set it here
# - you might want to delete or modify this line if you 
#   have a different locale:

export LC_ALL=en_US.UTF-8

# language selection dialog
LANG=$(yad \
    --width 300 --entry --title "$TITLE" \
    --image=$ICON \
    --window-icon=$ICON \
    --button="ok:0" --button="cancel:1" \
    --text "Select language:" \
    --entry-text \
    "eng" "ita" "deu")

# - You can modify the list of available languages by editing the line above
# - Make sure to use the same ISO codes tesseract does (man tesseract for details)
# - Languages will of course only work if you have installed their respective
#   language packs (https://code.google.com/p/tesseract-ocr/downloads/list)

RET=$? # check return status

if [ "$RET" = 252 ] || [ "$RET" = 1 ]  # WM-Close or "cancel"
  then
      exit
fi

echo "Language set to $LANG"

SCR_IMG=$(mktemp) # create tempfile
trap "rm $SCR_IMG*" EXIT # make sure tempfiles get deleted afterwards

scrot -s "$SCR_IMG".png -q 100 #take screenshot of area
mogrify -modulate 100,0 -resize 400% "$SCR_IMG".png # postprocess to prepare for OCR
tesseract -l "$LANG" "$SCR_IMG".png "$SCR_IMG" # OCR in given language
xsel -bi < "$SCR_IMG".txt # pass to clipboard
exit

Run Code Online (Sandbox Code Playgroud)

除了上面列出的依赖项之外，您还需要从 webupd8 PPA安装Zenity fork YAD以使脚本工作。

Answer 2

小智 8

为此，我创建了一个免费的开源程序：

https://danpla.github.io/dpscreenocr/

很棒的工具，解决了我的问题。我确实必须从源代码进行编译才能使其在 Ubuntu 18 上运行，但这非常简单 - 只需安装依赖项，然后运行 cmake。然后只需使用热键开始选择，热键结束，您就得到了历史记录中的文本！ (3认同)
优秀的工具！（也就是说，Tesseract有时会出错，但这不是工具的问题）。我通过 launchpad.net 在 Ubuntu 上安装了它，它告诉我需要安装 Tesseract 语言。在我手动安装了几种语言（通过 aptitude）后，一切正常！使用“aptitude search tesseract”或“apt-cache search tesseract”查看语言列表。重要提示：您可以使用**相同的热键**来开始和结束文本区域选择。并在“操作”中选择“将文本复制到剪贴板”选项。 (2认同)

归档时间：	12 年，5 月前
查看次数：	17882 次
最近记录：	5 年，4 月前