Getting started with speech recognition and Python

Question

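I would like to know where to get started with speech recognition. Not with a library or anything that is fairly "black-boxed"; instead, I want to know where I can actually write a simple speech recognition script myself. I have done some searching and found little, but from what I have seen there are dictionaries of "sounds" or syllables that can be pieced together to form text. So basically my question is: where can I get started with this?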

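Also, since this is a little optimistic, I would be fine with using a library in my program (for now). I saw that some speech-to-text libraries and APIs return only a single result. That is OK, but it would be unreliable. My current program already checks the grammar and everything else about any text entered, so if I had, say, the top ten results from the speech-to-text software, it could check each one and rule out any that don't make sense.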
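A minimal sketch of the filtering idea described above, assuming the recognizer returns its candidates as a list of strings ordered best-first. The names `pick_plausible`, `is_plausible` and `VOCABULARY` are hypothetical, and the vocabulary lookup is only a stand-in for a real grammar checker:

```python
def pick_plausible(hypotheses, is_plausible):
    """Return the hypotheses that pass the check, keeping the recognizer's order."""
    return [text for text in hypotheses if is_plausible(text)]


# Hypothetical stand-in for a real grammar checker: accept a sentence only if
# every word is in a small known vocabulary.
VOCABULARY = {"go", "forward", "ten", "meters", "back"}


def is_plausible(text):
    words = text.lower().split()
    return bool(words) and all(word in VOCABULARY for word in words)


# Made-up N-best list, best-scoring candidate first.
n_best = [
    "go for word ten meters",
    "go forward ten meters",
    "go forward tin meters",
]

survivors = pick_plausible(n_best, is_plausible)
best = survivors[0] if survivors else None  # None when every candidate is rejected
print(best)
```

Keeping the recognizer's order means the first survivor is still the highest-ranked candidate that makes sense to the checker.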

Answer 1

ale*_*xis 8

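If you really want to understand speech recognition from the ground up, look for a good signal processing package for Python, and then read up on speech recognition independently of the software.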
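To give a feel for what the signal processing side involves, here is a rough sketch in plain Python (no external package) that splits a signal into short overlapping frames and computes two simple per-frame features, short-time energy and zero-crossing rate. The sample rate, frame sizes, and synthetic tone are assumptions chosen for illustration; a real recognizer works on recorded audio and uses richer features.

```python
import math


def frames(samples, frame_len, hop):
    """Split a list of samples into overlapping fixed-length frames."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


SAMPLE_RATE = 8000  # assumed sample rate in Hz

# Synthetic test signal: 0.1 s of silence followed by 0.1 s of a 440 Hz tone.
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(800)]
signal = silence + tone

# 25 ms frames (200 samples) with a 10 ms hop (80 samples).
all_frames = frames(signal, 200, 80)
for index, frame in enumerate(all_frames):
    print(index,
          round(short_time_energy(frame), 3),
          round(zero_crossing_rate(frame), 3))
```

Frames that cover only silence have zero energy, while frames inside the tone do not, which is the basic idea behind separating speech from silence before any recognition is attempted.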

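But speech recognition is an extremely complex problem (mainly because sounds interact in all sorts of ways when we talk). Even if you start with the best speech recognition library you can get your hands on, you will by no means find yourself with nothing left to do.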

Answer 2

dr.*_*eox 7

UPDATE: this no longer works

because Google shut down its platform

-

You can use https://pypi.python.org/pypi/pygsr

$> pip install pygsr


Example usage:

from pygsr import Pygsr
speech = Pygsr()
# duration in seconds
speech.record(3)
# select the language
phrase, complete_response = speech.speech_to_text('en_US')

print(phrase)


-1 because it is just a wrapper around Google's "black box". It is not a toolkit you can use to see how speech recognition works. (4 upvotes)

Answer 3

toi*_*ine 6

Pocketsphinx is also a good alternative. Python bindings are provided through SWIG, which makes it easy to integrate into a script.

For example:

from os import path

from pocketsphinx import *
from sphinxbase import *

# Written against the older SWIG bindings used when this answer was posted;
# later pocketsphinx releases changed parts of this API.

MODELDIR = "../../../model"
DATADIR = "../../../test/data"

# Create a decoder with a specific acoustic model, language model and dictionary.
config = Decoder.default_config()
config.set_string('-hmm', path.join(MODELDIR, 'hmm/en_US/hub4wsj_sc_8k'))
config.set_string('-lm', path.join(MODELDIR, 'lm/en_US/hub4.5000.DMP'))
config.set_string('-dict', path.join(MODELDIR, 'lm/en_US/hub4.5000.dic'))
decoder = Decoder(config)

# Decode a static file.
with open(path.join(DATADIR, 'goforward.raw'), 'rb') as raw_file:
    decoder.decode_raw(raw_file)

# Retrieve the hypothesis.
hypothesis = decoder.hyp()
print('Best hypothesis: ', hypothesis.best_score, hypothesis.hypstr)

print('Best hypothesis segments: ', [seg.word for seg in decoder.seg()])

# Access the N best decodings (at most 10 here).
print('Best 10 hypotheses: ')
for best, i in zip(decoder.nbest(), range(10)):
    print(best.hyp().best_score, best.hyp().hypstr)

# Decode streaming data in 1024-byte chunks.
decoder = Decoder(config)
decoder.start_utt('goforward')
with open(path.join(DATADIR, 'goforward.raw'), 'rb') as stream:
    while True:
        buf = stream.read(1024)
        if buf:
            decoder.process_raw(buf, False, False)
        else:
            break
decoder.end_utt()
print('Stream decoding result:', decoder.hyp().hypstr)


Answer 4

ana*_*nik 6

For those who want to go deeper into the subject of speech recognition in Python, here are some links:

http://www.slideshare.net/mchua/sigproc-selfstudy-17323823 - signal processing in Python, including audio signals, which are the most interesting kind.

Answer 5

Noa*_*ser 5

I know the question is old, but just for people who come across it in the future:

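I use the speech_recognition module and I love it. ~~The only thing is that it requires Internet access, because it uses Google to recognize the speech. But that should not be a problem in most cases.~~ The recognition works almost perfectly.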

EDIT:

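The speech_recognition package is not limited to Google for recognition; it can also use CMUsphinx (which allows offline recognition). The only difference is a small change in the recognize call: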

https://pypi.python.org/pypi/SpeechRecognition/

Here is a small code example:

import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:                # use the default microphone as the audio source
    audio = r.listen(source)                   # listen for the first phrase and extract it into audio data

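try:
    print("You said " + r.recognize_google(audio))    # recognize speech using Google Speech Recognition - ONLINE
    print("You said " + r.recognize_sphinx(audio))    # recognize speech using CMUsphinx Speech Recognition - OFFLINE
except sr.UnknownValueError:                   # speech is unintelligible
    print("Could not understand audio")
except sr.RequestError as e:                   # recognizer unavailable (e.g. no network, or Sphinx not installed)
    print("Recognition request failed: {0}".format(e))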


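For me, only one thing does not work: listening in an infinite loop. After a few minutes it hangs. (It does not crash; it just stops responding.)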

EDIT: If you want to use the microphone in an infinite loop, you should specify a recording length. Example code:

import speech_recognition as sr

r = sr.Recognizer()
with sr.Microphone() as source:
    print("Speak:")
    audio = r.listen(source, timeout=None, phrase_time_limit=5)  # record for at most 5 seconds (example value)


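This does not seem to work for me. I keep speaking and waiting for it to spit out a result, but nothing happens! (2 upvotes)
Hmm, try letting it listen in an infinite loop. (Put the try-except block inside a "while 1" loop.) (2 upvotes)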

Asked:	13 years, 5 months ago
Viewed:	78122 times
Last active:	6 years, 6 months ago