Tag: speech-recognition

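Getting WAV file transcription to work with Sphinx4

I have installed Sphinx-4 on my Windows XP system, along with the JSAPI setup. I want to transcribe spoken-English WAV (or MP3) files into text.

When I run the "WavFile" demo, it runs successfully: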

java -jar WavFile.jar

However, when I pass it my own WAV file:

java -jar WavFile.jar c:\test.wav

I get:

Loading Recognizer as defined in 'jar:file:/C:/sphinx4-1.0beta3-bin/sphinx4-1.0beta3/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/config.xml'...

Decoding jar:file:/C:/sphinx4-1.0beta3-bin/sphinx4-1.0beta3/bin/WavFile.jar!/edu/cmu/sphinx/demo/wavfile/12345.wav
Result: one two three four five

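It seems the demo is set up to load and run a WAV file bundled inside the jar ("12345.wav"), or something along those lines.

I have read the documentation and cannot figure out how to set up "config.xml", or even which directory to put it in. I just want a simple proof of concept using the standard demo.

So the question is: how do I run a Sphinx4 program to transcribe a WAV file?

Thanks.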

speech-recognition speech-to-text cmusphinx

Jim*_*nes

2011 08-09

2
votes

1
answer

6628
views

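Speech to text on Android in offline mode

Is there any way to use Android's voice-to-text feature in offline mode?

The sample VoiceRecognition.java starts an activity with the intent RecognizerIntent.ACTION_RECOGNIZE_SPEECH.

Does that mean another APK has to be installed first for this to work, or do I need to write my own application that handles this intent?

I have been searching for an answer, but I am still confused...

Here is the code I used: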

private static final int VOICE_RECOGNITION_REQUEST_CODE = 1234;

private ListView mList;

/**
 * Called when the activity is first created.
 */
@Override
public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);

    // Inflate our UI from its XML layout description.
    setContentView(R.layout.voice_recognition);

    // Get display items for later interaction
    Button speakButton = (Button) findViewById(R.id.btn_speak);

    mList = (ListView) findViewById(R.id.list);

    // Check to see if a recognition activity is present
    PackageManager pm = getPackageManager();
    List<ResolveInfo> activities = pm.queryIntentActivities(
            new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH), …

android speech-recognition

ami*_*ser

2011 01-19

2
votes

1
answer

30k
views

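SpeechRecognitionEngine: UnloadAllGrammars is very slow

I am using the SpeechRecognitionEngine from .NET (C#). I want to load and unload grammars as my application's states require them. That seems like a good way to limit the chance of false positives; however, when I call UnloadAllGrammars(), it sometimes takes almost a minute to complete.
Any idea why this happens?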

c# speech-recognition

Ed.*_*Ed.

lucky-day

2
votes

1
answer

490
views

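Using the Google API: speech to text on a PC

Google Chrome offers speech to text (STT), and many smartphone apps offer STT as well. The recognition quality is very good.

I want to use it from a program written in Visual Studio (MFC), but there is no built-in way to perform STT there. If I could use the Google Speech To Text API, the problem would be easy to solve.

If there is no public Google API for STT, please tell me another approach, other than launching it.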

mfc speech-recognition

bTa*_*ger

lucky-day

2
votes

1
answer

10k
views

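Changing the default language for speech recognition in my app

I built an app in English, and it uses speech recognition. However, if I install it on a device that uses a different system language (for example French or Russian), speech recognition does not work. It only works for the system's default language. How can I make my app use English for speech recognition by default?

I found this approach, but it does not work: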

Locale myLocale;
myLocale = new Locale("English (US)", "en_US");
Locale.setDefault(myLocale);
android.content.res.Configuration config = new android.content.res.Configuration();
config.locale = myLocale;
getBaseContext().getResources().updateConfiguration(config, getBaseContext().getResources().getDisplayMetrics());

android speech-recognition text-to-speech speech-to-text

Joh*_*Pix

2015 10-09

2
votes

1
answer

2954
views

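UnicodeDecodeError with a sound file

I am trying to build a speech recognizer in Python using the Google Speech API. I have been using and adapting the code from here (converting it to Python 3). I am using an audio file on my computer that was converted from MP3 to 16000 Hz FLAC with an online converter (as specified in the original code). When I run the code I get this error: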

$ python3 speech_api.py 02-29-2016_00-12_msg1.flac 
Traceback (most recent call last):
  File "speech_api.py", line 12, in <module>
    data = f.read()
  File "/usr/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 9: invalid start byte

Here is my code. (I am sure there are still things that do not work in Python 3, since I have been trying to adapt it and I am new to urllib...)

#!/usr/bin/python
import sys
from urllib.request import urlopen
import json
try:
    filename = sys.argv[1]
except IndexError:
    print('Usage: transcribe.py <file>') …

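The traceback above is what Python 3 raises when a binary file is opened in text mode and its bytes are decoded as UTF-8. The script is truncated before the open() call, so this is only a minimal sketch under the assumption that the file was opened without "b" in the mode. The helper name read_audio_bytes and the sample bytes are made up for illustration:

```python
import os
import tempfile


def read_audio_bytes(path):
    """Return the file contents as raw bytes (hypothetical helper)."""
    # "rb" skips text decoding, so bytes such as 0x80 are returned untouched.
    with open(path, "rb") as f:
        return f.read()


# Illustrative payload only: not a valid FLAC file, just bytes that are not UTF-8.
sample = b"fLaC\x00\x00\x00\x22\x10\x80\x00"

with tempfile.NamedTemporaryFile(suffix=".flac", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

try:
    data = read_audio_bytes(sample_path)  # binary mode: no decoding involved
    try:
        with open(sample_path, "r", encoding="utf-8") as f:
            f.read()  # text mode tries to decode the bytes as UTF-8
        text_mode_failed = False
    except UnicodeDecodeError:
        text_mode_failed = True
finally:
    os.remove(sample_path)

print(type(data).__name__, len(data), text_mode_failed)
```

If the bytes are then sent with urllib.request, they can be passed as the request body unchanged, since that body is expected to be bytes rather than str.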

python audio speech-recognition utf-8 python-3.x

Ing*_*rid

2017 05-23

2
votes

1
answer

2337
views

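SpeechRecognizer: bind to recognition service failed

I use SpeechRecognizer on Android to recognize the user's voice. It worked well until I uninstalled the Google app. (https://play.google.com/store/apps/details?id=com.google.android.googlequicksearchbox&hl=en)

I updated the Google app, but I now get errors such as "bind to recognition service failed". How can I get my app to run successfully?

What do I need to do to use SpeechRecognizer normally?

Thanks.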

android speech-recognition bindservice

hob*_*dev

2017 09-03

2
votes

2
answers

3759
views

How to change the language of Google speech recognition

My code:

import speech_recognition as sr

r = sr.Recognizer()

with sr.Microphone() as source:
    # Record one phrase from the default microphone.
    audio = r.listen(source)
    try:
        print("You said: " + r.recognize_google(audio) + " in french")
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))

I want to change the listening language to French. How do I do that?
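
With the SpeechRecognition package, a common approach is to pass a language tag through the language keyword of recognize_google. A minimal sketch, assuming "fr-FR" is the tag wanted; the wrapper transcribe and the stand-in FakeRecognizer class are hypothetical and exist only so the call can be tried without a microphone or network access:

```python
def transcribe(recognizer, audio, language="fr-FR"):
    """Hypothetical wrapper: forward a language tag to recognize_google."""
    # recognize_google accepts a `language` keyword; its default is "en-US".
    return recognizer.recognize_google(audio, language=language)


class FakeRecognizer:
    """Stand-in for sr.Recognizer that records the language it was given."""

    def __init__(self):
        self.seen_language = None

    def recognize_google(self, audio_data, language="en-US"):
        self.seen_language = language
        return "bonjour"


fake = FakeRecognizer()
text = transcribe(fake, audio=b"")
print("You said: " + text + " (language=" + fake.seen_language + ")")
```

With the real library, the equivalent call in the code from the question would be r.recognize_google(audio, language="fr-FR").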

python speech-recognition python-3.x french google-speech-api

Sud*_*dar

2018 04-09

2
votes

1
answer

20k
views

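Kaldi: 'Output of qsub was: qsub: illegal -c value ""' when trying to run the Common Voice recipe

I am trying to run Kaldi's Common Voice recipe (kaldi/egs/commonvoice/s5/run.sh) on my own computer (i.e., not on a cluster). It crashes with the error message: Output of qsub was: qsub: illegal -c value "". What could be the problem?

Specifically, here is the full error output: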

[...]
Succeeded in formatting LM: 'data/local/lm.gz'
steps/make_mfcc.sh --cmd queue.pl --mem 2G --nj 20 data/valid_train exp/make_mfcc/valid_train mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/valid_train
steps/make_mfcc.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
queue.pl: Error submitting jobs to queue (return status was 512)
queue log file is exp/make_mfcc/valid_train/q/make_mfcc_valid_train.log, command was qsub -v PATH -cwd -S /bin/bash -j y -l arch=*64* …

speech-recognition qsub kaldi

Fra*_*urt

2018 07-02

2
votes

1
answer

1514
views

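Cannot resolve the missing google-api-python-client module when using speech recognition

I am trying to run speech recognition on a Tinker Board with Armbian installed. I always get this error: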

ERROR - Error fetching results from Speech Recognition service missing google-api-python-client module: ensure that google-api-python-client is set up correctly.

Yet when I check the installed packages with pip list, I can see that google-api-python-client is installed.

pip list output:

cachetools (2.1.0)
certifi (2018.10.15)
chardet (3.0.4)
google-api-python-client (1.7.4)
google-auth (1.5.1)
google-auth-httplib2 (0.0.3)
httplib2 (0.11.3)
idna (2.7)
Mirage (0.9.5.2)
pip (9.0.1)
pyasn1 (0.4.4)
pyasn1-modules (0.2.2)
PyAudio (0.2.11)
pycairo (1.16.2)
requests (2.20.0)
rsa (4.0)
setuptools (40.4.3)
six (1.11.0)
SpeechRecognition (3.8.1)
uritemplate (3.0.0)
urllib3 (1.24)
wheel (0.32.2) …

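Since pip list already shows google-api-python-client, one thing worth checking is whether every module involved can actually be imported by the interpreter that runs the script. A small diagnostic sketch: the helper missing_modules is hypothetical, and the assumption that oauth2client is needed alongside googleapiclient (the import name of google-api-python-client) should be verified against the installed SpeechRecognition source:

```python
import importlib
import sys


def missing_modules(names):
    """Return the module names that fail to import in this interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# Assumed candidates; adjust after checking the library's own import block.
candidates = ["googleapiclient", "oauth2client"]

print("interpreter: %s" % sys.executable)
print("missing: %s" % missing_modules(candidates))
```

Running it with the same python executable that launches the speech script matters, because pip may be installing into a different interpreter than the one in use.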

speech-recognition python-2.7 google-api-python-client google-cloud-speech armbian

Lou*_*oui

2018 10-26

2
votes

1
answer

2683
views