Tag: google-speech-api

(Android Studio speech recognizer) Even though I granted RECORD_AUDIO and INTERNET, I still get error 9 (insufficient permissions)

package blessupboys.speechtest;
import android.app.Activity;
import android.content.Context;
import android.content.Intent;
import android.net.ConnectivityManager;
import android.os.Bundle;
import android.view.View;
import android.view.View.OnClickListener;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.widget.Button;
import android.widget.TextView;
import java.util.ArrayList;
import android.util.Log;



public class VoiceRecognitionTest extends Activity implements OnClickListener
{

    private TextView mText;
    private SpeechRecognizer sr;
    private static final String TAG = "MyStt3Activity";
    @Override
    public void onCreate(Bundle savedInstanceState)
    {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_voice_recognition_test);
        Button speakButton = (Button) findViewById(R.id.btn_speak);
        mText = (TextView) findViewById(R.id.textView1);
        speakButton.setOnClickListener(this);
        sr = SpeechRecognizer.createSpeechRecognizer(this);
        sr.setRecognitionListener(new listener());
    }

    class listener implements RecognitionListener
    { …

android speech-recognition android-studio google-speech-api nsspeechrecognizer

Rob*_*ngh

Score: 5 | Answers: 3 | Views: 4504

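Is it impossible to use curl with the Google Cloud Speech API to recognize a 10 to 15 minute file?

I'm using the REST API with cURL because I need to do something quick and simple, and I'm on a box where I can't start dumping garbage, i.e. some thick developer SDK.

I started by base64-encoding the FLAC file and calling speech.syncrecognize.

That eventually failed: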

{
  "error": {
    "code": 400,
    "message": "Request payload size exceeds the limit: 10485760.",
    "status": "INVALID_ARGUMENT"
  }
}

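The limit in the error above applies to the request payload, and base64 encoding inflates raw audio by about a third. As a rough sketch, the encoded size can be estimated before sending anything. The function names are hypothetical, the 10485760-byte limit is simply the figure quoted in the error message, and the overhead of the surrounding JSON is ignored.

```python
import base64
import math

# Limit quoted in the error message above (10 MiB).
PAYLOAD_LIMIT_BYTES = 10485760


def base64_size(raw_size: int) -> int:
    """Return the length of the padded base64 encoding of raw_size bytes."""
    return 4 * math.ceil(raw_size / 3)


def fits_inline(raw_size: int, limit: int = PAYLOAD_LIMIT_BYTES) -> bool:
    """True if the base64-encoded audio alone stays within the payload limit."""
    return base64_size(raw_size) <= limit


# Sanity check of the size formula against the real encoder.
assert base64_size(10) == len(base64.b64encode(b"x" * 10))
```

Under these assumptions, any file above 7,864,320 raw bytes (7.5 MiB) no longer fits inline, so a file of the size discussed here has to be referenced from Cloud Storage instead.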

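OK, you can't send 31,284,578 bytes in a request; you have to use Cloud Storage. So I uploaded the FLAC audio file and retried using the file in Cloud Storage. That failed: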

{
  "error": {
    "code": 400,
    "message": "For audio inputs longer than 1 min, use the 'AsyncRecognize' method.",
    "status": "INVALID_ARGUMENT"
  }
}

Great, speech.syncrecognize doesn't like the size of the content; try again with speech.asyncrecognize. That failed:

{
  "error": {
    "code": 400,
    "message": "For audio inputs longer than 1 min, please use LINEAR16 encoding.",
    "status": "INVALID_ARGUMENT"
  }
}

OK, so speech.asyncrecognize …

rest curl speech-recognition google-speech-api

tlu*_*lum

2016 08-01

Score: 5 | Answers: 1 | Views: 2368

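Is it possible to manually provide a GoogleCredential to SpeechClient (in the .NET API)?

All the SpeechClient documentation I've found involves either running a command line after downloading the SDK, or awkwardly setting the "GOOGLE_APPLICATION_CREDENTIALS" environment variable to point to a local credentials file.

I hate the environment variable approach and would rather have a solution that loads a shared, source-controlled development account file from the application root. Something like this: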

var credential = GoogleCredential.FromStream(/*load shared file from app root*/);
var client = SpeechClient.Create(/*I wish I could pass credential in here*/);

Is there a way to do this so that I don't have to rely on the environment variable?

.net c# google-cloud-platform google-speech-api

Col*_*lin

Score: 5 | Answers: 1 | Views: 3734

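Google Speech-to-Text with OGG_OPUS

Does the Google Speech-to-Text API really support the OGG_OPUS codec (for streaming audio)? I've had no luck getting it to work. If anyone has managed to get it working, could you share any code snippets or suggestions? Basically, the Google API returns no transcription results. The SPEEX codec has the same problem...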

google-speech-api

Fit*_*its

Score: 5 | Answers: 0 | Views: 299

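How do I identify multiple speakers and their text from an audio input?

I'm using Microsoft's Cognitive Services. I have an audio input and need to identify multiple speakers and the text each of them spoke.

As I understand it, the Speaker Recognition API can identify different individuals, and the Bing Speech API can convert speech to text. To do both at once, however, I have to manually split the audio file into parts (based on pauses/silence) and then send the audio streams to the individual services. Is there a better way to do this? Should I switch to AWS Lex/Polly, Google's offerings, or some other ecosystem?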
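The manual splitting step described above (cutting on pauses before sending each part to a service) could be sketched roughly as follows. This is an illustration only: `split_on_silence`, its threshold, and the minimum silence length are hypothetical choices, the samples are assumed to be plain amplitude values, and nothing here is part of any Microsoft, AWS, or Google API.

```python
from typing import List, Sequence, Tuple


def split_on_silence(
    samples: Sequence[float],
    threshold: float = 0.02,
    min_silence: int = 3,
) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs of non-silent segments; end is exclusive.

    A run of at least min_silence consecutive samples with
    abs(value) < threshold ends the current segment.
    """
    segments = []
    start = None   # index where the current segment began
    quiet_run = 0  # length of the current run of quiet samples
    for i, value in enumerate(samples):
        if abs(value) >= threshold:
            if start is None:
                start = i
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= min_silence:
                # Close the segment at the first quiet sample of the run.
                segments.append((start, i - quiet_run + 1))
                start = None
                quiet_run = 0
    if start is not None:
        # Audio ended mid-segment; trim any trailing quiet samples.
        segments.append((start, len(samples) - quiet_run))
    return segments
```

Real recordings would more likely be analysed in short windows of averaged energy than sample by sample, and splitting on silence alone does not say who is speaking in each segment.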

speech-recognition ibm-watson microsoft-cognitive google-speech-api dialogflow-es

bla*_*cer

Score: 4 | Answers: 1 | Views: 5060

Making Python speech recognition faster

I've been using Google speech recognition with Python. Here is my code:

import speech_recognition as sr
r = sr.Recognizer()
with sr.Microphone() as source:
   print("Say something!")
   audio = r.listen(source)
   print(r.recognize_google(audio))

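Although the recognition is very accurate, it takes about 4-5 seconds to produce the recognized text. Since I'm building a voice assistant, I'd like to modify the code above to make the speech recognition faster.

Is there any way to bring this down to about 1-2 seconds? If possible, I'd like recognition to be as fast as services such as Siri and Ok Google.

I'm very new to Python, so I apologize if my question has a simple answer.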
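One part of the delay that can be tuned locally is how long the recognizer waits for silence before it decides the phrase has ended. The sketch below assumes that this wait accounts for a noticeable share of the 4-5 seconds, which is not confirmed here; the network round trip to Google is unaffected. `tune_for_latency` is a hypothetical helper, while `pause_threshold`, `non_speaking_duration`, and the `phrase_time_limit` argument of `listen` are existing settings of the speech_recognition library.

```python
def tune_for_latency(recognizer, pause_threshold=0.5):
    """Shorten the silence the recognizer waits for before ending a phrase."""
    # listen() requires pause_threshold >= non_speaking_duration.
    recognizer.non_speaking_duration = min(
        recognizer.non_speaking_duration, pause_threshold
    )
    recognizer.pause_threshold = pause_threshold
    return recognizer


def main():
    # Imported here so the helper above can be used without a microphone.
    import speech_recognition as sr

    r = tune_for_latency(sr.Recognizer())
    with sr.Microphone() as source:
        print("Say something!")
        # phrase_time_limit caps how long a single phrase may be recorded.
        audio = r.listen(source, phrase_time_limit=5)
    print(r.recognize_google(audio))
```

Calling `main()` runs one recognition round. A threshold that is too low may cut speakers off mid-sentence, so the value 0.5 is only a starting point.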

python speech-recognition dictation google-speech-api

Man*_*kla

Score: 4 | Answers: 1 | Views: 20k

I'm trying to call Google's Speech-to-Text API from Node.js, and it apparently works. But when I run the provided example (node MicrophoneStream.js) after installing it as described, I get the following error:

STDERR: sox FAIL sox: Sorry, there is no default audio device configured

I don't really know how to pass the device as an argument. I'm assuming it uses the default microphone, but I'm not sure, because in some …

windows speech-to-text sox node.js google-speech-api

the*_*shy

2019 07-26

Score: 4 | Answers: 1 | Views: 7978

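Send a microphone recording from the browser to Google Speech-to-Text - Javascript

I want to send a microphone recording from the browser to Google Speech-to-Text. Streaming and sockets are not needed, nor is an HTTP request through Node.js to the Google server, nor an HTTP request from the client (browser) side.

The problem I'm facing:

The client-side implementation is done, and so is the server-side implementation. Both work independently of each other. I'm getting audio data from the microphone and can play it back, and I can test the server-side implementation with the audio.raw sample provided by Google.

However, when I try to send the microphone data from the browser to my Node server and then on to the Google server, I get an encoding problem: "Getting an empty response from the Google server".

My question is how to change the encoding of the audio file and then send it to the Google Speech-to-Text server using Javascript.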
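An empty response can be a sign that the audio bytes do not match the encoding declared in the request. As a sketch of the conversion involved, assume the browser delivers 32-bit float samples in the range [-1, 1] (as the Web Audio API does) and that the request declares LINEAR16; the samples then need to become 16-bit little-endian signed integers. The question asks for Javascript, so treat this Python version as an outline of the logic only; the function name is hypothetical.

```python
import struct
from typing import Iterable


def float_to_linear16(samples: Iterable[float]) -> bytes:
    """Convert float samples in [-1.0, 1.0] to 16-bit little-endian PCM bytes."""
    out = bytearray()
    for s in samples:
        s = max(-1.0, min(1.0, s))  # clamp out-of-range values
        # Scale negatives by 32768 and positives by 32767 to cover the int16 range.
        value = int(s * 32768) if s < 0 else int(s * 32767)
        out += struct.pack("<h", value)
    return bytes(out)
```

The sample rate declared in the request would also have to match the rate of the recording, which this sketch does not address.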

javascript node.js google-speech-api

Vno*_*mar

2019 08-16

Score: 4 | Answers: 2 | Views: 3139

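Duplicate class error when using Firestore and Google Speech to Text

When I try to build a project that uses both the Firestore and Google Speech to Text libraries, I get a "Duplicate class" error. As I understand it, this happens because both libraries pull in the protobuf libraries. Excluding them produces runtime errors. How can I resolve the duplication?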

Here is the error (truncated; there are hundreds more lines):

Duplicate class com.google.api.Advice found in modules jetified-proto-google-common-protos-1.17.0.jar (com.google.api.grpc:proto-google-common-protos:1.17.0) and jetified-protolite-well-known-types-17.0.0-runtime.jar (com.google.firebase:protolite-well-known-types:17.0.0)
Duplicate class com.google.api.Advice$1 found in modules jetified-proto-google-common-protos-1.17.0.jar (com.google.api.grpc:proto-google-common-protos:1.17.0) and jetified-protolite-well-known-types-17.0.0-runtime.jar (com.google.firebase:protolite-well-known-types:17.0.0)
Duplicate class com.google.api.Advice$Builder found in modules jetified-proto-google-common-protos-1.17.0.jar (com.google.api.grpc:proto-google-common-protos:1.17.0) and jetified-protolite-well-known-types-17.0.0-runtime.jar (com.google.firebase:protolite-well-known-types:17.0.0)

Here are my dependencies:

dependencies {
    implementation fileTree(dir: 'libs', include: ['*.jar'])
    implementation "org.jetbrains.kotlin:kotlin-stdlib-jdk7:$kotlin_version"
    implementation 'androidx.appcompat:appcompat:1.1.0'
    implementation 'androidx.core:core-ktx:1.2.0'
    implementation 'androidx.constraintlayout:constraintlayout:1.1.3'
    implementation 'androidx.recyclerview:recyclerview:1.1.0'
    implementation 'com.google.code.gson:gson:2.8.6'
    implementation 'com.google.firebase:firebase-analytics:17.4.1'
    implementation 'com.firebaseui:firebase-ui-auth:6.2.0'
    implementation 'com.google.firebase:firebase-firestore:21.4.3'

    // add these dependencies for the speech client
    implementation 'io.grpc:grpc-okhttp:1.29.0'
    …

android build.gradle google-speech-api google-cloud-firestore

Jos*_*ano

2020 05-18

Score: 4 | Answers: 1 | Views: 2521