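Tag: google-speech-api

Base64 decoding fails for the Google Speech API

I am trying to send a POST request to https://speech.googleapis.com/v1/speech:recognize using the JSON and the code snippet below. For some reason, Google responds that it cannot decode the Base64 in my request.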

{ "config": { "encoding": "LINEAR16", "sampleRateHertz": 16000, "languageCode": "ja-JP", "maxAlternatives": 5, "profanityFilter": false }, "audio": { "content ": "ZXCVBNM" }, }

    String pcmFilePath = "/storage/emulated/0/Download/voice8K16bitmono.pcm";
    File rawFile = new File(pcmFilePath);
    byte[] rawData = new byte[(int) rawFile.length()];
    DataInputStream input = null;
    try {
        input = new DataInputStream(new FileInputStream(rawFile));
        int readResult = input.read(rawData);
    } catch (Exception ex) {
        ex.printStackTrace();
    }
    if (input != null) {
        input.close();
    };

    String base64 = Base64.encodeToString(rawData, Base64.DEFAULT);
    String …
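
Two details in the post may be worth checking, though the source confirms neither as the cause. First, the JSON key is written as "content " with a trailing space, and the object ends with a trailing comma. Second, Android's Base64.DEFAULT inserts line terminators into its output, whereas Base64.NO_WRAP emits a single line. Below is a minimal sketch of the same idea in Python using only the standard library. The helper name build_recognize_body is hypothetical, and the config values are copied from the post.

```python
import base64
import json


def build_recognize_body(pcm_bytes: bytes, sample_rate_hz: int = 16000) -> str:
    """Return a JSON request body for speech:recognize (hypothetical helper)."""
    # b64encode emits one unbroken line; base64.encodebytes would insert "\n"
    # every 76 characters, much like Android's Base64.DEFAULT.
    content = base64.b64encode(pcm_bytes).decode("ascii")
    body = {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate_hz,
            "languageCode": "ja-JP",
            "maxAlternatives": 5,
            "profanityFilter": False,
        },
        # The key is exactly "content", with no trailing space.
        "audio": {"content": content},
    }
    # json.dumps never emits a trailing comma.
    return json.dumps(body)
```

The file name in the post (voice8K16bitmono.pcm) hints at 8 kHz audio while the config declares 16000, so sample_rate_hz may also need to match the actual recording.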

android google-apis-explorer google-speech-api

Zod*_*123

2
Votes

1
Answers

4185
Views

How to change the language for Google speech recognition

My code:

with sr.Microphone() as source:
    audio = r.listen(source)
    try:
        print("You said: " + r.recognize_google(audio) + "in french")
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service")

I want to change the listening language to French. How do I do that?
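
In the SpeechRecognition library, recognize_google accepts a language keyword that defaults to "en-US". Here is a minimal sketch that passes "fr-FR" instead. The function names recognize_french and listen_and_print are hypothetical, and the microphone part assumes the SpeechRecognition and PyAudio packages are installed.

```python
def recognize_french(recognizer, audio):
    """Transcribe `audio` as French with SpeechRecognition's Google recognizer."""
    # recognize_google defaults to language="en-US"; "fr-FR" requests French.
    return recognizer.recognize_google(audio, language="fr-FR")


def listen_and_print():
    """Capture one phrase from the default microphone and print the French text."""
    # Imported lazily so the helper above can be used without a microphone.
    import speech_recognition as sr  # third-party package: SpeechRecognition

    r = sr.Recognizer()
    with sr.Microphone() as source:
        audio = r.listen(source)
    try:
        print("You said (French): " + recognize_french(r, audio))
    except sr.UnknownValueError:
        print("Google Speech Recognition could not understand audio")
    except sr.RequestError as e:
        print("Could not request results from Google Speech Recognition service; {0}".format(e))
```

Calling listen_and_print() records one phrase and prints the transcription. Other language tags can be substituted in the same way.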

python speech-recognition python-3.x french google-speech-api

Sud*_*dar

2018-04-09

2
Votes

1
Answers

20k
Views

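Google Speech API: "Sample rate in request does not match FLAC header"

I am trying to convert an mp4 video clip to a FLAC audio file and then have Google Speech spit out the words spoken in the video, so that I can detect whether specific words were said.

I have everything working, except that I get an error from the Speech API: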

{
  "error": {
    "code": 400,
    "message": "Sample rate in request does not match FLAC header.",
    "status": "INVALID_ARGUMENT"
  }
}

I am using FFMPEG to convert the mp4 to a FLAC file. I specify in the command that the FLAC file should be 16-bit, but when I right-click the FLAC file, Windows tells me it is 302kbps.

Here is my PHP code:

// convert mp4 video to 16 bit flac audio file
$cmd = 'C:/wamp/www/ffmpeg/bin/ffmpeg.exe -i C:/wamp/www/test.mp4 -c:a flac -sample_fmt s16 C:/wamp/www/test.flac';
exec($cmd, $output);

// convert flac to text so we can detect if certain words were said
$data = array(
    "config" => array(
        "encoding" => "FLAC",
        "sampleRate" => 16000,
        "languageCode" => "en-US"
    ),
    "audio" => array(
        "content" …
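
The ffmpeg command in the post sets the sample format (-sample_fmt s16) but not the sample rate, so the FLAC file keeps whatever rate the mp4's audio track had, while the request declares 16000. The Python sketch below reads the rate from the FLAC STREAMINFO block and builds a command that resamples with -ar. The helper names are hypothetical, and downmixing to mono with -ac 1 is an assumption about what the API expects, not something the post states.

```python
def flac_sample_rate(header: bytes) -> int:
    """Read the sample rate (Hz) from the first 21+ bytes of a FLAC file."""
    if len(header) < 21 or header[:4] != b"fLaC":
        raise ValueError("not a FLAC stream")
    if header[4] & 0x7F != 0:
        raise ValueError("first metadata block is not STREAMINFO")
    # Layout: 4-byte marker, 4-byte block header, 10 bytes of block and frame
    # sizes, then a 20-bit sample rate starting at byte offset 18.
    b = header[18:21]
    return (b[0] << 12) | (b[1] << 4) | (b[2] >> 4)


def ffmpeg_flac_command(src: str, dst: str, rate: int = 16000) -> list:
    """Build an ffmpeg argument list that writes 16-bit FLAC at `rate` Hz."""
    return [
        "ffmpeg", "-i", src,
        "-vn",                 # drop the video stream
        "-c:a", "flac",
        "-sample_fmt", "s16",  # 16-bit samples, as in the post
        "-ar", str(rate),      # resample so the header matches the request
        "-ac", "1",            # assumption: mono input is wanted
        dst,
    ]
```

Reading the first bytes of test.flac with open(path, "rb").read(64) and passing them to flac_sample_rate shows which value the request's sample rate field has to match.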

php ffmpeg flac google-speech-api

kjd*_*n84

1
Votes

1
Answers

1455
Views

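enable_speaker_diarization tag error in Google Cloud Speech-to-Text

Using Google Speech-to-Text, I can transcribe an audio clip with the default parameters. However, I get an error message when I use the enable_speaker_diarization tag to distinguish the individual speakers in the audio clip. The Google documentation is here. It is a long audio clip, so I am using the asynchronous request that Google recommends, here.

My code -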

def transcribe_gcs(gcs_uri):
    from google.cloud import speech
    from google.cloud import speech_v1 as speech
    from google.cloud.speech import enums
    from google.cloud.speech import types
    client = speech.SpeechClient()
    audio = types.RecognitionAudio(uri = gcs_uri)
    config = speech.types.RecognitionConfig(encoding=speech.enums.RecognitionConfig.AudioEncoding.FLAC,
                                            sample_rate_hertz= 16000,
                                            language_code = 'en-US',
                                            enable_speaker_diarization=True,
                                            diarization_speaker_count=2)

    operation = client.long_running_recognize(config, audio)
    print('Waiting for operation to complete...')
    response = operation.result(timeout=3000)
    result = response.results[-1]

    words_info = result.alternatives[0].words

    for word_info in words_info:
        print("word: '{}', speaker_tag: {}".format(word_info.word, word_info.speaker_tag))

After calling -

transcribe_gcs('gs://bucket_name/filename.flac')

I get the error

ValueError: Protocol message RecognitionConfig has no "enable_speaker_diarization" …
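
The post imports speech_v1, while Google's diarization samples from that period imported the client from google.cloud.speech_v1p1beta1, whose RecognitionConfig carried the diarization fields. Here is a minimal sketch under that assumption. It also assumes a pre-2.0 google-cloud-speech release that still exposes speech.types and speech.enums. The helper names are hypothetical.

```python
def diarization_config_kwargs(speaker_count=2):
    """Keyword arguments for RecognitionConfig, kept separate for inspection."""
    return {
        "sample_rate_hertz": 16000,
        "language_code": "en-US",
        "enable_speaker_diarization": True,
        "diarization_speaker_count": speaker_count,
    }


def transcribe_gcs_with_speakers(gcs_uri):
    """Return (word, speaker_tag) pairs for a FLAC file stored in GCS."""
    # Assumption: the beta module is installed and has the diarization fields.
    from google.cloud import speech_v1p1beta1 as speech

    client = speech.SpeechClient()
    audio = speech.types.RecognitionAudio(uri=gcs_uri)
    config = speech.types.RecognitionConfig(
        encoding=speech.enums.RecognitionConfig.AudioEncoding.FLAC,
        **diarization_config_kwargs()
    )
    operation = client.long_running_recognize(config, audio)
    response = operation.result(timeout=3000)
    # As in the post, the last result is read for the word-level speaker tags.
    words = response.results[-1].alternatives[0].words
    return [(w.word, w.speaker_tag) for w in words]
```

Newer releases of the library may configure diarization differently, so the installed version's RecognitionConfig reference is the thing to check first.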

speech-to-text python-3.x google-cloud-platform google-speech-api google-cloud-speech

Asi*_*ikh

2019-01-20

1
Votes

1
Answers

1007
Views

Class google.cloud.speech.v1.LongRunningRecognizeMetadata hasn't been added to descriptor pool

I get this error when trying to resume a Speech to Text operation.

Google\Protobuf\Internal\GPBDecodeException: Error occurred during parsing: Class google.cloud.speech.v1.LongRunningRecognizeMetadata hasn't been added to descriptor pool in Google\Protobuf\Internal\Message->parseFromJsonStream()

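What I am doing is starting a long-running operation and storing its name. Later, on a separate page, I want to show the status of the operation based on the name I stored earlier.

This is what I am using to try to get the status of the operation: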

$speechClient = new SpeechClient();
$operationResponse = $speechClient->resumeOperation($record->operation_name, 'longRunningRecognize');

Is it possible to do something like this?

php grpc google-speech-api google-cloud-speech

Pau*_*ake

1
Votes

1
Answers

241
Views

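How to set the listening duration in the Python speech recognition library?

I am working on an RPi 4 and have the code working, but my speech recognition object listens to the microphone for a very long time, almost 10 seconds. I want to reduce this time. I looked through the speech recognition library's documentation, but it does not mention such a feature anywhere.

The Python editor gives me the following hint for the listen() function:

self: Recognizer, source, timeout=None, phrase_time_limit=None, snowboy_configuration=None

So I tried calling the function as follows:

audio = r.listen(source,None,3)

or

audio = r.listen(source,3,3)

hoping it would listen for 3 seconds, but it does not.

Here is my code: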

import speech_recognition as sr

r = sr.Recognizer()

speech = sr.Microphone(2)

#print(sr.Microphone.list_microphone_names())

while 1:

    with speech as source:
        print("say something!…")
        audio = r.adjust_for_ambient_noise(source)
        audio = r.listen(source,None,3)
        print("the audio has been recorded")
    # Speech recognition using Google Speech Recognition
    try:
        print("api is enabled")
        recog = r.recognize_google(audio, language = …
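
Besides listen(), the SpeechRecognition Recognizer also has record(source, duration=...), which captures a clip of fixed length instead of waiting for a pause in speech. Here is a minimal sketch of that alternative. The helper name capture_fixed_clip is hypothetical, and whether a fixed clip suits the application is an assumption.

```python
def capture_fixed_clip(recognizer, source, seconds=3):
    """Record `seconds` of audio from an already opened source."""
    # Calibrate the energy threshold for about half a second. This call
    # returns None, so its result is not assigned to anything.
    recognizer.adjust_for_ambient_noise(source, duration=0.5)
    # record() stops after `duration` seconds regardless of what is heard.
    return recognizer.record(source, duration=seconds)
```

Inside the with block from the post this would read audio = capture_fixed_clip(r, source, 3). To stay with listen(), the keyword form r.listen(source, timeout=None, phrase_time_limit=3) makes it explicit which argument is which.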

python speech-recognition google-speech-api

The*_*oud

1
Votes

1
Answers

8712
Views

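Google Speech-To-Text ignores custom phrases/words

I am using python3 to transcribe audio files with Google Speech-to-Text via the provided Python package (google-speech).

There is an option to define custom phrases for transcription, as described in the documentation: https://cloud.google.com/speech-to-text/docs/speech-adaptation

For testing purposes, I am using a small audio file containing the text:

[..] in this lecture we will discuss the Burrows Wheeler transform and the FM index [..]

I supply the following phrases to see the effect, for example when I want a specific name to be recognized with the correct spelling. In this example, I want to change burrows to barrows: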

config = speech.RecognitionConfig(dict(
    encoding=speech.RecognitionConfig.AudioEncoding.ENCODING_UNSPECIFIED,
    sample_rate_hertz=24000,
    language_code="en-US",
    enable_word_time_offsets=True,
    speech_contexts=[
        speech.SpeechContext(dict(
            phrases=["barrows", "barrows wheeler", "barrows wheeler transform"]
        ))
    ]
))

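Unfortunately, this seems to have no effect, as the output is still the same as without the context phrases.

Am I using the wrong phrases, or is it so confident that the word it hears really is burrows that it ignores my phrases?

PS: I also tried using speech_v1p1beta1.AdaptationClient and speech_v1p1beta1.SpeechAdaptation instead of putting the phrases into the config, but that only gives me an internal server error with no further information about what went wrong. https://cloud.google.com/speech-to-text/docs/adaptation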
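
The speech adaptation documentation linked in the post also describes a boost value that raises the weight of a context's phrases. Whether that is enough to turn "burrows" into "barrows" is not something the source answers. Here is a minimal sketch of the config part of a REST request body with an optional boost. The helper name recognize_config_with_hints is hypothetical, and acceptance of "boost" by the API version in use is an assumption.

```python
import json


def recognize_config_with_hints(phrases, boost=None):
    """Build the "config" part of a recognize request with phrase hints."""
    context = {"phrases": list(phrases)}
    if boost is not None:
        # Assumption: the API version in use accepts "boost" on a context.
        context["boost"] = boost
    return {
        "config": {
            "sampleRateHertz": 24000,
            "languageCode": "en-US",
            "enableWordTimeOffsets": True,
            "speechContexts": [context],
        }
    }


body = recognize_config_with_hints(
    ["barrows", "barrows wheeler", "barrows wheeler transform"], boost=20.0
)
print(json.dumps(body, indent=2))
```

The boost value of 20.0 is only an illustrative number. The "audio" part of the request is left out here on purpose.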

python speech-to-text google-speech-api google-speech-to-text-api hint-phrases

sam*_*sam

2022-09-24

1
Votes

1
Answers

1815
Views

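ERROR: (gcloud.auth.activate-service-account) Failed to activate the given service account. Please ensure provided key file is valid

I am trying to follow this guide, https://cloud.google.com/speech/docs/getting-started, to call the GAE speech-to-text API via curl. But it does not seem to work.

I have set up a project and enabled the speech-to-text API. However, when I try to activate the service account, it fails. I have run the diagnostics, tried other accounts, verified the JSON file (emailed), and tried gcloud beta init :-(

bash-3.2$ gcloud auth activate-service-account account@project.iam.gserviceaccount.com --key-file=project.json
ERROR: (gcloud.auth.activate-service-account) Failed to activate the given service account. Please ensure provided key file is valid.

The next step, "gcloud auth print-access-token", does return a token, though.

But the final step (curl) returns this -

{
  "error": {
    "code": 403,
    "message": "Google Cloud Speech API has not been used in project google.com:cloudsdktool before or it is disabled. Enable it by visiting https://console.developers.google.com/apis/api/speech.googleapis.com/overview?project=google.com:cloudsdktool then retry. If you enabled this API recently, wait a few minutes for the action to propagate to our systems and retry.",
    "status": "PERMISSION_DENIED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Google developer console API activation",
            "url": "https://console.developers.google.com/apis/api/speech.googleapis.com/overview?project=google.com:cloudsdktool"
          }
        ]
      }
    ]
  }
}

The problem seems to be the project used to authenticate the incoming request (google.com:cloudsdktool instead of mine).

I am guessing the failed call to activate the service account is what causes this?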
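
Since the activation step complains about the key file, a quick structural check of that file can rule out the simplest causes. Here is a minimal sketch using only the standard library. The helper name check_service_account_key is hypothetical, and the list of required fields is an assumption based on the usual layout of a service-account JSON key, not something taken from the post.

```python
import json

# Assumption: fields normally present in a service-account JSON key.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(text: str) -> list:
    """Return a list of problems found in the key file's JSON text."""
    try:
        data = json.loads(text)
    except ValueError as exc:
        return ["not valid JSON: {}".format(exc)]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]
    problems = ["missing field: " + key for key in REQUIRED_KEYS if key not in data]
    # A key downloaded for another credential type carries a different "type".
    if data.get("type") not in (None, "service_account"):
        problems.append('"type" is {!r}, expected "service_account"'.format(data["type"]))
    return problems
```

Passing the contents of project.json, for example check_service_account_key(open("project.json").read()), returns an empty list when nothing obvious is wrong. An empty list does not prove that the key is still valid on Google's side.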

authentication gcloud google-speech-api

use*_*373

0
Votes

1
Answers

4485
Views