I created a project on Google Cloud, enabled billing and the Cloud Speech-to-Text API, and downloaded a .json credentials file. I then tried to run this basic code in PyCharm:
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] ="instant-medium-282.json"
from google.cloud import speech_v1
from google.cloud.speech_v1 import enums
client = speech_v1.SpeechClient()
encoding = enums.RecognitionConfig.AudioEncoding.FLAC
sample_rate_hertz = 44100
language_code = 'en-US'
config = {'encoding': encoding, 'sample_rate_hertz': sample_rate_hertz, 'language_code': language_code}
uri = 'gs://bucket_name/file_name.flac'
audio = {'uri': uri}
response = client.recognize(config, audio)
However, I keep getting this error:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/google/api_core/grpc_helpers.py", line 57, in error_remapped_callable
return callable_(*args, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/grpc/_channel.py", line 826, in __call__
return _end_unary_response_blocking(state, call, …
google-cloud-platform google-cloud-billing google-speech-to-text-api
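The traceback above is truncated, so the actual cause is not visible. One thing that can be ruled out cheaply is the credentials file itself: the code passes a relative path, which is resolved against the working directory of the PyCharm run configuration. The sketch below is only a sanity check under that assumption, not a diagnosis; the helper name `check_credentials` and the list of expected fields are illustrative choices.

```python
import json
import os
from pathlib import Path


def check_credentials(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return a list of problems found with the credentials file (empty if none)."""
    value = os.environ.get(env_var)
    if not value:
        return [f"{env_var} is not set"]

    # A relative path is resolved against the current working directory.
    path = Path(value).expanduser().resolve()
    if not path.is_file():
        return [f"{path} does not exist (current directory: {Path.cwd()})"]

    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        return [f"{path} is not readable JSON: {exc}"]

    # Fields normally present in a service-account key file (assumed key type).
    expected = ("type", "project_id", "private_key", "client_email")
    missing = [key for key in expected if key not in data]
    if missing:
        return [f"{path} is missing fields: {', '.join(missing)}"]
    return []
```

Calling `check_credentials()` right after setting `os.environ['GOOGLE_APPLICATION_CREDENTIALS']` and printing the result shows whether the file is found from where the script actually runs. An empty list only means the file looks plausible; billing or API enablement problems would still surface in the API call.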
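I am using the Python script below to get predictions from the Google Speech API for live streaming audio input.
The problem is that I need a prediction from the Google Speech API for each utterance, and I also need to save the audio of each utterance to disk.
I am not sure how to modify the script so that it saves the live audio of each utterance and prints the result per utterance instead of predicting continuously.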
#!/usr/bin/env python
import os
import re
import sys
import time
from google.cloud import speech
import pyaudio
from six.moves import queue
# Audio recording parameters
STREAMING_LIMIT = 240000 # 4 minutes
SAMPLE_RATE = 16000
CHUNK_SIZE = int(SAMPLE_RATE / 10) # 100ms
api_key = r'path_to_json_file\google.json'
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = api_key
RED = '\033[0;31m'
GREEN = '\033[0;32m'
YELLOW = '\033[0;33m'
def get_current_time():
    """Return Current Time in MS."""
    return int(round(time.time() * 1000))
class ResumableMicrophoneStream:
    """Opens a recording …
python python-3.x google-cloud-platform google-speech-api google-speech-to-text-api
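The per-utterance saving described in the question can be sketched with the standard-library `wave` module. This is a minimal sketch, not a drop-in change to the truncated script: it assumes the microphone delivers 16-bit mono PCM at the script's `SAMPLE_RATE` of 16000, and the names `save_utterance` and `UtteranceRecorder` are hypothetical. The idea is to buffer every chunk that is sent to the API and flush the buffer to a new file whenever a final result arrives (for example when a streaming result reports `is_final`).

```python
import wave


def save_utterance(chunks, path, sample_rate=16000, sample_width=2, channels=1):
    """Write an iterable of raw PCM byte chunks to a WAV file; return the frame count."""
    frames = b"".join(chunks)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # 2 bytes per sample = 16-bit audio (assumed)
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return len(frames) // (sample_width * channels)


class UtteranceRecorder:
    """Buffers audio chunks and flushes them to a numbered WAV file per utterance."""

    def __init__(self, prefix="utterance", sample_rate=16000):
        self.prefix = prefix
        self.sample_rate = sample_rate
        self.buffer = []
        self.count = 0

    def add_chunk(self, chunk):
        """Call with every chunk that is also sent to the recognizer."""
        self.buffer.append(chunk)

    def finish_utterance(self):
        """Call when a final result arrives; returns the path written, or None."""
        if not self.buffer:
            return None
        self.count += 1
        path = f"{self.prefix}_{self.count:03d}.wav"
        save_utterance(self.buffer, path, self.sample_rate)
        self.buffer = []
        return path
```

In a streaming loop, `add_chunk` would be called inside the audio generator and `finish_utterance` in the response handler, next to the line that prints the final transcript. The chunk boundaries will not line up exactly with the spoken utterance, because results arrive with some delay; trimming by word time offsets would be a refinement.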
Speech recognition with the following code does not work at all:
with sr.Microphone() as source:
    # read the audio data from the default microphone
    audio = r.record(source, duration=4)
    print("Recognizing...")
# convert speech to text
# recognize speech using Google Speech Recognition
try:
    # for testing purposes, we're just using the default API key
    # to use another API key, use `r.recognize_google(audio, key="GOOGLE_SPEECH_RECOGNITION_API_KEY")`
    # instead of `r.recognize_google(audio)`
    print("Google Speech Recognition thinks you said in English: - " + r.recognize_google(audio, language = "en-US"))
except sr.UnknownValueError:
    print("Google Speech Recognition could not understand audio")
…
With the Google Speech-to-Text API I can currently do everything I want, except stop the recognition when I want to. I have tried calling the stop() and pause() functions in the API, and neither of them works.
const record = require('node-record-lpcm16');

// Imports the Google Cloud client library
const speech = require('@google-cloud/speech');

// Creates a client
const client = new speech.SpeechClient();

const encoding = 'LINEAR16';
const sampleRateHertz = 16000;
const languageCode = 'en-US';

const request = {
  config: {
    encoding: encoding,
    sampleRateHertz: sampleRateHertz,
    languageCode: languageCode,
  },
  interimResults: false,
};

// Create a recognize stream
const recognizeStream = client
  .streamingRecognize(request)
  .on('error', console.error)
  .on('data', data =>
    process.stdout.write(
      data.results[0] && data.results[0].alternatives[0]
        ? `Transcription: ${data.results[0].alternatives[0].transcript}\n`
        : `\n\nReached transcription time limit, press Ctrl+C\n`
    )
  );

// Start …
google-api node.js google-api-nodejs-client electron google-speech-to-text-api
I am trying to set up StreamingRecognize() from Google Cloud Speech-to-Text V2 in Node.js to stream audio data, and it always throws the same error on the initial recognizer request that sets up the stream:
Error: 3 INVALID_ARGUMENT: Invalid resource field value in the request.
at callErrorFromStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/call.ts:81:17)
at Object.onReceiveStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client.ts:701:51)
at Object.onReceiveStatus (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client-interceptors.ts:416:48)
at /Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/resolving-call.ts:111:24
at processTicksAndRejections (node:internal/process/task_queues:77:11)
for call at
at ServiceClientImpl.makeBidiStreamRequest (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/client.ts:685:42)
at ServiceClientImpl.<anonymous> (/Users/<filtered>/backend/node_modules/@grpc/grpc-js/src/make-client.ts:189:15)
at /Users/<filtered>/backend/node_modules/@google-cloud/speech/build/src/v2/speech_client.js:318:29
at /Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streamingApiCaller.ts:71:19
at /Users/<filtered>/backend/node_modules/google-gax/src/normalCalls/timeout.ts:54:13
at StreamProxy.setStream (/Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streaming.ts:204:20)
at StreamingApiCaller.call (/Users/<filtered>/backend/node_modules/google-gax/src/streamingCalls/streamingApiCaller.ts:88:12)
at /Users/<filtered>/backend/node_modules/google-gax/src/createApiCall.ts:118:26
at processTicksAndRejections (node:internal/process/task_queues:95:5)
{
code: 3,
details: 'Invalid resource field value in the request.',
metadata: Metadata {
internalRepr: Map(2) {
'google.rpc.errorinfo-bin' => …
streaming node.js google-speech-api google-cloud-speech google-speech-to-text-api
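The error payload above is truncated, so the offending field is not visible. One thing worth ruling out, as an assumption rather than a diagnosis, is the recognizer resource name sent in the first streaming request: V2 addresses recognizers as `projects/{project}/locations/{location}/recognizers/{recognizer}`, with `_` usable as the recognizer ID for the implicit default recognizer. The sketch below builds and validates such a name; it is written in Python for illustration only, and `recognizer_path` is a hypothetical helper, not part of any client library.

```python
import re

# Expected shape of a Speech-to-Text V2 recognizer resource name.
_RECOGNIZER_RE = re.compile(
    r"^projects/[^/\s]+/locations/[^/\s]+/recognizers/[^/\s]+$"
)


def recognizer_path(project_id, location="global", recognizer_id="_"):
    """Build a V2 recognizer resource name and fail early if it is malformed."""
    name = (
        f"projects/{project_id}/locations/{location}"
        f"/recognizers/{recognizer_id}"
    )
    if not _RECOGNIZER_RE.match(name):
        raise ValueError(f"malformed recognizer name: {name!r}")
    return name
```

For example, `recognizer_path("my-project")` yields `projects/my-project/locations/global/recognizers/_`, where `my-project` is a placeholder. An empty or undefined project ID, or a value that already contains slashes, is rejected before the request is ever sent.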
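Vosk-api is an excellent offline speech recognizer with great support, but at the time of writing (14 August 2020) its documentation is very poor (or cleverly hidden).
The question is: is there any kind of replacement for the Google speech recognizer feature that allows additional transcription improvement through speech adaptation?
For example: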
"config": {
"encoding":"LINEAR16",
"sampleRateHertz": 8000,
"languageCode":"en-US",
"speechContexts": [{
"phrases": ["weather"]
}]
}
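For Google, this config means that the phrase "weather" gets higher priority than, say, "whether", which sounds the same.
Or class tokens? I know this may not be implemented in Vosk for python3, but still...
Here are the references: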
https://cloud.google.com/speech-to-text/docs/class-tokens
https://cloud.google.com/speech-to-text/docs/speech-adaptation
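One possible direction, offered as an assumption rather than a confirmed equivalent: Vosk's `KaldiRecognizer` can take a third argument containing a JSON list of phrases, which restricts recognition to that vocabulary instead of boosting it the way `speechContexts` does. The sketch below only builds that JSON string with the standard library; `build_grammar` is a hypothetical helper, and lowercasing the phrases is a guess about what the model vocabulary expects.

```python
import json


def build_grammar(phrases, allow_unknown=True):
    """Return a JSON list of phrases suitable for use as a recognizer grammar."""
    cleaned = []
    for phrase in phrases:
        # Normalize case and whitespace; skip empty entries and duplicates.
        text = " ".join(phrase.lower().split())
        if text and text not in cleaned:
            cleaned.append(text)
    if allow_unknown:
        # "[unk]" lets the recognizer emit an unknown token for other words.
        cleaned.append("[unk]")
    return json.dumps(cleaned)


# Assumed usage with the vosk package (not executed here):
#   from vosk import Model, KaldiRecognizer
#   rec = KaldiRecognizer(Model("model"), 8000, build_grammar(["weather"]))
```

Because this limits the vocabulary rather than re-weighting it, it suits command-style input better than open dictation, and it may only work with models that support run-time vocabulary changes.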
For Cloud Speech-To-Text client authentication in a PHP application, I use the following:
$credentials = 'C:\cred.json';
$client=new SpeechClient(['credentials'=>json_decode(file_get_contents($credentials), true)]);
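For some reason, I get this error message:
Fatal error: Uncaught GuzzleHttp\Exception\ClientException: Client error:
POST https://oauth2.googleapis.com/token resulted in a 400 Bad Request response: {"error":"invalid_scope","error_description":"Invalid OAuth scope or ID token audience provided."}
The same authentication method works perfectly with the Text-To-Speech API: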
$credentials = 'C:\cred.json';
$client = new TextToSpeechClient(['credentials' => json_decode(file_get_contents($credentials), true)]);
What is wrong or missing?
php google-authentication google-speech-api google-speech-to-text-api
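I am using python3 to transcribe an audio file with Google Speech-to-Text via the provided Python package (google-speech).
There is an option to define custom phrases for the transcription, as described in the documentation: https://cloud.google.com/speech-to-text/docs/speech-adaptation
For testing purposes, I am using a small audio file that contains the text:
[..] in this lecture we will talk about the Burrows Wheeler transform and the FM index [..]
I am supplying the following phrases to see the effect, for example when I want a specific name to be recognized with the correct spelling. In this example, I want to change burrows to barrows: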
config = speech.RecognitionConfig(dict(
encoding=speech.RecognitionConfig.AudioEncoding.ENCODING_UNSPECIFIED,
sample_rate_hertz=24000,
language_code="en-US",
enable_word_time_offsets=True,
speech_contexts=[
speech.SpeechContext(dict(
phrases=["barrows", "barrows wheeler", "barrows wheeler transform"]
))
]
))
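Unfortunately, this does not seem to have any effect: the output is still the same as without the context phrases.
Am I using the wrong phrases, or is the recognizer so confident that the word it hears really is burrows that it ignores my phrases?
PS: I also tried using speech_v1p1beta1.AdaptationClient and speech_v1p1beta1.SpeechAdaptation instead of putting the phrases into the config, but that only gives me an internal server error with no further information about what went wrong. https://cloud.google.com/speech-to-text/docs/adaptation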
python speech-to-text google-speech-api google-speech-to-text-api hint-phrases
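The speech adaptation documentation linked in the question also describes a `boost` value on a speech context, which increases the weight given to the listed phrases. Whether it would be enough to turn burrows into barrows here is unknown, and its availability depends on the API version in use, so the following is only a sketch of what such a config could look like as a plain dict (the question already passes a dict to `speech.RecognitionConfig`). The helper name `adaptation_config` and the boost value in the usage note are illustrative assumptions.

```python
def adaptation_config(phrases, boost=None, sample_rate_hertz=24000,
                      language_code="en-US"):
    """Build a recognition config dict with one speech context and optional boost."""
    context = {"phrases": list(phrases)}
    if boost is not None:
        if boost <= 0:
            raise ValueError("boost must be positive")
        # A higher boost biases recognition more strongly toward the phrases.
        context["boost"] = float(boost)
    return {
        "sample_rate_hertz": sample_rate_hertz,
        "language_code": language_code,
        "enable_word_time_offsets": True,
        "speech_contexts": [context],
    }
```

A call such as `adaptation_config(["barrows wheeler transform"], boost=15)` produces a dict that could be passed on in place of the one in the question. Starting with a moderate value and comparing transcripts is probably more informative than jumping straight to a large boost, since strong boosts can also cause false matches.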