Ale*_*sal 5 python json dictionary google-speech-api
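I am trying to convert an audio file to text in an Indian language using the Google Speech API. The API returns an object of type "google.cloud.speech_v1.types.SpeechRecognitionAlternative". I am trying to export the result to a .json file. I am very new to Python; this is my first Python project.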
import io
import os
import pickle

# Imports the Google Cloud client library
from google.cloud import speech
from google.cloud.speech import enums
from google.cloud.speech import types

client = speech.SpeechClient()
audio = types.RecognitionAudio(uri = "gs://storage-staples-canada/client-data/TapTapTap.wav")
config = types.RecognitionConfig(
    encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code='hi-IN',
    enable_word_time_offsets=True,
    speech_contexts=[speech.types.SpeechContext(phrases=[
        '?? ???? ???',
        '??? ??? ?????',
        '?? ??? ????? ???? ?????',
        '???? ??? ??? ????? ??? ??? ???? ????? ???? ??????',
        '??? ???? ?? ?? ?? ??? ?? ???? ???',
        '???? ???? ?? ???? ???? ??? ???? ??? ????? ?? ?? ??? ???',
        '????? ?? ???- ??? ??????, ?? ??- ??- ??? ?? ??????? ???? ????- ????? ????? ???!',
        '????? ????- ??? ??? ??????? ?????? ????- ??? ??? ????',
        '?????? ?? ???- ??? ?? ???? ???? ??????? ?? ??- ??- ??? ?? ?????? ????',
        '??? ???? ????, ????? ?? ???? ?????? ?? ??? ??? ??',
        '??- ??- ????',
    ])],
)
operation = client.long_running_recognize(config, audio)
print('Waiting for operation to complete...')
response = operation.result(timeout = 90)
# Gets the time offsets of each of the words in the audio
for result in response.results:
    # The first alternative is the most likely one for this portion.
    alternative = result.alternatives[0]
    print('Transcript: {}'.format(alternative.transcript))
    print('Confidence: {}'.format(alternative.confidence))
    for word_info in alternative.words:
        word = word_info.word
        start_time = word_info.start_time
        end_time = word_info.end_time
        print('Word: {}, start_time: {}, end_time: {}'.format(
            word,
            start_time.seconds + start_time.nanos * 1e-9,
            end_time.seconds + end_time.nanos * 1e-9))
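When I try to convert the API result (stored in the response variable in the code above) to a dictionary, I get the error "TypeError: 'SpeechRecognitionAlternative' object is not iterable". Can you help me convert the result and export it to a .json file?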
小智 6
I would suggest using the protobuf-to-JSON converter from Google's protobuf library:
from google.protobuf.json_format import MessageToJson
# the below line is taken from the code above, which contains the google api results
response = operation.result(timeout = 90)
result_json = MessageToJson(response)
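Then write result_json to a file. MessageToJson already returns a JSON-formatted string, so it can be written as-is; if you prefer to use json.dump, convert the response with MessageToDict from the same module first. See "How do I write JSON data to a file?"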
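If you would rather build the dictionary yourself, for example to keep only the fields the question prints, the following is a minimal sketch. It assumes the response exposes the same attributes used in the question's code (results, alternatives, transcript, confidence, words, and start_time/end_time with seconds and nanos); newer client library versions may expose the times differently. The helper names response_to_dict and save_response are hypothetical, not part of any Google library.

```python
import json


def duration_to_seconds(duration):
    """Convert a Duration-like object (seconds + nanos) to float seconds."""
    return duration.seconds + duration.nanos * 1e-9


def response_to_dict(response):
    """Build a JSON-serializable dict from a recognition response.

    Hypothetical helper: it only reads the attributes that the
    question's code already accesses.
    """
    results = []
    for result in response.results:
        alternative = result.alternatives[0]  # most likely alternative
        results.append({
            "transcript": alternative.transcript,
            "confidence": alternative.confidence,
            "words": [
                {
                    "word": w.word,
                    "start_time": duration_to_seconds(w.start_time),
                    "end_time": duration_to_seconds(w.end_time),
                }
                for w in alternative.words
            ],
        })
    return {"results": results}


def save_response(response, path):
    """Write the converted response to a UTF-8 .json file."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Hindi text readable in the file
        json.dump(response_to_dict(response), f, ensure_ascii=False, indent=2)
```

Calling save_response(response, "transcript.json") after operation.result(...) would then produce the file; the file name here is only an example.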
For this task, you can use the gcloud command-line tool. For example:

gcloud ml speech recognize-long-running \
    gs://storage-staples-canada/client-data/TapTapTap.wav \
    --language-code=hi-IN --encoding=linear16 --sample-rate=16000 \
    --include-word-time-offsets \
    --hints="एक जंगल था।,ख़ूब घना जंगल।" \
    --format=json

You can add the --log-http flag to see the API interactions, which can help you fix your Python code.
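Once the JSON is saved, the word timings can be read back with the standard json module. This is a sketch under assumptions: it expects the proto3 JSON layout (results, alternatives, words, with startTime/endTime given as strings such as "1.300s"), and I have not confirmed that gcloud prints exactly this shape for every version. The function names and the sample values below are hypothetical, made up only to illustrate the layout.

```python
import json


def parse_duration(value):
    """Convert a proto3 JSON duration string such as "1.300s" to seconds."""
    return float(value.rstrip("s"))


def word_timings(response_json):
    """Return (word, start, end) tuples for the top alternative of each result.

    Assumes the proto3 JSON field names (startTime, endTime); a missing
    time is treated as "0s".
    """
    data = json.loads(response_json)
    rows = []
    for result in data.get("results", []):
        for alternative in result.get("alternatives", [])[:1]:
            for w in alternative.get("words", []):
                rows.append((
                    w["word"],
                    parse_duration(w.get("startTime", "0s")),
                    parse_duration(w.get("endTime", "0s")),
                ))
    return rows


# Hypothetical sample in the assumed layout, not real API output.
sample = json.dumps({
    "results": [{
        "alternatives": [{
            "transcript": "एक जंगल",
            "confidence": 0.9,
            "words": [
                {"word": "एक", "startTime": "0s", "endTime": "0.400s"},
                {"word": "जंगल", "startTime": "0.400s", "endTime": "1.300s"},
            ],
        }]
    }]
}, ensure_ascii=False)

for word, start, end in word_timings(sample):
    print('Word: {}, start_time: {}, end_time: {}'.format(word, start, end))
```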