Exporting Cloud Speech API results to a JSON file with Python

Ale*_*sal 5 python json dictionary google-speech-api

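I'm trying to use the Google Speech API to transcribe an audio file in an Indian language to text. The API returns an object of type "google.cloud.speech_v1.types.SpeechRecognitionAlternative". I'm trying to export the result to a .json file. I'm very new to Python; this is my first Python project.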

    import io
    import os
    import pickle

    # Imports the Google Cloud client library
    # (the enums and types modules exist in google-cloud-speech 1.x)
    from google.cloud import speech
    from google.cloud.speech import enums
    from google.cloud.speech import types

    client = speech.SpeechClient()

    audio = types.RecognitionAudio(uri="gs://storage-staples-canada/client-data/TapTapTap.wav")
    config = types.RecognitionConfig(
        encoding=enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=16000,
        language_code='hi-IN',
        enable_word_time_offsets=True,
        # Phrase hints that bias recognition toward the expected sentences
        speech_contexts=[types.SpeechContext(phrases=[
            '?? ???? ???',
            '??? ??? ?????',
            '?? ??? ????? ???? ?????',
            '???? ??? ??? ????? ??? ??? ???? ????? ???? ??????',
            '??? ???? ?? ?? ?? ??? ?? ???? ???',
            '???? ???? ?? ???? ???? ??? ???? ??? ????? ?? ?? ??? ???',
            '????? ?? ???- ??? ??????, ?? ??- ??- ??? ?? ??????? ???? ????- ????? ????? ???!',
            '????? ????- ??? ??? ??????? ?????? ????- ??? ??? ????',
            '?????? ?? ???- ??? ?? ???? ???? ??????? ?? ??- ??- ??? ?? ?????? ????',
            '??? ???? ????, ????? ?? ???? ?????? ?? ??? ??? ??',
            '??- ??- ????',
        ])],
    )

    operation = client.long_running_recognize(config=config, audio=audio)
    print('Waiting for operation to complete...')
    response = operation.result(timeout=90)

    # Gets the time offsets of each of the words in the audio
    for result in response.results:
        # The first alternative is the most likely one for this portion.
        alternative = result.alternatives[0]
        print('Transcript: {}'.format(alternative.transcript))
        print('Confidence: {}'.format(alternative.confidence))
        for word_info in alternative.words:
            word = word_info.word
            start_time = word_info.start_time
            end_time = word_info.end_time
            print('Word: {}, start_time: {}, end_time: {}'.format(
                word,
                start_time.seconds + start_time.nanos * 1e-9,
                end_time.seconds + end_time.nanos * 1e-9))

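When I try to convert the API result (stored in the response variable in the code above) to a dictionary, I get the error "TypeError: 'SpeechRecognitionAlternative' object is not iterable". Could you help me convert the result and export it to a .json file?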

小智 6

I'd suggest using the protobuf-to-JSON converter from Google's protobuf library:

    from google.protobuf.json_format import MessageToJson

    # the line below is taken from the code above; it holds the Google API results
    response = operation.result(timeout=90)
    result_json = MessageToJson(response)

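Then write result_json to a file. MessageToJson returns a JSON string, so it can be written to the file as-is; json.dump expects a Python object instead, which MessageToDict from the same module produces. See "How do I write JSON data to a file?"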
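As a rough sketch of that last step, the snippet below parses a JSON string and writes it to disk as UTF-8. The helper name save_json_text and the sample string are made up for illustration; sample_json only stands in for the value that MessageToJson(response) would return, so the snippet runs without calling the API.

```python
import json


def save_json_text(json_text, path):
    """Parse a JSON string and write it to `path` as indented UTF-8 JSON."""
    data = json.loads(json_text)
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Hindi text readable instead of \uXXXX escapes
        json.dump(data, f, ensure_ascii=False, indent=2)
    return data


# Hypothetical stand-in for result_json = MessageToJson(response)
sample_json = '{"results": [{"alternatives": [{"transcript": "नमस्ते", "confidence": 0.9}]}]}'
```

Calling save_json_text(result_json, "result.json") would then produce the file. Passing ensure_ascii=False is optional, but without it the Devanagari transcript is stored as escape sequences, which is still valid JSON yet hard to read.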


che*_*rba 0

For this task you can use the gcloud command-line tool. For example:

    gcloud ml speech recognize-long-running \
      gs://storage-staples-canada/client-data/TapTapTap.wav \
      --language-code=hi-IN --encoding=linear16 --sample-rate=16000 \
      --include-word-time-offsets \
      --hints="एक जंगल था।,ख़ूब घना जंगल।" \
      --format=json

You can add the --log-http flag to see the API interaction, which can help you fix your Python code.
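If the JSON printed by the gcloud command is redirected to a file, the word timings can be read back in Python along these lines. This is only a sketch: it assumes the output follows protobuf's JSON mapping (camelCase keys such as startTime and endTime, durations written as strings like "1.500s"), and the function names and sample data are invented for illustration.

```python
import json


def parse_seconds(duration):
    """Convert a protobuf-JSON duration string such as '1.500s' to a float."""
    return float(duration.rstrip("s"))


def word_timings(data):
    """Collect word, start and end for the top alternative of each result."""
    rows = []
    for result in data.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        # The first alternative is the most likely one
        for info in alternatives[0].get("words", []):
            rows.append({
                "word": info["word"],
                # Assumed key names; a missing offset is treated as zero
                "start": parse_seconds(info.get("startTime", "0s")),
                "end": parse_seconds(info.get("endTime", "0s")),
            })
    return rows


# Made-up sample in the assumed layout; real output would come from json.load()
sample = json.loads("""
{"results": [{"alternatives": [{"transcript": "एक जंगल", "confidence": 0.9,
  "words": [{"word": "एक", "startTime": "0s", "endTime": "0.400s"},
            {"word": "जंगल", "startTime": "0.400s", "endTime": "1.100s"}]}]}]}
""")
```

With a real file, json.load(open("result.json", encoding="utf-8")) would replace the sample, and word_timings(...) returns one dictionary per word.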
