Viv*_*and 15 google-api google-speech-api
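Stack Overflow may not be the best place to ask this, but I need help. I have an mp3 file and I want to use Google's speech recognition to get the text of that file. Any pointers to documentation or examples would be greatly appreciated.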
A S*_*ANI 41
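Take a look at the Google Cloud Speech API, which lets developers convert audio to text [...] The API recognizes over 80 languages and variants [...] You can create a free account to get a limited number of API requests.
How to:
First, install the gcloud Python module and the google-api-python-client module: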
pip install --upgrade gcloud
pip install --upgrade google-api-python-client
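Then, in the Cloud Platform Console, go to the Projects page and select an existing project or create a new one. Once you have enabled billing for the project, enable the Cloud Speech API.
After enabling the Google Cloud Speech API, click the "Go to Credentials" button to set up your Cloud Speech API credentials.
For information on how to authorize the Cloud Speech API service from your code, see "Setting up a service account".
You should end up with both a service account key file (in JSON format) and the GOOGLE_APPLICATION_CREDENTIALS environment variable, which together let you authenticate to the Speech API.
Once all of that is done, download the raw audio file from Google, and also speech-discovery_google_rest_v1.json from Google.
Modify the previously downloaded JSON file to set your credentials key, then make sure that the GOOGLE_APPLICATION_CREDENTIALS environment variable is set to the full path of the .json file, with: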
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account_file.json
Also, make sure that you have set the GCLOUD_PROJECT environment variable to the ID of your Google Cloud project, with:
export GCLOUD_PROJECT=your-project-id
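Before running anything against the API, it can help to confirm that both environment variables are actually visible to Python. The following is a minimal sketch, not part of Google's tooling: the function name check_speech_env is hypothetical, and it only checks that the variables are set and that the key path points to an existing file. It does not validate the credentials themselves.

```python
import os


def check_speech_env(environ=None):
    """Return a list of problems found; an empty list means the setup looks usable.

    `environ` defaults to os.environ; passing a plain dict makes this easy to try out.
    (Hypothetical helper, not part of any Google library.)
    """
    env = os.environ if environ is None else environ
    problems = []

    key_path = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        problems.append("GOOGLE_APPLICATION_CREDENTIALS is not set")
    elif not os.path.isfile(key_path):
        problems.append(
            "GOOGLE_APPLICATION_CREDENTIALS does not point to a file: " + key_path)

    if not env.get("GCLOUD_PROJECT"):
        problems.append("GCLOUD_PROJECT is not set")

    return problems


if __name__ == "__main__":
    for problem in check_speech_env():
        print(problem)
```

If the script prints nothing, both variables are set and the key file exists.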
Assuming all of that is done, you can create a tutorial.py file containing:
import argparse
import base64
import json

from googleapiclient import discovery
import httplib2
from oauth2client.client import GoogleCredentials

DISCOVERY_URL = ('https://{api}.googleapis.com/$discovery/rest?'
                 'version={apiVersion}')


def get_speech_service():
    # Application default credentials are read from the file named by
    # GOOGLE_APPLICATION_CREDENTIALS.
    credentials = GoogleCredentials.get_application_default().create_scoped(
        ['https://www.googleapis.com/auth/cloud-platform'])
    http = httplib2.Http()
    credentials.authorize(http)

    return discovery.build(
        'speech', 'v1beta1', http=http, discoveryServiceUrl=DISCOVERY_URL)


def main(speech_file):
    """Transcribe the given audio file.

    Args:
        speech_file: the name of the audio file.
    """
    with open(speech_file, 'rb') as speech:
        # The API expects the audio bytes as base64 text inside the JSON body.
        speech_content = base64.b64encode(speech.read())

    service = get_speech_service()
    service_request = service.speech().syncrecognize(
        body={
            'config': {
                'encoding': 'LINEAR16',  # raw 16-bit signed LE samples
                'sampleRate': 16000,  # 16 kHz
                'languageCode': 'en-US',  # a BCP-47 language tag
            },
            'audio': {
                'content': speech_content.decode('UTF-8')
            }
        })
    response = service_request.execute()
    print(json.dumps(response))


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument(
        'speech_file', help='Full path of audio file to be recognized')
    args = parser.parse_args()
    main(args.speech_file)
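The request body in tutorial.py is plain JSON with the audio embedded as base64 text. As a rough sketch of just that step, with no network call, the hypothetical helper below (build_request_body is not a library function) builds the same structure from raw bytes. The field names are the ones used in the script above; the default sample rate and language are the script's values and should match your actual audio.

```python
import base64
import json


def build_request_body(raw_audio, sample_rate=16000, language_code="en-US"):
    """Build the JSON-serializable body that tutorial.py passes to syncrecognize.

    raw_audio: bytes of headerless 16-bit signed little-endian samples (LINEAR16).
    (Hypothetical helper for illustration only.)
    """
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRate": sample_rate,
            "languageCode": language_code,
        },
        "audio": {
            # base64 output is ASCII, so it can be decoded to str for JSON.
            "content": base64.b64encode(raw_audio).decode("ascii"),
        },
    }


if __name__ == "__main__":
    body = build_request_body(b"\x00\x01\x02\x03")
    print(json.dumps(body, indent=2))
```

Keeping this step separate makes it easy to inspect the payload before sending it.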
Then run the script on the downloaded audio file:
python tutorial.py audio.raw
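The script prints the raw JSON response. To get just the recognized text, something like the following sketch may help. It assumes the response is a dict with a "results" list whose entries each carry an "alternatives" list with a "transcript" field; the helper name extract_transcripts and the sample response are made up for illustration.

```python
def extract_transcripts(response):
    """Return the first (top-ranked) transcript of each result in the response.

    Assumes the shape {"results": [{"alternatives": [{"transcript": ...}]}]}.
    Missing keys are tolerated and simply yield fewer or empty transcripts.
    (Hypothetical helper for illustration only.)
    """
    transcripts = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            transcripts.append(alternatives[0].get("transcript", ""))
    return transcripts


if __name__ == "__main__":
    # Made-up response for illustration; real output depends on your audio.
    sample = {
        "results": [
            {"alternatives": [{"transcript": "hello world", "confidence": 0.9}]},
        ]
    }
    print(" ".join(extract_transcripts(sample)))
```

In tutorial.py this could replace the final print, for example by calling extract_transcripts(response) after service_request.execute().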