recognize_google(audio) filters out bad words

Fai*_*yan 2 python speech-recognition

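I ran into this problem while using the Google speech recognition API: it automatically filters out bad words and returns strings like "F***" or "P******".

Here is my code. It runs without errors, but how can I get the original, unfiltered text from the audio?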

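    from gtts import gTTS  # imported in the original snippet but not used here
    import speech_recognition as sr

    r = sr.Recognizer()

    with sr.Microphone() as source:
        print('Ready...')
        r.pause_threshold = 1
        # Calibrate the energy threshold against one second of background noise
        r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source)

        # Send the captured audio to the Google Web Speech API
        command = r.recognize_google(audio).lower()
        print('You said: ' + command + '\n')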

han*_*dle 5

Profanity filter

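Optional. If set to true, the server will attempt to filter out profanities, replacing all but the initial character in each filtered word with asterisks, e.g. "f***". If set to false or omitted, profanities won't be filtered out.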
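To make the masking rule in that description concrete, here is a minimal sketch of "keep the first character, replace the rest with asterisks". The word list and the function name `mask_profanity` are hypothetical and exist only for illustration; the real service applies its own list on the server side.

```python
import re

# Hypothetical word list for illustration only; the service's real list is not public.
FILTERED_WORDS = {"darn", "heck"}


def mask_profanity(text, words=FILTERED_WORDS):
    """Replace all but the first character of each listed word with asterisks."""
    def mask(match):
        word = match.group(0)
        if word.lower() in words:
            return word[0] + "*" * (len(word) - 1)
        return word

    # Visit each run of letters/apostrophes and mask it if it is on the list.
    return re.sub(r"[A-Za-z']+", mask, text)


print(mask_profanity("Hello, this is a darn test"))
```

Because only the first letter survives, the original word cannot be recovered from the masked transcript; the filter has to be disabled in the request itself.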


Search: https://googlecloudplatform.github.io/google-cloud-python/latest/search.html?q=profanity_filter&check_keywords=yes&area=default

Example:

https://googlecloudplatform.github.io/google-cloud-python/latest/speech/index.html?highlight=profanity_filter#synchronous-recognition

Example using the profanity filter.
    >>> from google.cloud import speech
    >>> client = speech.SpeechClient()
    >>> results = client.recognize(
    ...     audio=speech.types.RecognitionAudio(
    ...         uri='gs://my-bucket/recording.flac',
    ...     ),
    ...     config=speech.types.RecognitionConfig(
    ...         encoding='LINEAR16',
    ...         language_code='en-US',
    ...         profanity_filter=True,
    ...         sample_rate_hertz=44100,
    ...     ),
    ... )
    >>> for result in results:
    ...     for alternative in result.alternatives:
    ...         print('=' * 20)
    ...         print('transcript: ' + alternative.transcript)
    ...         print('confidence: ' + str(alternative.confidence))
    ====================
    transcript: Hello, this is a f****** test
    confidence: 0.81
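For comparison, the same setting can be expressed in the JSON body of a Cloud Speech-to-Text v1 REST `speech:recognize` request, where the field is spelled `profanityFilter`. The following is a rough sketch that only builds the request body and does not send it; the helper name `build_recognize_request`, the bucket URI, and the choice of `FLAC` encoding are assumptions made for illustration.

```python
import json


def build_recognize_request(gcs_uri, language_code="en-US", profanity_filter=False):
    """Build a request body for speech:recognize (hypothetical helper, no network call)."""
    return {
        "config": {
            "encoding": "FLAC",  # assumed to match the placeholder .flac file
            "languageCode": language_code,
            # False (or omitting the field) asks the server not to mask profanities.
            "profanityFilter": profanity_filter,
        },
        "audio": {"uri": gcs_uri},
    }


body = build_recognize_request("gs://my-bucket/recording.flac")
print(json.dumps(body, indent=2))
```

Leaving `profanity_filter` at `False` mirrors the documented default, so the transcript should come back unmasked; like the example above, this has not been run against the live service.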

Nice example ;-)

(I have not tested this.)