recognize_google(audio) filters out bad words

Question

I ran into this problem while using the Google speech recognition API. It automatically filters out bad words and returns strings like "F***" or "P******".

Here is my code. It runs without errors, but how can I get the original, unfiltered text transcribed from the audio?

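    from gtts import gTTS
    import speech_recognition as sr

    r = sr.Recognizer()

    with sr.Microphone() as source:
        print('Ready...')
        r.pause_threshold = 1  # seconds of silence that end a phrase
        # Calibrate the energy threshold against background noise
        r.adjust_for_ambient_noise(source, duration=1)
        audio = r.listen(source)

        # Transcribe with the Google Web Speech API; bad words come back masked
        command = r.recognize_google(audio).lower()
        print('You said: ' + command + '\n')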

Answer 1

han*_*dle 5

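Profanity filter (profanity_filter)

Optional. If set to true, the server will attempt to filter out profanities, replacing all but the initial character in each filtered word with asterisks, e.g. “f***”. If set to false or omitted, profanities won’t be filtered out.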
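To make the quoted behaviour concrete, here is a minimal local sketch of that masking rule, plus a helper that checks whether a returned transcript appears to have been masked. It is not Google's implementation: the word list `BLOCKED_WORDS` and the function names `mask_word`, `apply_profanity_filter` and `looks_masked` are hypothetical and exist only for illustration.

```python
import re

# Hypothetical word list for illustration; the real service's list is not public.
BLOCKED_WORDS = {"darn", "heck"}


def mask_word(word: str) -> str:
    """Keep the first character and replace every other character with '*'."""
    if not word:
        return word
    return word[0] + "*" * (len(word) - 1)


def apply_profanity_filter(transcript: str, enabled: bool = False) -> str:
    """Mask blocked words when enabled; otherwise return the text unchanged."""
    if not enabled:
        return transcript

    def replace(match: re.Match) -> str:
        word = match.group(0)
        return mask_word(word) if word.lower() in BLOCKED_WORDS else word

    # Visit each run of letters/apostrophes and mask it if it is blocked.
    return re.sub(r"[A-Za-z']+", replace, transcript)


def looks_masked(transcript: str) -> bool:
    """Return True if some token is a letter followed by asterisks, e.g. 'f***'."""
    return re.search(r"\b[A-Za-z]\*+", transcript) is not None
```

A check like `looks_masked(command)` on the recognizer's output is one way to confirm that filtering is still being applied after changing the setting.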

Search: https://googlecloudplatform.github.io/google-cloud-python/latest/search.html?q=profanity_filter&check_keywords=yes&area=default

Example: https://googlecloudplatform.github.io/google-cloud-python/latest/speech/index.html?highlight=profanity_filter#synchronous-recognition

Example using the profanity filter (set profanity_filter=False, or omit it, to leave the transcript unfiltered):

    >>> from google.cloud import speech
    >>> client = speech.SpeechClient()
    >>> results = client.recognize(
    ...     audio=speech.types.RecognitionAudio(
    ...         uri='gs://my-bucket/recording.flac',
    ...     ),
    ...     config=speech.types.RecognitionConfig(
    ...         encoding='LINEAR16',
    ...         language_code='en-US',
    ...         profanity_filter=True,
    ...         sample_rate_hertz=44100,
    ...     ),
    ... )
    >>> for result in results:
    ...     for alternative in result.alternatives:
    ...         print('=' * 20)
    ...         print('transcript: ' + alternative.transcript)
    ...         print('confidence: ' + str(alternative.confidence))
    ====================
    transcript: Hello, this is a f****** test
    confidence: 0.81


Nice example ;-)

(I have not tested this.)

Archived:	7 years, 4 months ago
Views:	2585
Last recorded:	7 years, 4 months ago