Tag: google-speech-api

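Web Speech API currently not working in Chrome / Electron / NW.js?

I have been building a desktop application with Electron that uses the JavaScript Web Speech API. It worked very well until a few weeks ago. Now it no longer works. I tried NW.js and also checked in the Chromium browser, and it does not seem to work even on Google's default Web Speech API demo site.

It still works perfectly in the Google Chrome browser, though. Has Google cut off Chromium's access to the API?

Are there any other options for using it in a desktop application? Using only web technologies is essential for my application, so I cannot call a native programming API and need a JavaScript API.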

javascript google-chrome chromium webspeech-api google-speech-api

Pav*_*ish

lucky-day

5
votes

0
answers

1749
views

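How to create a hotword detection service in Android

I want to create a service that listens for a hotword in the background, so that when I say "hello" it launches an activity. How can I do this? I have read about VoiceInteractionService, but I have also read that it cannot be used for this. Is that true? Can anyone tell me how I should approach this? It concerns the hotword detector.

I have been following this

Tried this: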

import android.service.voice.AlwaysOnHotwordDetector;
import android.service.voice.VoiceInteractionService;
import android.util.Log;

import java.util.Locale;

public class InteractionService extends VoiceInteractionService {

    static final String TAG = "InteractionService";
    private AlwaysOnHotwordDetector mHotwordDetector;

    @Override
    public void onCreate() {
        super.onCreate();
        Log.i(TAG, "service started");
    }

    @Override
    public void onReady() {
        super.onReady();
        Log.i(TAG, "Creating " + this);

        // Request an always-on detector for the "Hello" keyphrase in en-US.
        mHotwordDetector = createAlwaysOnHotwordDetector(
                "Hello", Locale.forLanguageTag("en-US"), mHotwordCallback);
        Log.i(TAG, "onReady");
    }

    private final AlwaysOnHotwordDetector.Callback mHotwordCallback =
            new AlwaysOnHotwordDetector.Callback() {
                @Override
                public void onAvailabilityChanged(int status) {
                    Log.i(TAG, "onAvailabilityChanged(" + status + ")");
                    hotwordAvailabilityChangeHelper(status);
                }

                @Override
                public void onDetected(
                        AlwaysOnHotwordDetector.EventPayload eventPayload) {
                    Log.i(TAG, …


java service android voice-recognition google-speech-api

bla*_*awk

lucky-day

5
votes

1
answers

2290
views

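OGG_OPUS fails with the Google Speech API, but LINEAR16 on the same sample seems fine?

There seems to be a problem submitting OGG_OPUS audio to the Google Speech API: it returns no results and exits, yet the same sample works fine once converted to LINEAR16.

Using the standard Python library with synchronous submission, each sample is given the following parameters: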

sample = speech_client.sample(
    content,
    source_uri=None,
    encoding='LINEAR16',
    sample_rate_hertz=16000)

sample = speech_client.sample(
    content,
    source_uri=None,
    encoding='OGG_OPUS',
    sample_rate_hertz=16000)


The sample is converted to LINEAR16 with:

./ffmpeg-git-20170621-64bit-static/ffmpeg -i ./audio.opus -acodec libopus -b:a 16000 -f s16le -acodec pcm_s16le output.raw


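The original audio was recorded in JavaScript with MediaRecorder in Chrome 58: https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder The Opus audio itself seems perfectly fine. The following constructor parameters were used: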

audioBitsPerSecond=16000
mimeType="audio/webm"


The error returned for OGG_OPUS is:

ValueError: No results returned from the Speech API.


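At first I was a bit confused because ffprobe usually reports Opus audio at 48000, but this seems to be because the codec decodes at 48000 Hz by default, regardless of the original sample rate.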
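One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the MediaRecorder settings above request mimeType="audio/webm", and WebM is a different container from Ogg even when both carry Opus audio. A minimal sketch that inspects the first bytes of a recording to tell the containers apart follows. The function name `sniff_container` is hypothetical; the magic numbers are the standard Ogg page marker and the EBML header that opens Matroska/WebM files.

```python
def sniff_container(data: bytes) -> str:
    """Guess the container format of an audio file from its leading bytes."""
    header = data[:12]
    # Every Ogg page starts with the capture pattern "OggS".
    if header[:4] == b"OggS":
        return "ogg"
    # Matroska and WebM files start with the EBML magic number.
    if header[:4] == b"\x1a\x45\xdf\xa3":
        return "webm/matroska"
    # RIFF/WAVE files carry "RIFF" at offset 0 and "WAVE" at offset 8.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical usage: pass the bytes that would be sent as `content`.
    print(sniff_container(b"\x1a\x45\xdf\xa3" + b"\x00" * 16))
```

If the recording turns out to be WebM, that would be consistent with the conversion to raw PCM succeeding while the OGG_OPUS submission returns nothing, though the source does not confirm this.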

google-speech-api

Pet*_*zos

lucky-day

5
votes

1
answers

512
views

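How to improve speech-to-text accuracy with the recognize_sphinx API in Python

How can I improve the accuracy of speech-to-text conversion using the recognize_sphinx Python API?

Please find the code below; I need to improve its accuracy.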

import speech_recognition as sr

# Obtain path to "english.wav" in the same folder as this script
from os import path
AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "english.wav")
# AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "french.aiff")
# AUDIO_FILE = path.join(path.dirname(path.realpath(__file__)), "chinese.flac")

# Use the audio file as the audio source
r = sr.Recognizer()
with sr.AudioFile(AUDIO_FILE) as source:
    audio = r.record(source)  # Read the entire audio file

# Recognize speech using Sphinx
try:
    print("Sphinx thinks you said " + r.recognize_sphinx(audio))
except sr.UnknownValueError:
    print("Sphinx could not …


python speech-recognition speech-to-text cmusphinx google-speech-api

vin*_*747

2021 10-26

5
votes

1
answers

7755
views

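Integrating Google Speech-to-Text in Swift

I am developing an iOS app that takes speech as input and must produce text as output. I previously built and implemented it with SiriKit, but the problem is that I do not get correct output when I speak. So I need to integrate Google Speech instead of SiriKit. I cannot find any resources for integrating it into my iOS app with Swift 4.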

import Speech

SFSpeechRecognizer.requestAuthorization { (authStatus) in
    var isButtonEnabled = false
    switch authStatus {
    case .authorized:
        isButtonEnabled = true
    case .denied:
        isButtonEnabled = false
        print("User denied access to speech recognition")
    case .restricted:
        isButtonEnabled = false
        print("Speech recognition restricted on this device")
    case .notDetermined:
        isButtonEnabled = false
        print("Speech recognition not yet authorized")
    }
    // UI state must be updated on the main queue.
    OperationQueue.main.addOperation() {
        // self.microphoneButton.isEnabled = isButtonEnabled
    }
}

private let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
private var recognitionRequest: …


speech-to-text swift google-speech-api

Har*_*tti

lucky-day

5
votes

1
answers

1395
views

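Google Speech API Single Utterance

How does SingleUtterance in the Google Speech API work? According to the documentation, it is how Google determines when a speaker has finished a single utterance. I understand what it does, but I want to know how. Does the API simply wait for a period of "speechless" audio? If so, how long must the silence last to trigger the end of the utterance?

Or does it use some other kind of AI algorithm to help determine when someone has stopped speaking?

Thanks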
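To make the "wait for a stretch of silence" hypothesis in the question concrete, here is a minimal sketch of a generic energy-based endpoint detector. It is not Google's method, which the source does not describe. The names `frame_rms` and `find_end_of_utterance`, the RMS threshold of 500, and the limit of 10 silent frames are all assumptions chosen for illustration. Frames are assumed to be 16-bit little-endian mono PCM.

```python
import math
import struct


def frame_rms(frame: bytes) -> float:
    """Root-mean-square amplitude of one frame of 16-bit little-endian PCM."""
    count = len(frame) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, frame[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def find_end_of_utterance(frames, threshold=500.0, max_silent_frames=10):
    """Return the index of the frame where the utterance is judged finished.

    The utterance ends once speech has been heard and is followed by
    `max_silent_frames` consecutive frames below `threshold`.
    Returns None if that never happens.
    """
    heard_speech = False
    silent_run = 0
    for index, frame in enumerate(frames):
        if frame_rms(frame) >= threshold:
            heard_speech = True
            silent_run = 0  # speech resets the silence counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= max_silent_frames:
                return index
    return None
```

With 10 ms frames, `max_silent_frames=10` would correspond to 100 ms of trailing silence; a production system could equally rely on a trained voice-activity model, which is the alternative the question raises.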

transcription google-cloud-platform google-speech-api

Har*_*art

2018 09-13

5
votes

1
answers

2245
views

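How to disable disfluency removal in the Google Cloud Speech-to-Text API

I am building an app that captures the user's audio and analyzes disfluencies in the reader's speech, so it is important for me to see every form of disfluency.

I noticed that Google's Cloud Speech API automatically removes disfluencies from speech. For example:

"So uhh, I might do this next week"

is transcribed as:

"So I might do this next week"

Is there a way to keep the uhhs and umms?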

speech-to-text google-speech-api

Asp*_*Mat

lucky-day

5
votes

0
answers

83
views

Module 'google.cloud.speech_v1p1beta1.types' has no 'RecognitionAudio' member

I am trying to run the sample code, but I get this error:

Module 'google.cloud.speech_v1p1beta1.types' has no 'RecognitionAudio' member


Environment: Python 3.x, Linux, with the google-cloud library installed and updated:

pip install --upgrade google-cloud-speech


The following are installed:

google-cloud (0.34.0)
google-cloud-speech (0.36.3)

I am not sure what else to check. Any suggestions would be great.

import argparse
import io

def transcribe_file_with_enhanced_model():
    """Transcribe the given audio file using an enhanced model."""
    # [START speech_transcribe_enhanced_model_beta]
    from google.cloud import speech_v1p1beta1 as speech
    client = speech.SpeechClient()

    speech_file = 'resources/commercial_mono.wav'

    with io.open(speech_file, 'rb') as audio_file:
        content = audio_file.read()

    audio = speech.types.RecognitionAudio(content=content)

    config = speech.types.RecognitionConfig(
        encoding=speech.enums.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=8000,
        language_code='en-US',
        # Enhanced models are only available to projects that
        # opt in for audio …


speech-recognition speech-to-text google-speech-api

Str*_*ker

lucky-day

5
votes

1
answers

891
views