Tag: azure-speech

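How do I save Azure continuous speech recognition results in a variable?

I am trying to use Azure continuous speech recognition for a speech-to-text project. Here is the sample code provided by Azure: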

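import azure.cognitiveservices.speech as speechsdk

# speech_key, service_region and weatherfilename are assumed to be defined elsewhere.
def speech_recognize_continuous_from_file():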
    """performs continuous speech recognition with input from an audio file"""
    # <SpeechContinuousRecognitionWithFile>
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    audio_config = speechsdk.audio.AudioConfig(filename=weatherfilename)

    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    done = False

    def stop_cb(evt):
        """callback that signals to stop continuous recognition upon receiving an event `evt`"""
        print('CLOSING on {}'.format(evt))
        nonlocal done
        done = True

    # Connect callbacks to the events fired by the speech recognizer
    speech_recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
    speech_recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
    speech_recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
    speech_recognizer.session_stopped.connect(lambda …
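
One possible way to keep the recognized text, sketched here under assumptions rather than taken from the thread: connect a callback to the `recognized` event that appends `evt.result.text` to a list. The helper name `make_collector` and the stand-in event objects are hypothetical; with the real SDK the callback would be passed to `speech_recognizer.recognized.connect(...)` in place of the `print` lambda.

```python
from types import SimpleNamespace


def make_collector():
    """Return a list and a callback that appends recognized text to it."""
    results = []

    def on_recognized(evt):
        # The event passed to `recognized` exposes the final text of an
        # utterance as `evt.result.text`.
        text = evt.result.text
        if text:  # skip empty results, e.g. stretches of silence
            results.append(text)

    return results, on_recognized


results, on_recognized = make_collector()

# With the real SDK (assumption: `speech_recognizer` is set up as in the sample):
#     speech_recognizer.recognized.connect(on_recognized)
# Here two stand-in events simulate what the recognizer would deliver.
for phrase in ["What's the weather like?", ""]:
    on_recognized(SimpleNamespace(result=SimpleNamespace(text=phrase)))

# After recognition has stopped, the list holds one entry per utterance.
transcript = " ".join(results)
print(transcript)
```

Because the list lives outside the callback, it is still available after `stop_continuous_recognition()` returns, which is the point at which the full transcript would normally be assembled.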

azure azure-cognitive-services azure-speech

Arm*_*shi

5 votes
1 answer
1011 views

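Binary file to base64 in Node.js

When I call the tts.speech.microsoft.com API, I receive a binary audio file, and I want to convert this binary file to a Base64 string.

I have tried a lot of things, for example: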

Buffer.from(body, "binary").toString("base64");

It does not work.

I am not sure "binary" is the exact word, but it is not a readable format.

Thank you for your help.

azure node.js azure-speech

Gui*_*meC

3 votes
1 answer
9466 views

How to change the Azure text-to-speech silence timeout in JavaScript

I am using the Azure SpeechSDK service to recognize speech via recognizeOnceAsync. The current code resembles:

var SpeechSDK, recognizer, synthesizer;
var speechConfig = SpeechSDK.SpeechConfig.fromSubscription('SUB_KEY', 'SUB_REGION');
var audioConfig  = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();
recognizer = new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);
new Promise(function(resolve) {
    recognizer.onend = resolve;
    recognizer.recognizeOnceAsync(
        function (result) {
            recognizer.close();
            recognizer = undefined;
            resolve(result.text);
        },
        function (err) {
            alert(err);
            recognizer.close();
            recognizer = undefined;
        }
    );
}).then(r => {
    console.log(`Azure STT interpreted: ${r}`);
});

In the HTML file, I import the Azure package as follows:

<script src="https://aka.ms/csspeech/jsbrowserpackageraw"></script>

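The problem is that I want to increase the amount of "silence time" allowed before the recognizeOnceAsync method returns its result. (That is, you should be able to pause and take a breath without the method assuming that you have finished talking.) Is there a way to do this with fromDefaultMicrophoneInput? I have tried various things, such as: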

const SILENCE_UNTIL_TIMEOUT_MS = 5000;
speechConfig.SpeechServiceConnection_EndSilenceTimeoutMs = SILENCE_UNTIL_TIMEOUT_MS;
audioConfig.setProperty("Speech_SegmentationSilenceTimeoutMs", SILENCE_UNTIL_TIMEOUT_MS); …

javascript azure speech-to-text azure-cognitive-services azure-speech

Ste*_*son

3 votes
1 answer
1870 views

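Batch Create Transcription always results in: The recordings URI contains invalid data

I want to use the Azure Speech Services batch transcription API to create transcriptions of audio files. I have successfully used the Speech Services SDK (for Node.js), but I am interested in trying one of the newer features available in the v3.1 preview API (displayFormWordLevelTimestampsEnabled), so I figured I have to use the REST API service to do so.

Overall, my problem is that no matter what input I provide for contentUrls to the Create Transcription API, I always get the same error: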

"error": {
   "code": "InvalidData",
   "message": "The recordings URI contains invalid data."
}

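After some digging, I found some hints in the Azure portal that describe using sox to transcode the audio file into the specific format being requested.

The specific format mentioned in the portal documentation reads: if you use the REST API, make sure it uses one of the formats in this table: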

Format	Codec	Bitrate	Sample rate
WAV	PCM	256 kbps	16 kHz, mono
OGG	OPUS	256 kbps	16 kHz, mono

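The specific sox commands are:

Activity	SoX command
Check the audio file format.	sox --i <filename>
Convert the audio file to mono, 16-bit, 16 kHz.	sox <input> -b 16 -e signed-integer -c 1 -r 16k -t wav <output>.wav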

I ran my mp3 through the second command …
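
As a complement to `sox --i`, the WAV row of the format table (mono, 16-bit, 16 kHz) can also be checked by reading the file header with Python's standard `wave` module. This is a minimal sketch: the function name `check_wav_format` and the file name `sample.wav` are hypothetical, and it inspects only the local file header, so it cannot tell whether the service will accept a given `contentUrls` entry.

```python
import os
import tempfile
import wave


def check_wav_format(path):
    """Return a list of mismatches against mono, 16-bit, 16 kHz PCM WAV."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1:
            problems.append(f"channels={wav.getnchannels()}, expected 1")
        if wav.getsampwidth() != 2:  # sample width is reported in bytes
            problems.append(f"sample width={wav.getsampwidth() * 8} bit, expected 16")
        if wav.getframerate() != 16000:
            problems.append(f"sample rate={wav.getframerate()} Hz, expected 16000")
    return problems


# Demo: write one second of silence in the target format, then check it.
path = os.path.join(tempfile.mkdtemp(), "sample.wav")
with wave.open(path, "wb") as wav:
    wav.setnchannels(1)      # mono
    wav.setsampwidth(2)      # 2 bytes = 16 bit
    wav.setframerate(16000)  # 16 kHz
    wav.writeframes(b"\x00\x00" * 16000)

print(check_wav_format(path))  # an empty list means the header matches
```

The three values multiply out to the bitrate in the table: 16 bits × 16,000 samples per second × 1 channel = 256 kbps.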

transcription azure-speech

sha*_*ren

2022-09-14

1 vote
1 answer
667 views

Tag statistics

azure-speech ×4

azure ×3

azure-cognitive-services ×2

javascript ×1

node.js ×1

speech-to-text ×1

transcription ×1
