API or SDK for speech recognition of numbers only (between 1 and 10000)?

fvi*_*cot 5 speech-recognition speech speech-to-text

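I need an optimized, specialized solution that can recognize spoken numbers between 1 and 1000 on a smartphone. Ideally the SDK would work offline. Any ideas? I have not found any Google Speech or Amazon Transcribe configuration that allows "numbers only".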

Nik*_*rev 1

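Strictly expecting people to say only numbers is not quite right. Even when you ask for a number, people often say things like "I don't know" or "hold on a second". If the recognizer accepts nothing but numbers, you will seriously hurt the user experience.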

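You have to analyze the recognition result intelligently and respond appropriately even when something other than a number is recognized.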
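A minimal sketch of that kind of post-processing, assuming the recognizer returns an English transcript string. The names `parse_spoken_number` and `handle_utterance` and the default 1 to 1000 range are illustrative, not part of any SDK. The parser is deliberately permissive and would need tightening for production use.

```python
import re

UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}


def parse_spoken_number(transcript):
    """Return the integer in the transcript, or None if it is not a number."""
    text = transcript.lower().strip()
    # Many recognizers already return digits, e.g. "42".
    if re.fullmatch(r"\d+", text):
        return int(text)
    # Split on whitespace and hyphens; drop the filler word "and".
    tokens = [t for t in re.split(r"[\s-]+", text) if t and t != "and"]
    if not tokens:
        return None
    total, current = 0, 0
    for tok in tokens:
        if tok in UNITS:
            current += UNITS[tok]
        elif tok in TENS:
            current += TENS[tok]
        elif tok == "hundred":
            current = max(current, 1) * 100
        elif tok == "thousand":
            total += max(current, 1) * 1000
            current = 0
        else:
            return None  # any non-number word means "not a number"
    return total + current


def handle_utterance(transcript, low=1, high=1000):
    """Decide what the app should do with one recognition result."""
    value = parse_spoken_number(transcript)
    if value is None:
        return ("reprompt", None)      # e.g. "I don't know"
    if not low <= value <= high:
        return ("out_of_range", value)
    return ("ok", value)
```

For example, `handle_utterance("three hundred and forty-two")` yields `("ok", 342)`, while `handle_utterance("hold on a second")` yields `("reprompt", None)` so the app can ask again instead of failing.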

To improve accuracy for numbers specifically, you can use the word hints feature (speech contexts) of the Google Speech API. Add the digits and any other words you need as hints, and Google will recognize them much more accurately. Amazon Transcribe has a similar feature, which it calls "custom vocabulary".
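As a rough illustration, hints are passed to Google in the `speechContexts` field of the recognition config. The sketch below only builds the JSON body for the v1 REST `speech:recognize` method and does not send it. The helper name `build_recognize_request`, the 16 kHz LINEAR16 audio format, and the phrase list are assumptions made for the example.

```python
import base64
import json


def build_recognize_request(audio_bytes, phrases,
                            language_code="en-US", sample_rate_hz=16000):
    """Build a speech:recognize request body with phrase hints attached."""
    return {
        "config": {
            "encoding": "LINEAR16",            # assumed raw 16-bit PCM audio
            "sampleRateHertz": sample_rate_hz,
            "languageCode": language_code,
            # The hints: words and phrases the recognizer should favor.
            "speechContexts": [{"phrases": list(phrases)}],
        },
        # Inline audio must be base64-encoded in the JSON body.
        "audio": {"content": base64.b64encode(audio_bytes).decode("ascii")},
    }


# Number words plus the non-number replies the app still wants to catch.
hints = ["one", "two", "three", "twenty", "hundred", "thousand",
         "I don't know", "hold on a second"]
body = build_recognize_request(b"\x00\x01\x02\x03", hints)
payload = json.dumps(body)  # ready to POST with an authenticated client
```

Including the expected non-number replies in the hints keeps them recognizable, which matters for the fallback handling described earlier in this answer.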

If you want an offline API, you can certainly try Kaldi. You can adapt Kaldi's vocabulary to numbers to improve accuracy, and the result will be much better than the Google API.
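Adapting the vocabulary usually starts from a text corpus of the phrases you expect. The sketch below is plain Python with no Kaldi calls. It generates the English wording of every number in a range and the resulting word list, which could then feed whatever language-model or grammar tooling you use. The function names and the omission of "and" in the wording are assumptions for illustration.

```python
ONES = ["", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen",
        "fourteen", "fifteen", "sixteen", "seventeen", "eighteen",
        "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty",
        "seventy", "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 1 to 9999 in English words."""
    if not 1 <= n <= 9999:
        raise ValueError("n must be between 1 and 9999")
    parts = []
    thousands, rest = divmod(n, 1000)
    if thousands:
        parts += [ONES[thousands], "thousand"]
    hundreds, rest = divmod(rest, 100)
    if hundreds:
        parts += [ONES[hundreds], "hundred"]
    if rest >= 20:
        tens, ones = divmod(rest, 10)
        parts.append(TENS[tens])
        if ones:
            parts.append(ONES[ones])
    elif rest:
        parts.append(ONES[rest])
    return " ".join(parts)


def build_corpus(low=1, high=1000):
    """One training sentence per number in the range."""
    return [number_to_words(n) for n in range(low, high + 1)]


def vocabulary(corpus):
    """Sorted list of the distinct words used in the corpus."""
    return sorted({word for line in corpus for word in line.split()})


corpus = build_corpus()
words = vocabulary(corpus)
```

The whole 1 to 1000 range needs only 29 distinct words, which is why a restricted vocabulary can be so effective. In practice you would also add the non-number phrases users are likely to say, so the recognizer does not force every utterance into a number.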