How to programmatically train the SpeechRecognitionEngine and convert an audio file to text in C# or VB.NET

Question

cMi*_*nor 11 .net c# vb.net speech-recognition

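Is it possible to train the recognizer programmatically by feeding it .wav files instead of speaking into a microphone?

If so, how? Currently I have code that performs recognition on the audio in the file 0.wav and writes the recognized text to the console.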

Imports System.IO
Imports System.Speech.Recognition
Imports System.Speech.AudioFormat

Namespace SampleRecognition
    Class Program
        Shared completed As Boolean

        Public Shared Sub Main(ByVal args As String())
            Using recognizer As New SpeechRecognitionEngine()
                Dim dictation As Grammar = New DictationGrammar()
                dictation.Name = "Dictation Grammar"
                recognizer.LoadGrammar(dictation)
                ' Configure the input to the recognizer.
                recognizer.SetInputToWaveFile("C:\Users\ME\v02\0.wav")

                ' Attach event handlers for the results of recognition.
                AddHandler recognizer.SpeechRecognized, AddressOf recognizer_SpeechRecognized
                AddHandler recognizer.RecognizeCompleted, AddressOf recognizer_RecognizeCompleted

                ' Perform recognition on the entire file.
                Console.WriteLine("Starting asynchronous recognition...")
                completed = False
                recognizer.RecognizeAsync()
                ' Keep the console window open.
                While Not completed
                    Console.ReadLine()
                End While
                Console.WriteLine("Done.")
            End Using

            Console.WriteLine()
            Console.WriteLine("Press any key to exit...")
            Console.ReadKey()
        End Sub

        ' Handle the SpeechRecognized event.
        Private Shared Sub recognizer_SpeechRecognized(ByVal sender As Object, ByVal e As SpeechRecognizedEventArgs)
            If e.Result IsNot Nothing AndAlso e.Result.Text IsNot Nothing Then
                Console.WriteLine("  Recognized text =  {0}", e.Result.Text)
            Else
                Console.WriteLine("  Recognized text not available.")
            End If
        End Sub

        ' Handle the RecognizeCompleted event.
        Private Shared Sub recognizer_RecognizeCompleted(ByVal sender As Object, ByVal e As RecognizeCompletedEventArgs)
            If e.[Error] IsNot Nothing Then
                Console.WriteLine("  Error encountered, {0}: {1}", e.[Error].[GetType]().Name, e.[Error].Message)
            End If
            If e.Cancelled Then
                Console.WriteLine("  Operation cancelled.")
            End If
            If e.InputStreamEnded Then
                Console.WriteLine("  End of stream encountered.")
            End If
            completed = True
        End Sub
    End Class
End Namespace

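Edit

I know that the training wizard is useful for this.

It is reached by opening Speech Recognition: click the Start button -> Control Panel -> Ease of Access -> Speech Recognition.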

How can I customize speech recognition with custom wav or even mp3 files?

When the training wizard (the Control Panel training UI) is used, the training files are stored in {AppData}\Local\Microsoft\Speech\Files\TrainingAudio.
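To see what the wizard has written on a given machine, the folder can be listed from code. This is a minimal VB.NET sketch; it assumes that {AppData}\Local corresponds to Environment.SpecialFolder.LocalApplicationData, and the module name is hypothetical.

```vbnet
Imports System
Imports System.IO

Module ListTrainingAudio
    Sub Main()
        ' Assumption: {AppData}\Local maps to the LocalApplicationData special folder.
        Dim localAppData As String = Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData)
        Dim trainingDir As String = Path.Combine(localAppData, "Microsoft\Speech\Files\TrainingAudio")

        If Not Directory.Exists(trainingDir) Then
            Console.WriteLine("No training audio folder found at {0}", trainingDir)
            Return
        End If

        ' Print each training file with its size in bytes.
        For Each filePath As String In Directory.GetFiles(trainingDir)
            Dim info As New FileInfo(filePath)
            Console.WriteLine("{0}  ({1} bytes)", info.Name, info.Length)
        Next
    End Sub
End Module
```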

How can I use or create custom training instead of going through the training wizard?

The Speech Control Panel creates registry entries for the training audio files under the key HKCU\Software\Microsoft\Speech\RecoProfiles\Tokens\{ProfileGUID}\{00000000-0000-0000-0000-0000000000000000}\Files
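The profile keys can be inspected from code before deciding what needs to be copied. The sketch below only reads the registry; it assumes the profiles live under the Tokens subkey of RecoProfiles, and the module name is hypothetical.

```vbnet
Imports System
Imports Microsoft.Win32

Module ListRecoProfiles
    Sub Main()
        ' Assumption: recognition profiles are stored under this HKCU key.
        Const profilesPath As String = "Software\Microsoft\Speech\RecoProfiles\Tokens"

        Using tokens As RegistryKey = Registry.CurrentUser.OpenSubKey(profilesPath)
            If tokens Is Nothing Then
                Console.WriteLine("Key not found: HKCU\{0}", profilesPath)
                Return
            End If

            ' Each subkey name is a profile GUID; its default value is the profile name, if set.
            For Each profileId As String In tokens.GetSubKeyNames()
                Using profileKey As RegistryKey = tokens.OpenSubKey(profileId)
                    Console.WriteLine("{0}  {1}", profileId, profileKey.GetValue(""))
                    For Each childName As String In profileKey.GetSubKeyNames()
                        Console.WriteLine("    {0}", childName)
                    Next
                End Using
            Next
        End Using
    End Sub
End Module
```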

Do registry entries created by code have to be placed there as well?

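The reason for doing this is that I want to customize the training with my own wav files and a list of words and phrases, and then transfer everything to other systems.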

Answer 1

小智 5

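Training SAPI from C# is certainly possible. You can use the SpeechLib wrapper around SAPI to access the training-mode APIs from C#. @Eric Brown described the process in his answer:

Create an inproc recognizer and bind the appropriate audio input.
Make sure you retain the audio for your recognitions; you will need it later.
Create a grammar that contains the text to train.
Set the grammar's state to pause the recognizer when a recognition occurs. (This also helps when training from an audio file.)

When a recognition occurs:
Get the recognized text and the retained audio.
Create a stream object using CoCreateInstance(CLSID_SpStream).
Create a training audio file using ISpRecognizer::GetObjectToken and ISpObjectToken::GetStorageFileName, and bind it to the stream (using ISpStream::BindToFile).
Copy the retained audio into the stream object.
QI the stream object for the ISpTranscript interface, and use ISpTranscript::AppendTranscript to add the recognized text to the stream.
Update the grammar for the next utterance, resume the recognizer, and repeat until you run out of training text.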
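The first half of these steps (a grammar that contains the training text, recognition from a file, retained audio) can be approximated with the managed System.Speech API. The following is a minimal VB.NET sketch under that assumption; CollectSample, its parameters, the phrase, and the output file names are hypothetical. It only gathers audio and transcript pairs. It does not register them with the SAPI profile, because that part needs the native ISpStream and ISpTranscript interfaces listed above.

```vbnet
Imports System
Imports System.IO
Imports System.Speech.Recognition

Module TrainingSampleCollector
    ' Hypothetical helper: recognizes one WAV file against a single expected phrase
    ' and saves the retained audio together with a transcript file.
    Function CollectSample(wavPath As String, expectedPhrase As String, outputBase As String) As Boolean
        Using recognizer As New SpeechRecognitionEngine()
            ' Grammar that contains only the text to train.
            ' Note: the grammar culture must match the recognizer culture, or LoadGrammar throws.
            Dim phraseGrammar As New Grammar(New GrammarBuilder(expectedPhrase))
            phraseGrammar.Name = "Training phrase"
            recognizer.LoadGrammar(phraseGrammar)
            recognizer.SetInputToWaveFile(wavPath)

            ' Synchronous recognition of one utterance; returns Nothing if nothing was recognized.
            Dim result As RecognitionResult = recognizer.Recognize()
            If result Is Nothing OrElse result.Audio Is Nothing Then Return False

            ' Keep the audio that produced the recognition, plus the recognized text.
            Using audioOut As New FileStream(outputBase & ".wav", FileMode.Create)
                result.Audio.WriteToWaveStream(audioOut)
            End Using
            File.WriteAllText(outputBase & ".txt", result.Text)
            Return True
        End Using
    End Function

    Sub Main()
        ' The phrase and the output name are placeholders for illustration.
        Dim saved As Boolean = CollectSample("C:\Users\ME\v02\0.wav", "hello world", "sample0")
        Console.WriteLine(If(saved, "Sample saved.", "Phrase not recognized."))
    End Sub
End Module
```

Restricting the grammar to the expected phrase plays the role of the "grammar that contains the text to train" step; a dictation grammar would accept any text and so would not confirm that the audio matches the training phrase.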

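Another option may be to train SAPI once with the desired output, then use code to get the profile and transfer it to the other systems. The following code returns an ISpeechObjectTokens object:

The GetProfiles method returns a selection of the available user speech profiles. Profiles are stored in the speech configuration database as a series of tokens, with each token representing one profile. GetProfiles retrieves all available profile tokens. The returned list is an ISpeechObjectTokens object. Additional or more detailed information about the tokens is available through the methods associated with ISpeechObjectTokens. The token search can be refined further with the RequiredAttributes and OptionalAttributes search attributes. Only tokens that match the specified RequiredAttributes search attributes are returned. Of the tokens that match the RequiredAttributes key, OptionalAttributes lists devices in the order in which they match OptionalAttributes. If no search attributes are supplied, all tokens are returned. If no audio devices match the criteria, GetAudioInputs returns an empty selection, that is, an ISpeechObjectTokens collection with an ISpeechObjectTokens::Count property of zero. For a list of SAPI 5 defined attributes, see the Object Tokens and Registry Settings white paper.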

Public SharedRecognizer As SpSharedRecognizer
Public theRecognizers As ISpeechObjectTokens

Private Sub Command1_Click()
    On Error GoTo EH

    Dim currentProfile As SpObjectToken
    Dim i As Integer
    Dim T As String
    Dim TokenObject As ISpeechObjectToken
    Set currentProfile = SharedRecognizer.Profile

    For i = 0 To theRecognizers.Count - 1
        Set TokenObject = theRecognizers.Item(i)

        If tokenObject.Id <> currentProfile.Id Then
            Set SharedRecognizer.Profile = TokenObject
            T = "New Profile installed: "
            T = T & SharedRecognizer.Profile.GetDescription
            Exit For
        Else
            T = "No new profile has been installed."
        End If
    Next i

    MsgBox T, vbInformation

EH:
    If Err.Number Then ShowErrMsg
End Sub

Private Sub Form_Load()
    On Error GoTo EH

    Const NL = vbNewLine
    Dim i, idPosition As Long
    Dim T As String
    Dim TokenObject As SpObjectToken

    Set SharedRecognizer = CreateObject("SAPI.SpSharedRecognizer")
    Set theRecognizers = SharedRecognizer.GetProfiles

    For i = 0 To theRecognizers.Count - 1
        Set TokenObject = theRecognizers.Item(i)
        T = T & TokenObject.GetDescription & "--" & NL & NL
        idPosition = InStrRev(TokenObject.Id, "\")
        T = T & Mid(TokenObject.Id, idPosition + 1) & NL
    Next i

    MsgBox T, vbInformation

EH:
    If Err.Number Then ShowErrMsg
End Sub

Private Sub ShowErrMsg()

    ' Declare identifiers:
    Dim T As String

    T = "Desc: " & Err.Description & vbNewLine
    T = T & "Err #: " & Err.Number
    MsgBox T, vbExclamation, "Run-Time Error"
    End

End Sub
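The listing above is VB6. In VB.NET, a roughly equivalent profile listing could use late binding against the same SAPI automation object. This is only a sketch: it assumes SAPI 5 is installed and that Option Strict is off, the module name is hypothetical, and it uses only members that appear in the VB6 code (GetProfiles, Count, Item, GetDescription, Id).

```vbnet
Option Strict Off

Imports System
Imports Microsoft.VisualBasic

Module ListSpeechProfiles
    Sub Main()
        ' Late-bound COM object, created the same way as in the VB6 Form_Load above.
        Dim sharedRecognizer As Object = CreateObject("SAPI.SpSharedRecognizer")
        Dim profiles As Object = sharedRecognizer.GetProfiles()
        Dim profileCount As Integer = CInt(profiles.Count)

        For i As Integer = 0 To profileCount - 1
            Dim token As Object = profiles.Item(i)
            Dim id As String = CStr(token.Id)
            ' Show the description and the last segment of the token id (the profile GUID).
            Console.WriteLine("{0} -- {1}", token.GetDescription(), id.Substring(id.LastIndexOf("\"c) + 1))
        Next
    End Sub
End Module
```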

Answer 2

Ben*_*nny 2

You can generate custom training using the SAPI engine (rather than the managed API).

Here is a link on how to do this (it is a bit vague, though).
