spu*_*erh 8 c# sockets streaming speech-recognition sapi
我试图从TCP套接字在C#中进行"流式"语音识别.我遇到的问题是SpeechRecognitionEngine.SetInputToAudioStream()似乎需要一个可以寻找的定义长度的Stream.现在,我能想到的唯一方法就是在更多输入进来时在MemoryStream上重复运行识别器.
这里有一些代码来说明:
SpeechRecognitionEngine appRecognizer = new SpeechRecognitionEngine();
System.Speech.AudioFormat.SpeechAudioFormatInfo formatInfo = new System.Speech.AudioFormat.SpeechAudioFormatInfo(8000, System.Speech.AudioFormat.AudioBitsPerSample.Sixteen, System.Speech.AudioFormat.AudioChannel.Mono);
NetworkStream stream = new NetworkStream(socket,true);
appRecognizer.SetInputToAudioStream(stream, formatInfo);
// At the line above a "NotSupportedException" complaining that "This stream does not support seek operations."
Run Code Online (Sandbox Code Playgroud)
有谁知道怎么解决这个问题?它必须支持某种类型的流输入,因为它使用SetInputToDefaultAudioDevice()与麦克风一起工作正常.
谢谢,肖恩
Sea*_*ean 11
我通过覆盖流类来获得实时语音识别:
class SpeechStreamer : Stream
{
private AutoResetEvent _writeEvent;
private List<byte> _buffer;
private int _buffersize;
private int _readposition;
private int _writeposition;
private bool _reset;
public SpeechStreamer(int bufferSize)
{
_writeEvent = new AutoResetEvent(false);
_buffersize = bufferSize;
_buffer = new List<byte>(_buffersize);
for (int i = 0; i < _buffersize;i++ )
_buffer.Add(new byte());
_readposition = 0;
_writeposition = 0;
}
public override bool CanRead
{
get { return true; }
}
public override bool CanSeek
{
get { return false; }
}
public override bool CanWrite
{
get { return true; }
}
public override long Length
{
get { return -1L; }
}
public override long Position
{
get { return 0L; }
set { }
}
public override long Seek(long offset, SeekOrigin origin)
{
return 0L;
}
public override void SetLength(long value)
{
}
public override int Read(byte[] buffer, int offset, int count)
{
int i = 0;
while (i<count && _writeEvent!=null)
{
if (!_reset && _readposition >= _writeposition)
{
_writeEvent.WaitOne(100, true);
continue;
}
buffer[i] = _buffer[_readposition+offset];
_readposition++;
if (_readposition == _buffersize)
{
_readposition = 0;
_reset = false;
}
i++;
}
return count;
}
public override void Write(byte[] buffer, int offset, int count)
{
for (int i = offset; i < offset+count; i++)
{
_buffer[_writeposition] = buffer[i];
_writeposition++;
if (_writeposition == _buffersize)
{
_writeposition = 0;
_reset = true;
}
}
_writeEvent.Set();
}
public override void Close()
{
_writeEvent.Close();
_writeEvent = null;
base.Close();
}
public override void Flush()
{
}
}
Run Code Online (Sandbox Code Playgroud)
...并使用它的实例作为SetInputToAudioStream方法的流输入.一旦流返回长度或返回的计数小于请求的数量,识别引擎就认为输入已完成.这将设置一个永不完成的循环缓冲区.
您是否尝试过将网络流包装在 System.IO.BufferedStream 中?
NetworkStream netStream = new NetworkStream(socket,true);
BufferedStream buffStream = new BufferedStream(netStream, 8000*16*1); // buffers 1 second worth of data
appRecognizer.SetInputToAudioStream(buffStream, formatInfo);
Run Code Online (Sandbox Code Playgroud)