Convert multi-channel PyAudio into NumPy array

end*_*ith 7 python numpy pyaudio

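All the examples I can find are mono, with CHANNELS = 1. How do you read stereo or multichannel input using the callback method in PyAudio and convert it into a 2D NumPy array or multiple 1D arrays?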

For mono input, something like this works:

import numpy as np
import pyaudio

def callback(in_data, frame_count, time_info, status):
    global result
    global result_waiting

    if in_data:
        # np.frombuffer replaces np.fromstring, whose binary mode is deprecated
        result = np.frombuffer(in_data, dtype=np.float32)
        result_waiting = True
    else:
        print('no input')

    return None, pyaudio.paContinue

stream = p.open(format=pyaudio.paFloat32,
                channels=1,
                rate=fs,
                output=False,
                input=True,
                frames_per_buffer=fs,
                stream_callback=callback)

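But this doesn't work for stereo input: the result array is twice as long, so I assume the channels are interleaved or something, but I can't find any documentation on this.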

end*_*ith 13

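It appears to be interleaved sample by sample, with the left channel first. With a signal on the left channel input and silence on the right channel, I get: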

result = [0.2776, -0.0002,  0.2732, -0.0002,  0.2688, -0.0001,  0.2643, -0.0003,  0.2599, ...

So to separate it out into a stereo stream, reshape it into a 2D array:

result = np.frombuffer(in_data, dtype=np.float32)
result = np.reshape(result, (frames_per_buffer, 2))

Now to access the left channel, use result[:, 0], and for the right channel, use result[:, 1].
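To check the reshape without an audio device, here is a minimal sketch that builds a fake stereo buffer by hand and splits it again. The sample values and variable names are made up for illustration, and it assumes the L0, R0, L1, R1, ... layout observed above.

```python
import numpy as np

# Hypothetical test data: four frames of stereo audio.
left = np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32)
right = np.array([-0.1, -0.2, -0.3, -0.4], dtype=np.float32)

# Emulate the byte layout seen above: L0, R0, L1, R1, ... as raw float32.
in_data = np.column_stack((left, right)).tobytes()

# -1 lets NumPy infer the frame count from the buffer length.
frames = np.frombuffer(in_data, dtype=np.float32).reshape(-1, 2)

# Either keep the 2D array or pull out one 1D array per channel.
left_out = frames[:, 0]
right_out = frames[:, 1]

print(frames.shape)
```

Using -1 for the first dimension avoids depending on frames_per_buffer, which can be handy if a buffer ever arrives shorter than expected.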

def decode(in_data, channels):
    """
    Convert a byte stream into a 2D numpy array with 
    shape (chunk_size, channels)

    Samples are interleaved, so for a stereo stream with left channel 
    of [L0, L1, L2, ...] and right channel of [R0, R1, R2, ...], the output 
    is ordered as [L0, R0, L1, R1, ...]
    """
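    # TODO: handle data type as parameter, convert between pyaudio/numpy types
    # np.frombuffer replaces np.fromstring, whose binary mode is deprecated
    result = np.frombuffer(in_data, dtype=np.float32)

    # Integer division: np.reshape rejects a float dimension on Python 3
    chunk_length, remainder = divmod(len(result), channels)
    assert remainder == 0

    result = np.reshape(result, (chunk_length, channels))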
    return result


def encode(signal):
    """
    Convert a 2D numpy array into a byte stream for PyAudio

    Signal should be a numpy array with shape (chunk_size, channels)
    """
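    # Row-major flatten of (chunk_size, channels) yields [L0, R0, L1, R1, ...]
    interleaved = signal.flatten()

    # TODO: handle data type as parameter, convert between pyaudio/numpy types
    # tobytes replaces the deprecated ndarray.tostring
    out_data = interleaved.astype(np.float32).tobytes()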
    return out_data
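As one possible way to address the TODO about data types, here is a rough sketch that takes the NumPy dtype as a parameter and checks that a round trip is lossless. The names decode_frames and encode_frames are hypothetical, and matching the dtype to the stream's PyAudio format (for example np.int16 for paInt16) is left to the caller.

```python
import numpy as np

def decode_frames(in_data, channels, dtype=np.float32):
    """Bytes -> array of shape (chunk_size, channels). Hypothetical helper."""
    samples = np.frombuffer(in_data, dtype=dtype)
    if samples.size % channels:
        raise ValueError("buffer length is not a multiple of the channel count")
    return samples.reshape(-1, channels)

def encode_frames(signal, dtype=np.float32):
    """Array of shape (chunk_size, channels) -> interleaved bytes."""
    # ascontiguousarray forces row-major memory, so frames stay interleaved.
    return np.ascontiguousarray(signal, dtype=dtype).tobytes()

# Round trip with 16-bit stereo samples (made-up values).
signal = np.array([[1, -1], [2, -2], [3, -3]], dtype=np.int16)
raw = encode_frames(signal, dtype=np.int16)
restored = decode_frames(raw, channels=2, dtype=np.int16)

print(np.array_equal(signal, restored))
```

Note that np.frombuffer on a bytes object returns a read-only view, so call .copy() on the result before modifying samples in place.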

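  • What does "interleaved" mean? I played around with this, and the `flatten` function does turn out to be a solution, but `flatten` with no arguments flattens a 2D array to 1D with all the values of the first row ahead of all the values of the second row. In the [`numpy` documentation](https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.flatten.html) I found that you can pass the character `'F'` as the first argument, and it then flattens the way we expect. Is that equivalent to your `interleaved.astype(np.float32).tostring()` call? If so, it looks like the simplest solution. (2 upvotes)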
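On the `flatten` question, which order is right seems to depend on how the array is laid out. A small sketch with made-up sample values: with shape (chunk_size, channels), as in the answer, each row is one frame and the default row-major `flatten()` already interleaves. With shape (channels, chunk_size), `flatten('F')` gives the same result.

```python
import numpy as np

# Shape (frames, channels): left = 10, 11, 12 and right = 20, 21, 22.
frames_first = np.array([[10, 20],
                         [11, 21],
                         [12, 22]], dtype=np.float32)

# Shape (channels, frames): one row per channel.
channels_first = frames_first.T

# Row-major flatten walks frame by frame: L0, R0, L1, R1, ...
interleaved_c = frames_first.flatten()

# Column-major flatten of the transposed layout visits the same order.
interleaved_f = channels_first.flatten('F')

# Row-major flatten of the channels-first layout is NOT interleaved.
planar = channels_first.flatten()

print(interleaved_c)
print(planar)
```

So `'F'` is only needed when the channels are stored as rows. For the (chunk_size, channels) layout used in the answer, the plain `flatten()` is the one that produces the interleaved order.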