Filtering audio signals in TensorFlow

Question

Tri*_*ops 7 python signal-processing dataset scipy tensorflow

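I am building an audio-based deep learning model. As part of the preprocessing, I want to augment the audio in my dataset. One augmentation I want to do is to apply an RIR (room impulse response) function. I am working with Python 3.9.5 and TensorFlow 2.8.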

In Python, the standard way to do this, if the RIR is given as a finite impulse response (FIR) of n taps, is to use SciPy's lfilter:

import numpy as np
from scipy import signal
import soundfile as sf

h = np.load("rir.npy")
x, fs = sf.read("audio.wav")

y = signal.lfilter(h, 1, x)

Running this in a loop over all the files may take a long time. Doing it with the TensorFlow map utility on a TensorFlow dataset:

# define filter function
def h_filt(audio, label):
    h = np.load("rir.npy")
    x = audio.numpy()
    y = signal.lfilter(h, 1, x)
    return tf.convert_to_tensor(y, dtype=tf.float32), label

# apply it via TF map on dataset
aug_ds = ds.map(h_filt)

Using tf.numpy_function:

tf_h_filt = tf.numpy_function(h_filt, [audio, label], [tf.float32, tf.string])

# apply it via TF map on dataset
aug_ds = ds.map(tf_h_filt)
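For comparison, here is a minimal sketch of how the tf.numpy_function wrapping is usually arranged: the call sits inside the function passed to map, so it runs once per dataset element. The 3-tap filter, the toy dataset, and the name h_filt_np are placeholders for illustration, not the real rir.npy or ds.

```python
import numpy as np
import tensorflow as tf
from scipy import signal

# Placeholder for np.load("rir.npy"): a hypothetical 3-tap FIR filter.
h = np.array([1.0, 0.5, 0.25], dtype=np.float32)

def h_filt_np(audio):
    # Plain Python/NumPy code; `audio` arrives here as a NumPy array.
    return signal.lfilter(h, 1, audio).astype(np.float32)

def tf_h_filt(audio, label):
    # tf.numpy_function is invoked inside the mapped function.
    y = tf.numpy_function(h_filt_np, [audio], tf.float32)
    y.set_shape(audio.shape)  # numpy_function loses static shape information
    return y, label

# Toy dataset standing in for `ds`: 4 clips of 16 samples with string labels.
audio = np.random.randn(4, 16).astype(np.float32)
labels = np.array(["a", "b", "c", "d"])
ds = tf.data.Dataset.from_tensor_slices((audio, labels))

aug_ds = ds.map(tf_h_filt)
```

The filtering itself still runs in SciPy under the Python interpreter, so this arrangement mainly changes how the function is hooked into the pipeline, not how fast the filter is.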

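I have two questions:

Is this approach correct and fast enough (less than a minute for 50,000 files)?
Is there a faster way to do it? For example, by replacing the SciPy function with a built-in TensorFlow function. I didn't find a TensorFlow equivalent of lfilter or of SciPy's convolve.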

Answer 1

Bob*_*Bob 4

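Here is one way you can do it.

Note that the TensorFlow function is designed to receive batches of inputs with multiple channels, and the filter can have multiple input channels and multiple output channels. Let N be the batch size, I the number of input channels, F the filter width, L the input width, and O the number of output channels. Using padding='SAME', it maps an input of shape (N, L, I) and a filter of shape (F, I, O) to an output of shape (N, L, O).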
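As a quick check of that shape rule, the sketch below runs tf.nn.conv1d on random tensors. The sizes chosen for N, L, I, F and O are arbitrary and only for illustration.

```python
import numpy as np
import tensorflow as tf

# Arbitrary illustrative sizes: batch, input width, input channels,
# filter width, output channels.
N, L, I, F, O = 2, 50, 3, 5, 4

x = tf.constant(np.random.randn(N, L, I).astype(np.float32))  # input (N, L, I)
w = tf.constant(np.random.randn(F, I, O).astype(np.float32))  # filter (F, I, O)

# With stride 1 and padding='SAME' the time axis keeps its length L.
y = tf.nn.conv1d(x, w, stride=1, padding='SAME', data_format='NWC')
print(y.shape)
```

For a mono signal filtered by a single RIR, I and O are both 1, which is why the signal and the filter are reshaped to (1, L, 1) and (F, 1, 1).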

import numpy as np
from scipy import signal
import tensorflow as tf

# data to compare the two approaches
x = np.random.randn(100)
h = np.random.randn(11)

# reference output from SciPy
y_lfilt = signal.lfilter(h, 1, x)

# Since the denominator of your filter transfer function is 1,
# the output of lfilter matches the (truncated) convolution
y_np = np.convolve(h, x)
assert np.allclose(y_lfilt, y_np[:len(y_lfilt)])

# now let's do the convolution using tensorflow
y_tf = tf.nn.conv1d(
    # x must be padded with half of the size of h
    # to use padding 'SAME'
    np.pad(x, len(h) // 2).reshape(1, -1, 1), 
    # the time axis of h must be flipped
    h[::-1].reshape(-1, 1, 1), # a 1x1 matrix of filters
    stride=1, 
    padding='SAME', 
    data_format='NWC')

assert np.allclose(y_lfilt, np.squeeze(y_tf)[:len(y_lfilt)])
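The same idea can be moved into a tf.data pipeline. The sketch below is one possible arrangement, not a tuned implementation: it left-pads each clip with F-1 zeros and uses padding='VALID', so the padding is explicit and does not depend on how 'SAME' splits it. The random filter, the toy dataset, and the name fir_filt are assumptions standing in for rir.npy and ds, and all clips are assumed to have the same length so they can be batched.

```python
import numpy as np
import tensorflow as tf
from scipy import signal

# Hypothetical stand-ins for the RIR and the dataset from the question.
h = np.random.randn(12).astype(np.float32)
audio = np.random.randn(8, 200).astype(np.float32)
labels = np.array([f"clip{i}" for i in range(8)])
ds = tf.data.Dataset.from_tensor_slices((audio, labels))

# Flip the taps once so that conv1d (a cross-correlation) performs a true
# convolution; the kernel has shape (F, 1, 1).
kernel = tf.constant(h[::-1].copy().reshape(-1, 1, 1))

def fir_filt(batch, label):
    # batch has shape (N, L). Left-pad the time axis with F-1 zeros so that
    # output sample n only depends on x[n-F+1] ... x[n], as lfilter does.
    padded = tf.pad(batch, [[0, 0], [len(h) - 1, 0]])
    y = tf.nn.conv1d(padded[..., tf.newaxis], kernel, stride=1, padding='VALID')
    return tf.squeeze(y, axis=-1), label

aug_ds = ds.batch(4).map(fir_filt)

# Compare against SciPy on the same data.
out = np.concatenate([yb for yb, _ in aug_ds.as_numpy_iterator()])
ref = signal.lfilter(h, 1, audio, axis=-1)
assert np.allclose(out, ref, atol=1e-3)
```

Batching before map lets one conv1d call filter several clips at once, and the whole step stays inside the TensorFlow graph instead of calling back into Python.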
