Using TensorFlow's STFT function correctly

Question

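I'm trying to build a Plot Spectrum of an audio sample, similar to the one Audacity produces. According to Audacity's wiki page, Plot Spectrum (example attached) does the following:

Plot Spectrum takes the audio in blocks of 'Size' samples, does the FFT, and averages all the blocks together.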
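The block-and-average procedure quoted above can be sketched in plain NumPy. This is only an illustration, not Audacity's actual implementation: the helper name `averaged_spectrum`, the choice to average power rather than magnitude, the periodic Hann window, and dropping the trailing partial block are all assumptions made for the example.

```python
import numpy as np

def averaged_spectrum(samples, size=512):
    """Average the FFT power over consecutive, non-overlapping blocks.

    Hypothetical helper: averaging power (not magnitude) and dropping the
    trailing partial block are assumptions, not confirmed Audacity behavior.
    """
    samples = np.asarray(samples, dtype=np.float64)
    num_blocks = len(samples) // size
    if num_blocks == 0:
        raise ValueError("need at least `size` samples")
    # Reshape the usable part of the signal into [num_blocks, size].
    blocks = samples[:num_blocks * size].reshape(num_blocks, size)
    # Periodic Hann window: a symmetric window of size + 1 minus its last point.
    window = np.hanning(size + 1)[:-1]
    spectra = np.fft.rfft(blocks * window, axis=-1)
    power = np.abs(spectra) ** 2
    # One averaged value per frequency bin (size // 2 + 1 bins).
    return power.mean(axis=0)

# Usage: a 500 Hz tone sampled at 4000 Hz, matching the question's sample rate.
sample_rate = 4000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 500.0 * t)
spectrum = averaged_spectrum(tone, size=512)
freqs = np.fft.rfftfreq(512, d=1.0 / sample_rate)
peak_hz = freqs[np.argmax(spectrum)]
print(spectrum.shape, peak_hz)  # (257,) 500.0
```

The peak lands exactly on 500 Hz because the bin spacing is 4000 / 512 = 7.8125 Hz, which divides 500 Hz evenly (bin 64).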

I figured I would use the STFT functionality that TensorFlow recently added.

I'm using audio blocks of size 512, and my code is as follows:

import functools
import tensorflow as tf  # TensorFlow 1.x (uses tf.contrib)

audio_binary = tf.read_file(audio_file)
waveform = tf.contrib.ffmpeg.decode_audio(
    audio_binary,
    file_format="wav",
    samples_per_second=4000,
    channel_count=1
)

stft = tf.contrib.signal.stft(
    waveform,
    512,     # frame_length
    512,     # frame_step
    fft_length=512,
    window_fn=functools.partial(tf.contrib.signal.hann_window, periodic=True), # matches audacity
    pad_end=True,
    name="STFT"
)


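But where I expected an FFT result for each frame (512 samples), the result of stft is just an empty array.

Is there something wrong with the way I'm calling it?

I have verified that the waveform audio data is read correctly by using just the regular tf.fft function.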
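For reference, the per-frame result the question expects can be worked out in NumPy. This is a sketch under stated assumptions: the 4000-sample signal is made up for illustration, and the last frame is zero-padded, which is what `pad_end=True` is documented to do.

```python
import numpy as np

frame_length = 512   # also used as frame_step and fft_length, as in the question
num_samples = 4000   # hypothetical: one second of audio at 4000 Hz

signal = np.random.default_rng(0).standard_normal(num_samples)

# Ceiling division: the final, partially filled frame is kept and zero-padded.
num_frames = -(-num_samples // frame_length)
padded = np.pad(signal, (0, num_frames * frame_length - num_samples))

# Non-overlapping frames along the last axis, then a real FFT per frame.
frames = padded.reshape(num_frames, frame_length)
spectra = np.fft.rfft(frames, axis=-1)

expected_shape = spectra.shape
print(expected_shape)  # (8, 257)
```

Each frame yields `fft_length // 2 + 1 = 257` complex bins, so a correctly framed one-second signal should give a result of shape `[8, 257]` rather than an empty array.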

Answer 1

Eli*_*ski 0

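import functools
import tensorflow as tf  # TensorFlow 1.x; tf.contrib was removed in 2.x

# Values taken from the question; adjust them to your .wav file.
sample_rate = 4000
frame_length = 512
frame_step = 512
fft_length = 512

audio_file = tf.placeholder(tf.string)

audio_binary = tf.read_file(audio_file)
waveform = tf.contrib.ffmpeg.decode_audio(
    audio_binary,
    file_format="wav",
    samples_per_second=sample_rate,  # sample rate of the .wav file
    channel_count=1                  # number of audio channels in the .wav file
)

# decode_audio returns [samples, channels], while stft frames the last axis
# ([..., samples]), so transpose to [channels, samples] first.
stft = tf.contrib.signal.stft(
    tf.transpose(waveform),
    frame_length,
    frame_step,
    fft_length=fft_length,
    window_fn=functools.partial(tf.contrib.signal.hann_window,
                                periodic=False),  # matches audacity
    pad_end=False,
    name="STFT"
)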
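The main change from the question's code is the `tf.transpose` call. A small NumPy sketch shows why the axis order matters when framing runs along the last axis. The helper `frame_last_axis` is hypothetical and only mimics framing without end padding (as with `pad_end=False`); the frame-count formula is an assumption about how such framing is usually defined, not a copy of TensorFlow's code.

```python
import numpy as np

def frame_last_axis(signal, frame_length, frame_step):
    """Cut the last axis into frames without end padding (hypothetical helper)."""
    num_samples = signal.shape[-1]
    # Number of complete frames that fit; never negative.
    num_frames = max(0, 1 + (num_samples - frame_length) // frame_step)
    starts = np.arange(num_frames) * frame_step
    # index[i, j] is the sample position of element j in frame i.
    index = starts[:, None] + np.arange(frame_length)[None, :]
    return signal[..., index]

# Layout of a decoded mono clip: [samples, channels].
decoded = np.zeros((4000, 1), dtype=np.float32)

# Framing the decoded layout treats the single channel value as the "signal".
frames_as_decoded = frame_last_axis(decoded, 512, 512)
# Framing the transposed layout, [channels, samples], sees all 4000 samples.
frames_transposed = frame_last_axis(decoded.T, 512, 512)
spectra = np.fft.rfft(frames_transposed, axis=-1)

print(frames_as_decoded.shape)  # (4000, 0, 512)
print(frames_transposed.shape)  # (1, 7, 512)
print(spectra.shape)            # (1, 7, 257)
```

With the untransposed layout, the last axis has length 1, so no 512-sample frame fits and the frame dimension is empty. After transposing, the same data gives seven full frames of 257 bins each.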


Archived:	8 years, 6 months ago
Views:	3267
Last recorded:	8 years, 4 months ago