Tag: soundfile

How can I remove the distortion introduced by librosa's Griffin-Lim?

I am doing the following:

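import librosa
import scipy.signal
import soundfile as sf

# samples, sample_rate, nperseg and overlap are assumed to be defined earlier
D = librosa.stft(samples, n_fft=nperseg,
                 hop_length=overlap, win_length=nperseg,
                 window=scipy.signal.windows.hamming)

# Keep only the magnitude; the phase is discarded here
spect, _ = librosa.magphase(D)

# Estimate a phase for the magnitude spectrogram with Griffin-Lim
audio_signal = librosa.griffinlim(spect, n_iter=1024,
                                  win_length=nperseg, hop_length=overlap,
                                  window=scipy.signal.windows.hamming)
print(audio_signal, audio_signal.shape)
sf.write('test.wav', audio_signal, sample_rate)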


This introduces noticeable distortion in the reconstructed audio signal. What can I do to improve it?
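One thing worth checking, stated as an assumption because the question does not show how `overlap` is computed: librosa's `hop_length` is the number of samples between successive frames, whereas scipy-style code usually stores the number of overlapping samples (`noverlap`). If `overlap` follows the scipy convention, passing it directly as `hop_length` changes the frame spacing. The helper `hop_from_overlap` and the example values below are hypothetical, a minimal sketch of the conversion only.

```python
def hop_from_overlap(nperseg, noverlap):
    """Convert a scipy-style overlap count into a librosa-style hop length."""
    if not 0 <= noverlap < nperseg:
        raise ValueError("noverlap must satisfy 0 <= noverlap < nperseg")
    # The hop is the part of each frame that is NOT shared with the next one.
    return nperseg - noverlap


# Hypothetical example values: 1024-sample frames with 75% overlap.
nperseg = 1024
noverlap = 768
hop_length = hop_from_overlap(nperseg, noverlap)
print(hop_length)  # 256
```

The resulting `hop_length` would then be passed to both `librosa.stft` and `librosa.griffinlim` so that analysis and reconstruction use the same frame spacing.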

python librosa soundfile

Sha*_*oon

2020-04-02

32 votes, 1 answer, 740 views

AttributeError: cffi library '(pyModulesPath)\_soundfile_data\libsndfile64bit.dll' has no function, constant or global variable named 'sf_wchar_open'

Whenever I try to use anything related to the librosa module, I get this error:

Traceback (most recent call last):
  File "C:\Users\User1\Documents\test3.py", line 36, in <module>
    x, Fs = librosa.load(fn_mp3, sr=None)
  File "C:\Program Files\Python38\lib\site-packages\librosa\core\audio.py", line 129, in load
    with sf.SoundFile(path) as sf_desc:
  File "C:\Program Files\Python38\lib\site-packages\soundfile.py", line 629, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "C:\Program Files\Python38\lib\site-packages\soundfile.py", line 1172, in _open
    openfunction = _snd.sf_wchar_open
AttributeError: cffi library 'C:\Program Files\Python38\lib\site-packages\_soundfile_data\libsndfile64bit.dll' has no function, constant or global variable named 'sf_wchar_open'


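Before this error appeared, I created a folder named _soundfile_data in site-packages, downloaded libsndfile64bit.dll from here, and added it to that folder; the error shown above then appeared. I searched SO for an answer but found no related question, and I cannot edit libsndfile64bit.dll, so I am stuck. I am using Windows …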

python audio python-3.x librosa soundfile

SF1*_*udy

2020-03-13

5 votes, 2 answers, 4470 views

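Getting soundfile.LibsndfileError: Error opening 'speech.wav': Format not recognised when passing a 2D numpy array to soundfile

I ran into an error while trying to generate audio from a tensor produced by an NVIDIA NeMo TTS model.

Here is the code: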

import soundfile as sf

from nemo.collections.tts.models import FastPitchModel
from nemo.collections.tts.models import HifiGanModel

spec_generator = FastPitchModel.from_pretrained("tts_en_fastpitch")
vocoder = HifiGanModel.from_pretrained(model_name="tts_hifigan")

text = "Just keep being true to yourself, if you're passionate about something go for it. Don't sacrifice anything, just have fun."
parsed = spec_generator.parse(text)
spectrogram = spec_generator.generate_spectrogram(tokens=parsed)
audio = vocoder.convert_spectrogram_to_audio(spec=spectrogram)
audio = audio.to('cpu').detach().numpy()

sf.write("speech.wav", audio, 22050)


I expected to get the audio file speech.wav.
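One possible cause, offered as an assumption and not a confirmed diagnosis: soundfile interprets a 2D array as (frames, channels). If the vocoder returns an array shaped (1, num_samples), it is read as a single frame with num_samples channels, which a WAV file cannot store. The sketch below uses a hypothetical helper, `to_soundfile_layout`, and a zero-filled stand-in array in place of real model output.

```python
import numpy as np


def to_soundfile_layout(audio):
    """Drop a leading batch axis of size 1 so mono audio becomes 1D.

    soundfile expects shape (frames,) for mono or (frames, channels)
    for multi-channel data.
    """
    audio = np.asarray(audio)
    if audio.ndim == 2 and audio.shape[0] == 1:
        return audio[0]  # (1, num_samples) -> (num_samples,)
    return audio


# Stand-in for the vocoder output (assumed shape: batch of 1, 22050 samples).
fake_audio = np.zeros((1, 22050), dtype=np.float32)
mono = to_soundfile_layout(fake_audio)
print(fake_audio.shape, "->", mono.shape)
```

With the variables from the question, the write call would then become `sf.write("speech.wav", to_soundfile_layout(audio), 22050)`.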

libsndfile python-3.x soundfile

Jac*_*iti

lucky-day

3 votes, 1 answer, 10,000 views