How to amplify sound in Python without distortion

The*_*rns 1 python audio

I'm trying to do a naive volume adjustment on a sound file. I'm using Python 2.7 and the following libraries:

import numpy as np
import scipy.io.wavfile as wv
import matplotlib.pyplot as plt
import pyaudio
import wave

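I tried two methods, in both cases trying to amplify the sound by a factor of 2, i.e. n=2. The first is a dynamic range limiter method adapted from here (http://bastibe.de/2012-11-02-real-time-signal-processing-in-python.html):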

def limiter(self, n):
    #best version so far
    signal=self.snd_array
    attack_coeff = 0.01
    framemax=2**15-1
    threshold=framemax
    for i in np.arange(len(signal)):
        #if amplitude value * amplitude gain factor is > threshold set an interval to decrease the amplitude
        if signal[i]*n > threshold:
            gain=1
            jmin=0
            jmax=0
            if i-100>0:
                jmin=i-100
            else:
                jmin=0
            if i+100<len(signal):
                jmax=i+100
            else:
                jmax=len(signal)
            for j in range(jmin,jmax):
                #target gain is amplitude factor times exponential to smoothly decrease the amp factor (n)
                target_gain = n*np.exp(-10*(j-jmin))
                gain = (gain*attack_coeff + target_gain*(1-attack_coeff))
                signal[j]=signal[j]*gain
        else:
            signal[i] = signal[i]*n
    print max(signal),min(signal)
    plt.figure(3)
    plt.plot(signal)
    return signal

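The second method is hard-knee compression: I reduce the amplitude of the sound values above a threshold, then amplify the whole signal by the amplitude gain factor.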

def compress(self,n):
    print 'start compress'
    threshold=2**15/n+1000
    #compress all values above the threshold, therefore limiting the audio amplitude range
    for i in np.arange(len(self.snd_array)):
        if abs(self.snd_array[i])>threshold:
            factor=1+(threshold-abs(self.snd_array[i]))/threshold
        else:
            factor=1.0
        #apply compression factor and amp gain factor (n)
        self.snd_array[i] = self.snd_array[i]*factor*n
    print np.min(self.snd_array),np.max(self.snd_array)
    plt.figure(2)
    plt.plot(self.snd_array,'k')
    return self.snd_array

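With both methods the file sounds distorted. At points where the amplitude approaches the threshold, the music sounds clipped and crackly. I think this is because it "flattens out" near the threshold. I tried applying an exponential in the limiter function, but even when I make it decay very quickly, it doesn't completely eliminate the crackling. If I change to n=1.5, the sound is not distorted. If anyone could give me any pointers on how to get rid of the crackling distortion, or links to other volume modulation code, I would greatly appreciate it.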

Fra*_*kow 10

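This may not be 100% on topic, but maybe it's interesting for you anyway. If you don't need real-time processing, things can be made easier. Limiting and dynamic compression can be seen as applying a dynamic transfer function. This function simply maps input values to output values. A linear function then returns the original audio, and a "curved" function does compression or expansion. Applying a transfer function is as simple as: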

import numpy as np
from scipy.interpolate import interp1d
from scipy.io import wavfile

def apply_transfer(signal, transfer, interpolation='linear'):
    constant = np.linspace(-1, 1, len(transfer))
    interpolator = interp1d(constant, transfer, interpolation)
    return interpolator(signal)
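
As a quick sanity check of the "linear function returns the original audio" claim, here is a minimal sketch. It swaps scipy's interp1d for np.interp (linear interpolation only), and the name apply_transfer_np is a hypothetical stand-in for illustration, not part of the code above.

```python
import numpy as np

def apply_transfer_np(signal, transfer):
    # Hypothetical numpy-only variant of apply_transfer (linear interpolation only).
    # The transfer table is assumed to be sampled on an even grid over [-1, 1].
    grid = np.linspace(-1, 1, len(transfer))
    return np.interp(signal, grid, transfer)

# A few test samples, already scaled to [-1, 1]
signal = np.array([-1.0, -0.5, 0.0, 0.25, 1.0])

# Identity (linear) transfer function: output value equals input value
identity = np.linspace(-1, 1, 1000)

out = apply_transfer_np(signal, identity)
print(np.allclose(out, signal))
```

Any curve other than the straight line changes the dynamics, which is what the limiter and compressor do.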

Limiting or compression is then just a matter of choosing a different transfer function:

# hard limiting
def limiter(x, treshold=0.8):
    transfer_len = 1000
    transfer = np.concatenate([ np.repeat(-1, int(((1-treshold)/2)*transfer_len)),
                                np.linspace(-1, 1, int(treshold*transfer_len)),
                                np.repeat(1, int(((1-treshold)/2)*transfer_len)) ])
    return apply_transfer(x, transfer)

# smooth compression: if factor is small, it's nearly linear; the bigger it is,
# the stronger the compression
def arctan_compressor(x, factor=2):
    constant = np.linspace(-1, 1, 1000)
    transfer = np.arctan(factor * constant)
    transfer /= np.abs(transfer).max()
    return apply_transfer(x, transfer)
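
To see the difference between the two curves without loading any audio, the following sketch evaluates both on a ramp. It makes two assumptions: the hard limiter with treshold=0.8 is approximated as "gain of 1.25, then clip", and the arctan curve is written in closed form instead of through the 1000-point table. The name arctan_curve is made up for this illustration.

```python
import numpy as np

def arctan_curve(x, factor=2.0):
    # Closed form of the normalized arctan transfer function:
    # arctan(factor * x) divided by its value at x = 1, so x = +/-1 maps to +/-1.
    return np.arctan(factor * x) / np.arctan(factor)

# Input ramp covering the full normalized range
x = np.linspace(-1, 1, 41)

# Approximation of the hard limiter: linear up to the threshold, flat beyond it
hard = np.clip(1.25 * x, -1, 1)

# Smooth compression: no flat region and no corner at a threshold
soft = arctan_curve(x, factor=2.0)

print(np.min(np.diff(hard)))  # smallest step of the hard curve
print(np.min(np.diff(soft)))  # smallest step of the soft curve
```

The flat stretch of the hard curve is where the waveform gets its tops cut off, while the arctan curve keeps rising all the way to full scale.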

This example assumes a 16-bit mono wav file as input:

sr, x = wavfile.read("input.wav")
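x = x.astype(np.float64) # convert first: avoids integer division on Python 2 and int16 overflow in abs()
x = x / np.abs(x).max() # scale x between -1 and 1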

x2 = limiter(x)
x2 = np.int16(x2 * 32767)
wavfile.write("output_limit.wav", sr, x2)

x3 = arctan_compressor(x)
x3 = np.int16(x3 * 32767)
wavfile.write("output_comp.wav", sr, x3)

Maybe this clean offline code helps you benchmark your real-time code.
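
To tie this back to the n=2 case from the question, here is one possible sketch, not a definitive recipe. It assumes int16 samples and uses an arctan soft clipper scaled so that quiet samples get a gain of about n while loud samples approach full scale smoothly. The function name amplify_soft and the scaling constants are my own choices for illustration.

```python
import numpy as np

def amplify_soft(samples, n=2.0):
    # Hypothetical helper: amplify int16 samples by roughly n with soft saturation.
    x = samples.astype(np.float64) / 32768.0  # int16 -> float in [-1, 1)
    # Arctan soft clipper: slope n at zero, magnitude always below 1
    y = (2.0 / np.pi) * np.arctan((np.pi / 2.0) * n * x)
    return np.round(y * 32767.0).astype(np.int16)

quiet = np.array([100, -100], dtype=np.int16)
loud = np.array([30000, -30000, 32767, -32768], dtype=np.int16)

quiet_out = amplify_soft(quiet)
loud_out = amplify_soft(loud)
print(quiet_out)
print(loud_out)
```

Loud passages are still reshaped, so this is not distortion-free, but the output never flattens abruptly at a threshold and never exceeds the int16 range.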