吖奇说 HUō · 7 · Tags: audio, volume, ios, avaudioengine, avaudiopcmbuffer
I know almost nothing about signal processing. I'm currently trying to implement a function in Swift that fires an event when the sound pressure level increases (for example, when a person screams).
I'm tapping into AVAudioEngine's input node with a callback like this:
let recordingFormat = inputNode.outputFormat(forBus: 0)
inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) {
    (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
    let arraySize = Int(buffer.frameLength)
    let samples = Array(UnsafeBufferPointer(start: buffer.floatChannelData![0], count: arraySize))
    // do something with samples
    let volume = 20 * log10(samples.reduce(0) { $0 + $1 } / Float(arraySize))
    if !volume.isNaN {
        print("this is the current volume: \(volume)")
    }
}
After converting the buffer to a float array, I try to roughly estimate the sound pressure level by taking the mean.
But this gives me values that fluctuate a lot, even when the iPad is just sitting in a quiet room:
this is the current volume: -123.971
this is the current volume: -119.698
this is the current volume: -147.053
this is the current volume: -119.749
this is the current volume: -118.815
this is the current volume: -123.26
this is the current volume: -118.953
this is the current volume: -117.273
this is the current volume: -116.869
this is the current volume: -110.633
this is the current volume: -130.988
this is the current volume: -119.475
this is the current volume: -116.422
this is the current volume: -158.268
this is the current volume: -118.933
If I clap my hands near the microphone, the value does increase substantially.
So I could do something like this: first compute the mean of these values during a preparation phase, and then check whether the value increases significantly during the event-triggering phase:
if !volume.isNaN {
    if isInThePreparingPhase {
        print("this is the current volume: \(volume)")
        volumeSum += volume
        volumeCount += 1
    } else if isInTheEventTriggeringPhase {
        if volume > meanVolume {
            // triggers an event
        }
    }
}
where meanVolume is computed at the transition from the preparation phase to the event-triggering phase: meanVolume = volumeSum / Float(volumeCount)
....
However, if I play loud music next to the microphone, the value doesn't seem to increase noticeably. And in rare cases, volume is even greater than meanVolume when there is no audible increase in the ambient volume.
So what is the correct way to extract the sound pressure level from an AVAudioPCMBuffer?
Wikipedia gives this formula:

Lp = 20 · log10(p / p0) dB

where p is the root-mean-square sound pressure and p0 is the reference sound pressure.
But I don't know what the float values in AVAudioPCMBuffer.floatChannelData represent. The Apple documentation only says:

The buffer's audio samples as floating point values.

How should I work with them?
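For reference, the samples in floatChannelData are linear amplitudes, normally in [-1, 1], so applying the Wikipedia formula with full scale (1.0) as the reference gives a level in dB relative to full scale (dBFS), not an absolute SPL. A minimal sketch of that computation (the clamp value is an assumption to avoid log10(0)):

```swift
import Foundation

// Compute the RMS of the samples, then convert to dB relative to
// full scale. Samples are assumed to be linear amplitude in [-1, 1].
func rmsLevel(samples: [Float]) -> Float {
    guard !samples.isEmpty else { return -Float.infinity }
    let sumOfSquares = samples.reduce(0) { $0 + $1 * $1 }
    let rms = sqrt(sumOfSquares / Float(samples.count))
    // Clamp to a small floor so silence yields a finite dB value
    // instead of -infinity / NaN.
    return 20 * log10(max(rms, 1e-7))
}
```

Note that this squares each sample before averaging; averaging the raw samples (as in the question's code) lets positive and negative values cancel, which is one reason for the wild fluctuations.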
小智 6
Thanks to @teadrinker's reply, I finally found the solution to this problem. Here is the Swift code that computes the volume from the AVAudioPCMBuffer input:
private func getVolume(from buffer: AVAudioPCMBuffer, bufferSize: Int) -> Float {
    guard let channelData = buffer.floatChannelData?[0] else {
        return 0
    }

    let channelDataArray = Array(UnsafeBufferPointer(start: channelData, count: bufferSize))

    var outEnvelope = [Float]()
    var envelopeState: Float = 0
    let envConstantAtk: Float = 0.16
    let envConstantDec: Float = 0.003

    for sample in channelDataArray {
        let rectified = abs(sample)

        if envelopeState < rectified {
            envelopeState += envConstantAtk * (rectified - envelopeState)
        } else {
            envelopeState += envConstantDec * (rectified - envelopeState)
        }
        outEnvelope.append(envelopeState)
    }

    // 0.015 is a noise threshold that suppresses the noise floor
    // picked up by the microphone
    if let maxVolume = outEnvelope.max(),
        maxVolume > Float(0.015) {
        return maxVolume
    } else {
        return 0.0
    }
}
I think the first step is to get the envelope of the sound. You could use simple averaging to calculate an envelope, but you need to add a rectification step (usually abs() or square(), to make all samples positive).
More commonly, a simple IIR filter is used instead of averaging, with different constants for attack and decay; here is a lab. Note that these constants depend on the sampling frequency. You can use this formula to calculate the constants:
1 - exp(-timePerSample*2/smoothingTime)
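A small Swift sketch of deriving attack/decay coefficients from the sample rate with that formula (the 1 ms attack and 100 ms decay smoothing times are example choices, not values from the answer):

```swift
import Foundation

// 1 - exp(-timePerSample * 2 / smoothingTime)
// Shorter smoothingTime -> larger coefficient -> faster response.
func envelopeCoefficient(sampleRate: Float, smoothingTime: Float) -> Float {
    let timePerSample = 1 / sampleRate
    return 1 - exp(-timePerSample * 2 / smoothingTime)
}

let atk = envelopeCoefficient(sampleRate: 44100, smoothingTime: 0.001) // fast attack
let dec = envelopeCoefficient(sampleRate: 44100, smoothingTime: 0.1)   // slow decay
```

Recomputing the coefficients this way keeps the envelope's behavior the same when the input format's sample rate changes, which hard-coded constants like 0.16 and 0.003 do not.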
When you have the envelope, you can smooth it with an additional filter, and then compare the two envelopes to find a sound that is louder than the base level; here's a more complete lab.
Note that detecting audio "events" can be quite tricky and unpredictable; make sure you have plenty of debugging aids!
Views: 1290