Reading a single channel from a multi-channel wav file

Question

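I need to extract the samples of a single channel from a wav file that can contain up to 12 channels (11.1 format). I know that in a normal stereo file the samples are interleaved, first left and then right, like so,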

[1st L] [1st R] [2nd L] [2nd R]...

So, to read the left channel I'd do this,

for (var i = 0; i < myByteArray.Length; i += (bitDepth / 8) * 2)
{
    // Get bytes and convert to actual samples.
}

And to get the right channel I'd simply start the loop at for (var i = (bitDepth / 8)....

But what order is used for files with more than 2 channels?

Answer 1

Sam*_*Sam 33

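Microsoft have created a standard that covers up to 18 channels. According to them, the wav file needs to have a special meta sub-chunk (under the "Extensible Format" section) that specifies a "channel mask" (dwChannelMask). This field is 4 bytes long (a uint) and contains one bit for each channel that is present, thereby indicating which of the 18 channels are used within the file.

The Master Channel Layout

Below is the MCL, that is, the order in which existing channels should be interleaved, along with the bit value for each channel. If a channel is not present, the next channel that is present will "drop down" into the place of the missing channel and take over its order number, but never its bit value. (Bit values are unique to each channel regardless of whether the channel is present),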

Order | Bit | Channel

 1.     0x1  Front Left
 2.     0x2  Front Right
 3.     0x4  Front Center
 4.     0x8  Low Frequency (LFE)
 5.    0x10  Back Left (Surround Back Left)
 6.    0x20  Back Right (Surround Back Right)
 7.    0x40  Front Left of Center
 8.    0x80  Front Right of Center
 9.   0x100  Back Center
10.   0x200  Side Left (Surround Left)
11.   0x400  Side Right (Surround Right)
12.   0x800  Top Center
13.  0x1000  Top Front Left
14.  0x2000  Top Front Center
15.  0x4000  Top Front Right
16.  0x8000  Top Back Left
17. 0x10000  Top Back Center
18. 0x20000  Top Back Right

For example, if the channel mask is 0x63F (1599), this indicates that the file contains 8 channels (FL, FR, FC, LFE, BL, BR, SL & SR).
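
To make that example concrete, here is a minimal sketch that expands a mask into the channels it contains, in interleave order. The class name, method name and abbreviations are made up for illustration; only the bit positions come from the table above.

```csharp
using System;
using System.Collections.Generic;

public static class ChannelMaskDemo
{
    // Abbreviations in Master Channel Layout order; entry n belongs to bit (1 << n).
    // The abbreviations themselves are illustrative, not part of the standard.
    private static readonly string[] MclNames =
    {
        "FL", "FR", "FC", "LFE", "BL", "BR", "FLoC", "FRoC", "BC",
        "SL", "SR", "TC", "TFL", "TFC", "TFR", "TBL", "TBC", "TBR"
    };

    // Returns the channels present in the mask, in the order they are interleaved.
    public static List<string> Decode(uint mask)
    {
        var present = new List<string>();

        for (var bit = 0; bit < MclNames.Length; bit++)
        {
            if ((mask & (1u << bit)) != 0)
            {
                present.Add(MclNames[bit]);
            }
        }

        return present;
    }
}
```

Calling ChannelMaskDemo.Decode(0x63F) yields FL, FR, FC, LFE, BL, BR, SL, SR, which is the 8-channel layout described above.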

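Reading and checking the Channel Mask

To get the mask, you'll need to read the 40th, 41st, 42nd and 43rd bytes (assuming a base index of 0, and that you're reading a standard wav header). For example,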

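// Requires: using System; using System.IO;
var bytes = new byte[50];

using (var stream = new FileStream("filepath...", FileMode.Open))
{
    // The first 50 bytes cover a standard extensible header up to and including dwChannelMask.
    stream.Read(bytes, 0, 50);
}

// dwChannelMask is stored little-endian in bytes 40 to 43.
var speakerMask = BitConverter.ToUInt32(bytes, 40);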
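
The fixed offset of 40 only holds when the "fmt " chunk comes straight after the RIFF header. As a hedged alternative, here is a sketch that walks the chunks instead. The class and method names are hypothetical, it assumes a seekable stream, and a truncated file may throw EndOfStreamException.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavHeaderSketch
{
    // Format tag used by WAVE_FORMAT_EXTENSIBLE.
    private const ushort WaveFormatExtensible = 0xFFFE;

    // Returns dwChannelMask, or null when no extensible "fmt " chunk is found.
    public static uint? ReadChannelMask(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.ASCII, true))
        {
            if (ReadTag(reader) != "RIFF") return null;
            reader.ReadUInt32(); // overall RIFF size, not needed here
            if (ReadTag(reader) != "WAVE") return null;

            while (stream.Position + 8 <= stream.Length)
            {
                var id = ReadTag(reader);
                var size = reader.ReadUInt32();

                if (id != "fmt ")
                {
                    // Skip this chunk; chunks are padded to an even length.
                    stream.Seek(size + (size & 1), SeekOrigin.Current);
                    continue;
                }

                // The extensible layout needs 40 bytes of format data.
                if (size < 40) return null;
                if (reader.ReadUInt16() != WaveFormatExtensible) return null;

                // Skip nChannels through wValidBitsPerSample (18 bytes).
                reader.ReadBytes(18);
                return reader.ReadUInt32(); // BinaryReader always reads little-endian
            }

            return null;
        }
    }

    private static string ReadTag(BinaryReader reader)
    {
        return Encoding.ASCII.GetString(reader.ReadBytes(4));
    }
}
```

A null result means you are in the "mask doesn't exist" situation and have to guess the layout yourself.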

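Then you need to check whether the desired channel actually exists. To do this, I'd suggest creating an enum (marked with [Flags]) that contains all the channels (and their respective values).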

[Flags]
public enum Channels : uint
{
    FrontLeft = 0x1,
    FrontRight = 0x2,
    FrontCenter = 0x4,
    Lfe = 0x8,
    BackLeft = 0x10,
    BackRight = 0x20,
    FrontLeftOfCenter = 0x40,
    FrontRightOfCenter = 0x80,
    BackCenter = 0x100,
    SideLeft = 0x200,
    SideRight = 0x400,
    TopCenter = 0x800,
    TopFrontLeft = 0x1000,
    TopFrontCenter = 0x2000,
    TopFrontRight = 0x4000,
    TopBackLeft = 0x8000,
    TopBackCenter = 0x10000,
    TopBackRight = 0x20000
}

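And then finally check whether the channel is present.

What if the Channel Mask doesn't exist?

Create one yourself! Based on the file's channel count you will either have to guess which channels are used, or just blindly follow the MCL. In the code snippet below we're doing a bit of both,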
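
The check itself is a single bitwise AND. A minimal sketch, using plain uint bit values so it stands alone (the class and method names are hypothetical):

```csharp
using System;

public static class ChannelCheckSketch
{
    // True when every bit of `channel` is set in `speakerMask`.
    // A zero `channel` is treated as "not present" rather than trivially true.
    public static bool HasChannel(uint speakerMask, uint channel)
    {
        return channel != 0 && (speakerMask & channel) == channel;
    }
}
```

For instance, HasChannel(0x63F, 0x200) is true (Side Left is present) while HasChannel(0x63F, 0x40) is false (Front Left of Center is missing). With the [Flags] enum above, ((Channels)speakerMask).HasFlag(Channels.SideLeft) should express the same test.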

public static uint GetSpeakerMask(int channelCount)
{
    // Assume setup of: FL, FR, FC, LFE, BL, BR, SL & SR. Otherwise MCL will use: FL, FR, FC, LFE, BL, BR, FLoC & FRoC.
    if (channelCount == 8)
    {
        return 0x63F; 
    }

    // Otherwise follow MCL.
    uint mask = 0;
    var channels = Enum.GetValues(typeof(Channels)).Cast<uint>().ToArray();

    for (var i = 0; i < channelCount; i++)
    {
        mask += channels[i];
    }

    return mask;
}

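Extracting the samples

To actually read the samples of a particular channel, you do exactly the same as if the file were stereo, that is, you increment your loop's counter by the frame size (in bytes).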

frameSize = (bitDepth / 8) * channelCount

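You also need to offset the loop's start index. This is where things get more complicated, as you have to start reading data at the channel's order number, based on the existing channels, multiplied by the byte depth.

What do I mean by "based on the existing channels"? Well, you need to reassign the order numbers of the existing channels starting from 1, incrementing the order for each channel that is present. For example, the channel mask 0x63F indicates that the FL, FR, FC, LFE, BL, BR, SL & SR channels are used, so the new order numbers for the respective channels would look like this (note that the bit values are not changed, and never should be),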

Order | Bit | Channel

 1.     0x1  Front Left
 2.     0x2  Front Right
 3.     0x4  Front Center
 4.     0x8  Low Frequency (LFE)
 5.    0x10  Back Left (Surround Back Left)
 6.    0x20  Back Right (Surround Back Right)
 7.   0x200  Side Left (Surround Left)
 8.   0x400  Side Right (Surround Right)

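You'll notice that FLoC, FRoC & BC are all missing, so the SL & SR channels "drop down" into the next lowest available order numbers, rather than using their default order (10, 11).

Summing up

So, to read the bytes of a single channel you'd need to do something similar to this,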
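
One way to compute the reassigned order number is to count how many mask bits sit below the channel's own bit. The following is only a sketch with hypothetical names; it assumes channelBit has exactly one bit set and returns a zero-based index (so order number 7 becomes index 6).

```csharp
using System;

public static class ChannelOffsetSketch
{
    // Zero-based position of `channelBit` among the channels present in `mask`,
    // or -1 when the channel is absent. Assumes `channelBit` is a single bit.
    public static int InterleaveIndex(uint mask, uint channelBit)
    {
        if ((mask & channelBit) == 0)
        {
            return -1;
        }

        // Keep only the bits of channels that come before this one in the MCL.
        var lower = mask & (channelBit - 1);
        var count = 0;

        while (lower != 0)
        {
            lower &= lower - 1; // clears the lowest set bit
            count++;
        }

        return count;
    }

    // Byte offset of the channel's sample inside one frame, or -1 when absent.
    public static int ByteOffsetInFrame(uint mask, uint channelBit, int bitDepth)
    {
        var index = InterleaveIndex(mask, channelBit);
        return index < 0 ? -1 : index * (bitDepth / 8);
    }
}
```

With the mask 0x63F, Side Left (0x200) gets index 6 and Side Right (0x400) gets index 7, so in a 16-bit file Side Right starts 14 bytes into each frame.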

// This code will only return the bytes of a particular channel. It's up to you to convert the bytes to actual samples.
// Assumes audioBytes holds the complete interleaved sample data (no header) and that sampleEndIndex is exclusive.
public static byte[] GetChannelBytes(byte[] audioBytes, uint speakerMask, Channels channelToRead, int bitDepth, uint sampleStartIndex, uint sampleEndIndex)
{
    var channels = FindExistingChannels(speakerMask);
    var ch = GetChannelNumber(channelToRead, channels);

    // GetChannelNumber returns -1 when the channel is missing from the mask.
    if (ch < 0)
    {
        throw new ArgumentException("The requested channel is not present in the speaker mask.", nameof(channelToRead));
    }

    var byteDepth = bitDepth / 8;
    var chOffset = ch * byteDepth;
    var frameBytes = byteDepth * channels.Length;
    var startByteIncIndex = sampleStartIndex * byteDepth * channels.Length;
    var endByteIncIndex = sampleEndIndex * byteDepth * channels.Length;
    var outputBytesCount = endByteIncIndex - startByteIncIndex;
    var outputBytes = new byte[outputBytesCount / channels.Length];
    var i = 0;

    // Move from the start of the first frame to the requested channel within it.
    startByteIncIndex += chOffset;

    for (var j = startByteIncIndex; j < endByteIncIndex; j += frameBytes)
    {
        for (var k = j; k < j + byteDepth; k++)
        {
            // k is already an absolute index into audioBytes.
            outputBytes[i] = audioBytes[k];
            i++;
        }
    }

    return outputBytes;
}

private static Channels[] FindExistingChannels(uint speakerMask)
{
    var foundChannels = new List<Channels>();

    foreach (var ch in Enum.GetValues(typeof(Channels)))
    {
        if ((speakerMask & (uint)ch) == (uint)ch)
        {
            foundChannels.Add((Channels)ch);
        }
    }

    return foundChannels.ToArray();
}

private static int GetChannelNumber(Channels input, Channels[] existingChannels)
{
    for (var i = 0; i < existingChannels.Length; i++)
    {
        if (existingChannels[i] == input)
        {
            return i;
        }
    }

    return -1;
}
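
The code above leaves the conversion from bytes to samples to you. As a rough sketch for the common case of 16-bit PCM only (other bit depths and float formats need their own handling), with a hypothetical class and method name:

```csharp
using System;

public static class SampleConversionSketch
{
    // Converts little-endian 16-bit PCM bytes into signed samples.
    public static short[] ToInt16Samples(byte[] channelBytes)
    {
        if (channelBytes.Length % 2 != 0)
        {
            throw new ArgumentException("Expected an even number of bytes.", nameof(channelBytes));
        }

        var samples = new short[channelBytes.Length / 2];

        for (var i = 0; i < samples.Length; i++)
        {
            // Low byte first, then high byte; unchecked so 0x8000..0xFFFF wrap to negative values.
            samples[i] = unchecked((short)(channelBytes[2 * i] | (channelBytes[2 * i + 1] << 8)));
        }

        return samples;
    }
}
```

Assembling the value by hand keeps the result independent of the machine's byte order, which BitConverter.ToInt16 does not guarantee.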

Fun fact: you're about to raise the highest bounty ever awarded on [SO] to 700 ... https://data.stackexchange.com/stackoverflow/query/5400/bounty-award-counts (3 upvotes)
