Efficiently splitting large audio files in R

Question

Jot*_*ota 5 audio performance file-io split r

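I previously asked this question about splitting audio files. The answer I got from @Jean V. Adams works relatively well for small sound objects (drawback: the input is stereo, but the output is mono rather than stereo):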

library(seewave)

# your audio file (using example file from seewave package)
data(tico)
audio <- tico # this is an S4 class object
# the frequency of your audio file
freq <- 22050
# the length and duration of your audio file
totlen <- length(audio)
totsec <- totlen/freq

# the duration that you want to chop the file into
seglen <- 0.5

# defining the break points
breaks <- unique(c(seq(0, totsec, seglen), totsec))
index <- 1:(length(breaks)-1)
# a list of all the segments
subsamps <- lapply(index, function(i) cutw(audio, f=freq, from=breaks[i], to=breaks[i+1]))

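I applied this solution to one of the roughly 300 files I am preparing for analysis (the file is about 150 MB). My computer worked on it for more than 5 hours, but I ended up closing the session before it finished.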

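Does anyone have ideas or solutions for efficiently splitting large audio files (specifically, S4 class Wave objects) into smaller pieces using R? I want to drastically reduce the time needed to produce smaller files from these larger ones, and I would like to use R. However, if I cannot get R to do the task efficiently, I would appreciate suggestions of other tools for the job. The example data above are mono, but my data are stereo. The example data can be made stereo using: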

tico@stereo <- TRUE
tico@right <- tico@left

Update

I identified another solution that builds on the first one:

lapply(index, function(i) audio[(breaks[i]*freq):(breaks[i+1]*freq)])

Comparing the performance of the three solutions:

# Solution suggested by @Jean V. Adams
system.time(replicate(100,lapply(index, function(i) cutw(audio, f=freq, from=breaks[i], to=breaks[i+1], output="Wave"))))
user  system elapsed 
1.19    0.00    1.19 
# my modification of the previous solution
system.time(replicate(100,lapply(index, function(i) audio[(breaks[i]*freq):(breaks[i+1]*freq)])))
user  system elapsed 
0.86    0.00    0.85 

# solution suggested by @CarlWitthoft 
audiomod <- audio[(freq*breaks[1]):(freq*breaks[length(breaks)-1])] # remove unequal part at end
system.time(replicate(100,matrix(audiomod@left,ncol=length(breaks))))+
system.time(replicate(100,matrix(audiomod@right,ncol=length(breaks))))
user  system elapsed 
0.25    0.00    0.26

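The approach using indexing (i.e. [) seems to be faster (3-4 times). @CarlWitthoft's solution is faster still; the drawback is that it puts the data into matrices rather than into multiple Wave objects, which is what I will be saving with writeWave. Presumably, converting from the matrix format to separate Wave objects will be relatively simple, provided I understand correctly how to create this type of S4 object. Is there any further room for improvement?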
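
One way to keep the speed of [ indexing while avoiding fractional or overlapping indices is to compute the segment boundaries in whole samples. The following is only a sketch under stated assumptions: `split_wave` is a hypothetical helper (not part of tuneR or seewave), and the synthetic stereo tone stands in for a real recording.

```r
library(tuneR)

# Synthetic 1.8 s stereo signal standing in for a real recording (assumption).
samp_rate <- 22050
n_samples <- round(1.8 * samp_rate)
tone <- round(10000 * sin(2 * pi * 440 * seq_len(n_samples) / samp_rate))
audio <- Wave(left = tone, right = tone, samp.rate = samp_rate, bit = 16)

# Hypothetical helper: split a Wave into consecutive, non-overlapping pieces
# of `seglen` seconds; the final piece may be shorter than the others.
split_wave <- function(w, seglen) {
  n <- length(w@left)
  seg_samples <- round(seglen * w@samp.rate)
  starts <- seq(1, n, by = seg_samples)
  ends <- pmin(starts + seg_samples - 1, n)
  # `[` on a Wave object subsets both channels, so stereo input stays stereo
  Map(function(s, e) w[s:e], starts, ends)
}

segs <- split_wave(audio, seglen = 0.5)
```

Each element of `segs` is itself a Wave object, so it can be passed to writeWave directly without an intermediate matrix.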

Answer 1

Jot*_*ota 5

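The approach I ended up using builds on the solutions offered by @CarlWitthoft and @JeanV.Adams. It is quite fast compared with the other techniques I tried, and it allowed me to split a large number of files in a matter of hours rather than days.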

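As an example, here is the whole process using a small Wave object (my current audio files are up to 150 MB in size, but in the future I may receive much larger files, i.e. sound files covering 12-24 hours of recording, where memory management will become more important):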

library(seewave)
library(tuneR)

data(tico)

# force to stereo
tico@stereo <- TRUE
tico@right <- tico@left    
audio <- tico # this is an S4 class object


# the frequency of your audio file
freq <- 22050
# the length and duration of your audio file
totlen <- length(audio)
totsec <- totlen/freq 

# the duration that you want to chop the file into (in seconds)
seglen <- 0.5

# defining the break points
breaks <- unique(c(seq(0, totsec, seglen), totsec))
index <- 1:(length(breaks)-1)

# the split: one column per full-length segment
leftmat<-matrix(audio@left, ncol=(length(breaks)-2), nrow=seglen*freq) 
rightmat<-matrix(audio@right, ncol=(length(breaks)-2), nrow=seglen*freq)
# matrix() warns because the data do not fill the matrix evenly; the warnings
# are nothing to worry about here, the leftover samples are handled below

# convert to list of Wave objects.
subsamps0409_180629 <- lapply(1:ncol(leftmat), function(x)Wave(left=leftmat[,x],
         right=rightmat[,x], samp.rate=audio@samp.rate,bit=audio@bit)) 


# get the last part of the audio file.  the part that is < seglen
# (+1 so it does not repeat the final sample of the previous segment)
lastbitleft <- audio@left[(breaks[length(breaks)-1]*freq+1):length(audio)]
lastbitright <- audio@right[(breaks[length(breaks)-1]*freq+1):length(audio)]

# convert and add the last bit to the list of Wave objects
subsamps0409_180629[[length(subsamps0409_180629)+1]] <- 
     Wave(left=lastbitleft, right=lastbitright, samp.rate=audio@samp.rate, bit=audio@bit)

This wasn't part of my original question, but my ultimate goal was to save these new, smaller Wave objects.

# finally, save the Wave objects
setwd("C:/Users/Whatever/Wave_object_folder")

# build the file names first, because rm() below removes `breaks`
filenames <- paste("audio","_split",1:(length(breaks)-1),".wav",sep="")

# I had some memory management issues on my computer when doing this
# process with large (~ 130-150 MB) audio files so I used rm() and gc(),
# which seemed to resolve the problems I had with allocating memory.
rm("breaks","audio","freq","index","lastbitleft","lastbitright","leftmat",
  "rightmat","seglen","totlen","totsec")

gc()

# Save the files
sapply(1:length(subsamps0409_180629),
       function(x)writeWave(subsamps0409_180629[[x]], 
       filename=filenames[x]))

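The only real drawback here is that the output files are quite large. For example, I put in a 130 MB file and split it into 18 files, each about 50 MB. I think this is because my input file is .mp3 and the output is .wav. I'm posting this answer to my own question to wrap up the problem with the full solution I used, but other answers are welcome, and I will take the time to look at each one and evaluate what it offers. I am sure there are better ways to accomplish this task, and methods that cope better with very large audio files. In solving this problem, I have barely scratched the surface of memory management.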
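
For recordings too large to hold comfortably in memory, one option is to avoid loading the whole file at all: `readWave()` in tuneR accepts `from`, `to` and `units`, so each segment can be read from disk and written straight back out. The sketch below assumes the input is already a .wav file (an .mp3 would first need converting); the temporary file, the synthetic tone and the output file names are made up for illustration.

```r
library(tuneR)

# Create a small stereo WAV file to stand in for a large recording (assumption).
samp_rate <- 22050
n_samples <- round(1.8 * samp_rate)
tone <- round(10000 * sin(2 * pi * 440 * seq_len(n_samples) / samp_rate))
infile <- tempfile(fileext = ".wav")
writeWave(Wave(left = tone, right = tone, samp.rate = samp_rate, bit = 16), infile)

# Read only the header to get the sample rate and the total number of samples.
hdr <- readWave(infile, header = TRUE)
seg_samples <- round(0.5 * hdr$sample.rate)   # 0.5 s segments
starts <- seq(1, hdr$samples, by = seg_samples)
ends <- pmin(starts + seg_samples - 1, hdr$samples)

# Read one segment at a time and write it out immediately, so that only a
# single segment is held in memory at any point.
outfiles <- file.path(tempdir(), paste0("audio_split", seq_along(starts), ".wav"))
for (k in seq_along(starts)) {
  seg <- readWave(infile, from = starts[k], to = ends[k], units = "samples")
  writeWave(seg, outfiles[k])
}
```

The output size is not affected by this approach: uncompressed PCM .wav stores every sample, so the pieces are expected to be much larger in total than a compressed .mp3 source.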

Archived:	11 years, 10 months ago
Views:	2727
Last activity:	11 years, 10 months ago