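I have multiple CSV files that have 4 common characters in their names. I want to know how I can process together the files that share the same common characters. For example, "AM-25" is common to the names of 3 CSV files, and "BA-35" is common to the names of another 2.
The files look like AM-25.myfiles.2000.csv, AM-25.myfiles.2001.csv, AM-25.myfiles.2002.csv, BA-35.myfiles.2000.csv, BA-35.myfiles.2001.csv. I use this to read all the files: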
files <- list.files(path=".", pattern="xyz+.csv", all.files = FALSE,full.names=TRUE )
Are you looking for something like this?
do.call(rbind, lapply(list.files(path=".", pattern="AM-25"), read.table, header=TRUE, sep=","))
This combines the data frames read from the CSV files whose names contain "AM-25". The read.table arguments may need to differ depending on your CSV files.
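As a rough sketch of the same idea wrapped in a function: the name `read_prefix` and its arguments are made up for illustration, and it assumes comma-separated files with a header row and identical columns. It matches the prefix literally at the start of the file name rather than anywhere in it.

```r
# Hypothetical helper: read and stack all CSV files whose names start with `prefix`.
read_prefix <- function(prefix, dir = ".") {
  # List every file ending in .csv, keeping the directory in the path
  all_csv <- list.files(path = dir, pattern = "\\.csv$", full.names = TRUE)
  # startsWith compares literally, so characters such as "." in the prefix are not treated as regex
  matched <- all_csv[startsWith(basename(all_csv), prefix)]
  # read.csv is read.table with header = TRUE and sep = "," as defaults
  do.call(rbind, lapply(matched, read.csv))
}
```

Calling `read_prefix("AM-25")` would then return one data frame for that prefix, or NULL if no file matches.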
Edit
I hope this works for the case where you don't know all the possible five-character file name prefixes in the directory:
## Get the distinct five-character prefixes of all CSV files in directory "."
file.prefixes <- unique(substr(list.files(path=".", pattern="\\.csv$"), 1, 5))
## Group the matching file names by prefix into a list
file.list <- lapply(file.prefixes, function(x) list.files(path=".", pattern=paste("^", x, ".*\\.csv$", sep="")))
names(file.list) <- file.prefixes ## just for convenience
## Parse all CSV files in file.list; the result is a list of lists holding the tables for each prefix
tables <- lapply(file.list, function(filenames) lapply(filenames, function(file) read.table(file, header=TRUE, sep=",")))
## For each prefix, rbind the tables. The result is a list of length length(file.prefixes),
## each element a data frame with the combined data from the files that match the prefix
joined.tables <- lapply(tables, function(t) do.call(rbind, t))
## Save the tables to files (comma-separated, without row names)
## Note: the output lands in the same directory, so a second run would also pick up these files
for (prefix in names(joined.tables)) write.csv(joined.tables[[prefix]], paste(prefix, ".csv", sep=""), row.names=FALSE)
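The grouping step can also be written more compactly with `split()`. This is only a sketch under the same assumptions (comma-separated files with a header row, identical columns within each group); the function name `combine_by_prefix` and its parameters `dir` and `n` are invented here for illustration.

```r
# Hypothetical helper: stack CSV files that share the same first `n` characters in their names.
combine_by_prefix <- function(dir = ".", n = 5) {
  paths <- list.files(path = dir, pattern = "\\.csv$", full.names = TRUE)
  # split() groups the paths by prefix and names each group after its prefix
  groups <- split(paths, substr(basename(paths), 1, n))
  # Read every file in a group and rbind the resulting data frames
  lapply(groups, function(fs) do.call(rbind, lapply(fs, read.csv)))
}
```

The result is a named list with one data frame per prefix, so `combine_by_prefix()[["AM-25"]]` would hold the rows from all files whose names start with AM-25.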