I'm trying to parse some CSV files using awk.
The CSV file I'm working with looks like this:
fnName,minAccessTime,maxAccessTime
getInfo,300,600
getStage,600,800
getStage,600,800
getInfo,250,620
getInfo,200,700
getStage,700,1000
getInfo,280,600
I need to find the minimum, maximum, and mean of columns 2 and 3, both across all the data and for each individual function.
I realize you aren't looking for a non-awk solution, but I thought I'd share some R code to show how seamless summarizing the data can be.
# read in data
awk <- read.table(textConnection("fnName,minAccessTime,maxAccessTime
getInfo,300,600
getStage,600,800
getStage,600,800
getInfo,250,620
getInfo,200,700
getStage,700,1000
getInfo,280,600"), header = TRUE, sep = ",")
# split according to the function
awk.split <- split(awk, awk$fnName)
# for each function, calculate full summary for columns 2 and 3
lapply(X = awk.split, FUN = function(x) {
  summary(x[2:3])
})
Result:
$getInfo
 minAccessTime   maxAccessTime
 Min.   :200.0   Min.   :600
 1st Qu.:237.5   1st Qu.:600
 Median :265.0   Median :610
 Mean   :257.5   Mean   :630
 3rd Qu.:285.0   3rd Qu.:640
 Max.   :300.0   Max.   :700

$getStage
 minAccessTime   maxAccessTime
 Min.   :600.0   Min.   : 800.0
 1st Qu.:600.0   1st Qu.: 800.0
 Median :600.0   Median : 800.0
 Mean   :633.3   Mean   : 866.7
 3rd Qu.:650.0   3rd Qu.: 900.0
 Max.   :700.0   Max.   :1000.0
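Since the question asked for awk, here is a rough awk sketch of the same idea. It tracks the minimum, maximum, and mean of columns 2 and 3 for each function, plus an overall group. The function name `summarize`, the group label `ALL`, and the `group,column,min,max,mean` output layout are my own choices for illustration, not anything from the question. It assumes a header row, plain comma-separated fields with no quoted commas, and no function actually named `ALL`.

```shell
# Hypothetical helper: reads the CSV from stdin (or from files given as arguments)
# and prints one line per (group, column): group,column,min,max,mean
summarize() {
  awk -F, '
    # Fold one value into the running min, max, sum, and count for a key.
    function update(group, col, val,    k) {
      k = group SUBSEP col
      if (!(k in n) || val < min[k]) min[k] = val
      if (!(k in n) || val > max[k]) max[k] = val
      sum[k] += val
      n[k]++
    }
    # Header row: remember the column names for the report.
    NR == 1 { for (c = 2; c <= 3; c++) name[c] = $c; next }
    # Data rows: update the per-function group and the overall "ALL" group.
    {
      for (c = 2; c <= 3; c++) {
        update($1, c, $c + 0)
        update("ALL", c, $c + 0)
      }
    }
    END {
      for (k in n) {
        split(k, part, SUBSEP)
        printf "%s,%s,%s,%s,%.1f\n", part[1], name[part[2]], min[k], max[k], sum[k] / n[k]
      }
    }
  ' "$@" | sort   # awk array order is unspecified, so sort for stable output
}

# Usage (data.csv is a placeholder file name):
#   summarize data.csv
```

For the sample data this should give, among other lines, `getInfo,minAccessTime,200,300,257.5` and `ALL,maxAccessTime,600,1000,731.4`. The getInfo line agrees with the R summary above. Quartiles and medians are left out because they need the values sorted per group, which is more awkward in plain awk.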