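My goal is to print several values computed from a .csv file, and I am looking for a way to do it with the shortest possible script run time.
For example, I have a file named "test.csv" containing the following values: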
0,1673466134,875601111928832,3336977422,22610058C2740,2020-06-03,19:00:01,103,456123489478512
0,6987507655,226102200333225,2312147777,226102E1858F0,2020-06-02,19:00:04,102,112323548998726
0,7891328975,250423212127644,7421354899,22610058C5350,2020-06-01,19:00:00,103,123123489784238
1,1324654889,784502311776287,4778994563,22610058C351E,2020-06-09,19:00:01,102,489123478941324
0,1231324474,247122410577385,1232498779,22610058C53A0,2020-06-07,19:00:00,104,123498715234789
1,4471222598,226912478523771,4123487987,226102C242C40,2020-06-04,19:00:00,103,789123418971354
I need to print the following values:
For example, to count all rows where the first column is "1", I would do this:
cat test.csv | awk -F ',' '{print $1}' | awk '/^1/' | wc -l
For example, to sum all values in column 8 where column 1 = 1:
cat test.csv | awk -F ',' '{print $1,$8}' | awk '/^1/' | awk '{sum+=$2} END {print sum}'
The list goes on. I have about 11 commands to run like the ones above. My goal is to put all of them in a single script file and have them execute as quickly as possible.
I wrote a script that looks like this:
#!/bin/bash
while IFS=, read col_1 col_2 col_3 col_4 col_5 col_6 col_7 col_8 col_9
do
echo "No of lines containing 0 on the 1st column: "
awk -F ',' '{print $1}' | awk '/^0/' | wc -l
echo "No of lines containing 1 on the 1st column:"
awk -F ',' '{print $1}' | awk '/^1/' | wc -l
done < test.csv
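The problem I have is that after the first command executes, the second command shows "0" no matter what I do.
Can someone help me solve this? Thanks!
OK, first of all, you don't want to do this. awk is orders of magnitude faster than the shell, so there is no benefit in converting an awk script into a shell script! Forget the shell and just do everything in awk. Save this file as foo.awk: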
#!/bin/awk -f
BEGIN{
FS=","
}
{
if($1~/^0/){zeros++}
if($1~/^1/){ones++}
}
END{
printf "No of lines containing 0 on the 1st column: %d\n", zeros;
printf "No of lines containing 1 on the 1st column: %d\n", ones;
}
Make the file executable with chmod a+x foo.awk, then run it:
/path/to/foo.awk /path/to/test.csv
If I run it on your example data, I get:
$ foo.awk test.csv
No of lines containing 0 on the 1st column: 4
No of lines containing 1 on the 1st column: 2
To include the command from your second example, do this:
#!/bin/awk -f
BEGIN{
FS=","
}
{
if($1~/^0/){zeros++}
if($1~/^1/){ones++; sum8+=$8}
}
END{
printf "No of lines containing 0 on the 1st column: %d\n", zeros;
printf "No of lines containing 1 on the 1st column: %d\n", ones;
printf "Sum of all 8th fields where the 1st field starts with 1: %d\n", sum8
}
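As a quick sanity check, here is a minimal sketch that recreates the sample data from the question in a temporary file and runs the same three rules as a one-liner. The variable names csv and summary, and the compact output format, are only for illustration and are not part of the original script.

```shell
#!/bin/bash
# Recreate the sample rows from the question in a temporary file.
csv=$(mktemp)
cat > "$csv" <<'EOF'
0,1673466134,875601111928832,3336977422,22610058C2740,2020-06-03,19:00:01,103,456123489478512
0,6987507655,226102200333225,2312147777,226102E1858F0,2020-06-02,19:00:04,102,112323548998726
0,7891328975,250423212127644,7421354899,22610058C5350,2020-06-01,19:00:00,103,123123489784238
1,1324654889,784502311776287,4778994563,22610058C351E,2020-06-09,19:00:01,102,489123478941324
0,1231324474,247122410577385,1232498779,22610058C53A0,2020-06-07,19:00:00,104,123498715234789
1,4471222598,226912478523771,4123487987,226102C242C40,2020-06-04,19:00:00,103,789123418971354
EOF

# Single pass: count rows by first field and sum field 8 for rows starting with 1.
summary=$(awk -F, '
    $1 ~ /^0/ { zeros++ }
    $1 ~ /^1/ { ones++; sum8 += $8 }
    END { printf "zeros=%d ones=%d sum8=%d\n", zeros, ones, sum8 }
' "$csv")

echo "$summary"   # prints: zeros=4 ones=2 sum8=205
rm -f "$csv"
```

Note that /^1/ matches any first field that begins with 1, so a value such as 10 would also be counted. If that can occur in your data, a numeric comparison like $1 == 1 is probably what you want instead.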
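If for some reason you must use a shell script, have the shell script run awk and nothing else. Don't try to split the input in the shell; it is complicated and slow. Something like this is much better: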
#!/bin/bash
awk -F"," '($1~/^0/){zeros++}
($1~/^1/){ones++}
END{
printf "No of lines containing 0 on the 1st column: %d\n", zeros;
printf "No of lines containing 1 on the 1st column: %d\n", ones;
}' "$1"
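Since you mention having about 11 such commands, the same single-pass idea may scale better if you group by the value of the first column instead of writing one counter per value. The following is only a sketch under that assumption: the array names count and sum8 and the output format are made up for illustration, and the output is piped through sort because awk does not guarantee the order of a for (k in array) loop.

```shell
#!/bin/bash
# Recreate the sample rows from the question in a temporary file.
csv=$(mktemp)
cat > "$csv" <<'EOF'
0,1673466134,875601111928832,3336977422,22610058C2740,2020-06-03,19:00:01,103,456123489478512
0,6987507655,226102200333225,2312147777,226102E1858F0,2020-06-02,19:00:04,102,112323548998726
0,7891328975,250423212127644,7421354899,22610058C5350,2020-06-01,19:00:00,103,123123489784238
1,1324654889,784502311776287,4778994563,22610058C351E,2020-06-09,19:00:01,102,489123478941324
0,1231324474,247122410577385,1232498779,22610058C53A0,2020-06-07,19:00:00,104,123498715234789
1,4471222598,226912478523771,4123487987,226102C242C40,2020-06-04,19:00:00,103,789123418971354
EOF

# One pass: for every distinct value of field 1, keep a row count
# and a running sum of field 8 in associative arrays.
groups=$(awk -F, '
    { count[$1]++; sum8[$1] += $8 }
    END {
        for (k in count)
            printf "%s count=%d sum8=%d\n", k, count[k], sum8[k]
    }
' "$csv" | sort)

echo "$groups"
# prints:
# 0 count=4 sum8=412
# 1 count=2 sum8=205
rm -f "$csv"
```

The file is still read only once, no matter how many distinct values appear in the first column.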
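Finally, if you really want to keep these as separate commands, you can do something like this, but it will be much slower because it needs to read the file multiple times: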
#!/bin/bash
echo "No of lines containing 0 on the 1st column: "
awk -F ',' '{print $1}' "$1" | awk '/^0/' | wc -l
echo "No of lines containing 1 on the 1st column:"
awk -F ',' '{print $1}' "$1" | awk '/^1/' | wc -l
echo "Sum of all the 8th columns where the 1st column starts with 1:"
awk -F ',' '/^1/{sum+=$8} END {print sum}' "$1"
You would then make the file executable (chmod a+x /path/to/foo.sh) and run it like this:
/path/to/foo.sh /path/to/test.csv
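As for why your original loop printed 0 for the second command: the awk commands inside the while loop have no file argument, so they read from the loop's standard input, which is test.csv. read takes the first line, the first awk then consumes every remaining line, and the second awk finds nothing left to read. The sketch below reproduces this with three made-up input lines (the data and the variable names input and trace are hypothetical):

```shell
#!/bin/bash
# Three hypothetical lines stand in for test.csv.
input=$(printf '0,a\n1,b\n1,c\n')

trace=$(
    while IFS=, read -r first rest; do
        echo "read got: $first"
        # This awk inherits the loop's stdin and consumes all remaining lines.
        echo "first awk saw: $(awk 'END { print NR }') lines"
        # stdin is now at end of file, so this awk sees no input at all.
        echo "second awk saw: $(awk 'END { print NR }') lines"
    done <<< "$input"
)

echo "$trace"
# prints:
# read got: 0
# first awk saw: 2 lines
# second awk saw: 0 lines
```

The loop body runs only once, because the next read also hits end of file. This also means the first count in your script was off: it never saw the line that read had already taken.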