I have a problem that I am trying to solve using only awk.
I have a CSV file with the following structure:
Easting Northing Latitude Longitude Locality Name
Easting "Northing" "Latitude" "Longitude" "LocalityName"
364208 176288 51.48441 -2.51685 "Fishponds"
358596 172813 51.45278 -2.59726 "Bristol City Centre"
358886 177828 51.49789 -2.59367 "Southmead"
358839 177839 51.49798 -2.59435 "Southmead"
358980 177882 51.49838 -2.59232 "Southmead"
359009 177863 51.49821 -2.5919 "Southmead"
358839 177529 51.4952 -2.59431 "Southmead"
359475 168262 51.41192 -2.58409 "Hengrove Park"
358945 173526 51.45921 -2.59232 "Bristol"
358943 173525 51.4592 -2.59235 "Bristol"
358941 173524 51.45919 -2.59238 "Bristol"
358940 173523 51.45919 -2.59239 "Bristol"
358945 173528 51.45923 -2.59232 "Bristol"
358936 173520 51.45916 -2.59245 "Bristol"
358936 173521 51.45917 -2.59245 "Bristol"
358932 173516 51.45912 -2.5925 "Bristol"
and so on. I am trying to write an awk script that counts the instances of each Locality Name and prints them, so that the output would be:
Fishponds 1
Bristol City Centre 1
Southmead 5
Hengrove Park 1
Bristol 8
So far I have this:
BEGIN { i = 0; state = 0; names[NR]; FS=","; }
{
    # For each element in the names array, check whether it already exists.
    for (j = 0; j <= i; j++)
    {
        if (names[j] == $5)
        {
            state = 1;
            break;
        }
    }
    # If the name doesn't already exist, add it to the names array.
    if (state == 0)
    {
        names[i] = $5;
        i++;
    }
    state = 0;
}
END {
    for (x = 0; x <= i; x++)
    {
        print names[x];
    }
}
This hopefully sorts out the locations and removes the duplicates, but I still cannot think of a good way to count the instances of each location and then list them back.
A simpler solution:
awk -F '"' 'NR>3 {locname[$2]++}
END { for (n in locname) {print n, locname[n] } }' INPUTFILE
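First, set the input field separator to the double quote ("), so that the second field is the locality name. Skip the header lines (NR>3 ignores the first three lines of the file; adjust the number to match your header). Use an array keyed by the second field to count the occurrences. After the last line has been read, print the array's keys and values.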
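One caveat: awk does not define the order in which "for (n in array)" visits the keys, so the counts may not come out in the order shown in the question. A minimal sketch of one way to keep first-seen order follows. The function name count_localities, its skip argument, and the array names order and count are assumptions made for illustration; they are not part of the answer above.

```shell
# count_localities: hypothetical helper (the name is an assumption).
# Reads the data on stdin and prints "name count" in order of first appearance.
# $1 = number of leading header lines to skip (defaults to 0).
count_localities() {
    awk -F '"' -v skip="${1:-0}" '
        # Only consider lines past the header that contain a quoted field.
        NR > skip && $2 != "" {
            if (!($2 in count)) order[++n] = $2   # remember first-seen order
            count[$2]++                           # tally this locality
        }
        END {
            # Walk the names in the order they were first seen.
            for (i = 1; i <= n; i++) print order[i], count[order[i]]
        }'
}

# Usage with a small inline sample, skipping one header line:
printf '%s\n' \
    'Easting "Northing" "Latitude" "Longitude" "LocalityName"' \
    '364208 176288 51.48441 -2.51685 "Fishponds"' \
    '358886 177828 51.49789 -2.59367 "Southmead"' \
    '358839 177839 51.49798 -2.59435 "Southmead"' |
    count_localities 1
# prints:
# Fishponds 1
# Southmead 2
```

The second array costs a little extra memory, but it avoids piping the result through sort, which would reorder the names alphabetically rather than by first appearance.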