jix*_*ubi 2 awk text-processing csv-simple
I have copied part of a CSV file.
publish_date,headline_text,likes_count,comments_count,shares_count,love_count,wow_count,haha_count,sad_count,thankful_count,angry_count
20030219,aba decides against community broadcasting licence,1106,118,109,155,6,5,2,0,6
20030219,act fire witnesses must be aware of defamation,137,362,67,0,0,0,0,0,0
20030219,a g calls for infrastructure protection summit,357,119,212,0,0,0,0,0,0
20030219,air nz staff in aust strike for pay rise,826,254,105,105,21,45,7,0,90
20030219,air nz strike to affect australian travellers,693,123,153,17,113,4,103,0,7
20030219,ambitious olsson wins triple jump,488,57,161,0,0,0,0,0,0
20030219,antic delighted with record breaking barca,386,60,80,3,4,0,93,0,68
20030219,aussie qualifier stosur wastes four memphis match,751,45,297,0,0,0,0,0,0
20030219,aust addresses un security council over iraq,3847,622,141,1,0,0,0,0,0
20030219,australia is locked into war timetable opp,1330,205,874,0,0,0,0,0,0
20030219,australia to contribute 10 million in aid to iraq,3530,130,0,23,16,4,1,0,0
20030219,barca take record as robson celebrates birthday in,13875,331,484,0,0,0,0,0,0
20030219,bathhouse plans move ahead,11202,450,2576,433,51,20,4,0,34
20030219,big hopes for launceston cycling championship,3988,445,955,0,0,0,0,0,0
20030219,big plan to boost paroo water supplies,460,101,92,0,0,0,0,0,0
20030219,blizzard buries united states in bills,303,223,193,0,0,0,0,0,0
I want to find a shell command that creates a new column holding, for each row, (likes_count + love_count + thankful_count) - (angry_count + sad_count), and names that column Emotion_polarity.
I tried
awk -F , '{$12=$3+$6+$10-$11-$9;}{print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12}' file
But for some reason the columns get mixed together and it does not work. I think it may be because I lose the commas when doing this.
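Set OFS (the Output Field Separator) as well, so that you don't lose the commas. The commas are lost when you do $12=$3+$6+$10-$11-$9, i.e. when you set or update the value of any field. In that case awk rebuilds the current record by joining its fields with the OFS internal variable, which defaults to a single space. The same separator is used between the arguments of print $1,$2,.... Setting OFS to a comma therefore keeps the commas in the output.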
awk 'BEGIN{ FS=OFS="," }
{ $(NF+1)=(NR==1? "emotional_polarity" : $3+$6+$10-$11-$9); print }' infile
Or simply append the new value to the current input line:
awk -F, '{ $0=$0 FS (NR==1? "emotional_polarity" : $3+$6+$10-$11-$9); print }' infile
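As a quick sanity check, the append variant can be run against the header and the first two data rows from the question. This is only a sketch: the temporary file and the variable names `sample` and `result` are introduced here for illustration and are not part of the original answer.

```shell
# Build a tiny sample file from the header and the first two data rows of the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
publish_date,headline_text,likes_count,comments_count,shares_count,love_count,wow_count,haha_count,sad_count,thankful_count,angry_count
20030219,aba decides against community broadcasting licence,1106,118,109,155,6,5,2,0,6
20030219,act fire witnesses must be aware of defamation,137,362,67,0,0,0,0,0,0
EOF

# Append the computed column; the header row (NR==1) gets the column name instead of a number.
result=$(awk -F, '{ $0=$0 FS (NR==1? "emotional_polarity" : $3+$6+$10-$11-$9); print }' "$sample")
rm -f "$sample"

printf '%s\n' "$result"
```

The first data row should end in 1253 (1106 + 155 + 0 - 6 - 2) and the second in 137. Note that splitting on a plain comma assumes no field contains a quoted comma, which holds for the sample shown in the question.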
From the awk manual:
FS
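The input field separator (see the section Specifying How Fields Are Separated). The value is a single-character string or a multicharacter regular expression that matches the separations between fields in an input record.
OFS
The output field separator (see the section Output Separators). It is output between the fields printed by a print statement. Its default value is " ", a string consisting of a single space.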
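The effect of the OFS default can be seen with a small experiment. The sketch below uses a made-up input line (not taken from the question's data) and the illustrative variable names `line`, `no_ofs` and `with_ofs`; it assigns to a new field once with only FS set and once with both FS and OFS set.

```shell
# Hypothetical 11-field record, shaped like the rows in the question.
line='20030219,some headline,10,2,3,4,0,0,1,5,2'

# Only FS is set: assigning to $12 makes awk rebuild the record with the default OFS (a space).
no_ofs=$(printf '%s\n' "$line" | awk -F, '{ $12 = $3+$6+$10-$11-$9; print }')

# FS and OFS are both a comma: the rebuilt record keeps its commas.
with_ofs=$(printf '%s\n' "$line" | awk 'BEGIN{ FS=OFS="," } { $12 = $3+$6+$10-$11-$9; print }')

printf '%s\n' "$no_ofs" "$with_ofs"
```

In the first result the headline's own space becomes indistinguishable from the field separators, which is the "columns mixed together" symptom described in the question.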