Post by zoo*_*mer

awk date condition

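I am trying to filter person.csv (below) so that only people who were not born in the past year remain; rows for anyone born within the past year should be removed: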

Dataset 1:

"Index","User Id","First Name","Last Name","Date of birth","Job Title"
"1","9E39Bfc4fdcc44e","new, Diamond","Dudley","06 Dec 1945","Photographer"
"3","32C079F2Bad7e6F","Ethan","Hanson","08 Mar 2014","Actuary"
"2","aaaaaaa, bbbbbb","Grace","Huerta","21 Jan 2023","Visual merchandiser"

So the expected output looks like this (the last row is removed because its date of birth is less than a year ago):

"Index","User Id","First Name","Last Name","Date of birth","Job Title"
"1","9E39Bfc4fdcc44e","new, Diamond","Dudley","06 Dec 1945","Photographer"
"3","32C079F2Bad7e6F","Ethan","Hanson","08 Mar 2014","Actuary"

I tried using awk along these lines:

awk -F , '{print $5 ....}' person.csv > output.csv

However, I cannot figure out how to compare each row's date with (today minus 1 year).
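One way to make that comparison is to turn each date into a `YYYYMMDD` number and compare it with today's date minus 10000, which is the same calendar day one year earlier. The sketch below is only an illustration under several assumptions: every field is double-quoted and fields are separated by `","`, the date of birth is always the fifth field in `DD Mon YYYY` form with English month abbreviations, and a person born exactly one year ago is kept. The function name `filter_older_than_a_year` and its optional second argument (a fixed "today", useful for repeatable runs) are hypothetical, not part of any tool.

```shell
# Hypothetical helper: print the header plus every row whose date of birth
# is on or before (today minus one year).
#   $1 = input CSV
#   $2 = optional "today" as YYYYMMDD (assumption: defaults to the system date)
# Usage (assumed file names): filter_older_than_a_year person.csv > output.csv
filter_older_than_a_year() {
    awk -F'","' -v today="${2:-$(date +%Y%m%d)}" '
    BEGIN {
        # Map English month abbreviations to month numbers 1..12.
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = i
        # Subtracting 10000 from YYYYMMDD gives the same day one year earlier.
        cutoff = today - 10000
    }
    NR == 1 { print; next }          # always keep the header line
    {
        split($5, d, " ")            # d[1] = day, d[2] = month name, d[3] = year
        dob = d[3] * 10000 + month[d[2]] * 100 + d[1]
        if (dob <= cutoff) print     # born at least one year ago: keep
    }' "$1"
}
```

Splitting on `","` rather than on a bare comma keeps values such as `new, Diamond` in a single field. The dates are not validated here, so a misspelled month name would be compared as month 0 rather than rejected.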

Dataset 2: sometimes a double-quoted field may itself contain double quotes, for example (line 1, field 4):

"Index","User Id","First Name","Last Name","Date of birth","Job Title"
"1","9E39Bfc4fdcc44e","new, Diamond","Dudley (aka "dud")","03 Oct 2023","Photographer"
"3","32C079F2Bad7e6F","Ethan","Hanson","03 Dec 2022","Actuary"
"2","aaaaaaa, bbbbbb","Grace","Huerta","21 Jan 2023","Visual merchandiser"
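To check that the date column is still found when a field contains commas or stray double quotes, the column can be printed on its own. This is a minimal sketch that assumes every field is double-quoted and that the three-character sequence `","` never occurs inside a field; the function name `print_birth_dates` is hypothetical.

```shell
# Hypothetical helper: print only the "Date of birth" column (5th field),
# skipping the header line.
#   $1 = input CSV
print_birth_dates() {
    # Using "," (quote, comma, quote) as the separator means a lone comma or a
    # lone double quote inside a field does not start a new field.
    awk -F'","' 'NR > 1 { print $5 }' "$1"
}
```

With a plain `-F ,` the first data row would be split inside `new, Diamond`, so `$5` would no longer be the date of birth.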

I am also open to a solution using "sed" if it can do this. Any help is appreciated, thanks!

sed awk csv
