Get the first name from a full name and extract it to a new column using awk or sed

xor*_*eed 5 sed awk text-processing csv

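I have a lot of .csv files containing customer information. In all of these files, I would like to add an additional column, FIRSTNAME, next to the FULLNAME column. The first name can be generated by taking the first word of FULLNAME.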

There are no first names made up of two words, such as Jean Paul. In the last column, commas are used within the field text.

Input

COMPANY,FULLNAME,EMAIL,FUNCTION,CITY,INDUSTRY,COMMENT
Company name,Firstname Lastname,firstname.lastname@example.com,Marketing Manager,New York,Health Care,"home, work"
Company name,Firstname infix Lastname,firstname.lastname@example.com,Marketing Manager,New York,Health Care,"home, workhome, work"
Company name,Firstname infix infix2 Lastname,firstname.lastname@example.com,Marketing Manager,New York,Health Care,"home, work"

Expected output

COMPANY,FULLNAME,FIRSTNAME,EMAIL,FUNCTION,CITY,INDUSTRY,COMMENT
Company name,Firstname Lastname,Firstname,firstname.lastname@example.com,Marketing Manager,New York,Health Care,"home, work"
Company name,Firstname infix Lastname,Firstname,firstname.infix.lastname@example.com,Marketing Manager,New York,Health Care,"home, work"
Company name,Firstname infix infix2 Lastname,Firstname,firstname.infix12.lastname@example.com,Marketing Manager,New York,Health Care,"home, work"

How can I do this with awk, sed, or something else?

Kus*_*nda 8

Using the CSV-aware utility Miller (mlr):

mlr --csv \
    put '$FIRSTNAME = sub($FULLNAME," .*","")' then \
    reorder -f COMPANY,FULLNAME,FIRSTNAME file

...which, given the data in the question, results in

COMPANY,FULLNAME,FIRSTNAME,EMAIL,FUNCTION,CITY,INDUSTRY,COMMENT
Company name,Firstname Lastname,Firstname,firstname.lastname@example.com,Marketing Manager,New York,Health Care,"home, work"
Company name,Firstname infix Lastname,Firstname,firstname.lastname@example.com,Marketing Manager,New York,Health Care,"home, workhome, work"
Company name,Firstname infix infix2 Lastname,Firstname,firstname.lastname@example.com,Marketing Manager,New York,Health Care,"home, work"

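This use of Miller first creates a new field, FIRSTNAME, through a regular-expression-based substitution that removes everything from the first space character onwards in the FULLNAME field.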

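Since the new field is added last, the fields are then reordered so that the first few fields are COMPANY, FULLNAME, and FIRSTNAME, in that order. The remaining fields keep their original order.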
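The question also asks about awk, so here is a rough awk sketch of the same two steps: take the first word of FULLNAME, then place it immediately after that column. Unlike Miller, it is not CSV-aware. It assumes that FULLNAME is the second column and that no quoted field containing a comma appears at or before it, which holds for the sample data (only the last column has embedded commas). The function name add_firstname is made up for illustration.

```shell
# Hypothetical helper: reads CSV on stdin, writes CSV on stdout.
# ASSUMPTION: FULLNAME is column 2 and no quoted commas occur before it.
add_firstname() {
    awk 'BEGIN { FS = OFS = "," }
         # Header line: append the new column name right after FULLNAME.
         NR == 1 { $2 = $2 OFS "FIRSTNAME"; print; next }
         {
             first = $2
             sub(/ .*/, "", first)   # delete everything from the first space on
             $2 = $2 OFS first       # FIRSTNAME lands directly after FULLNAME
             print
         }'
}

# Usage: add_firstname < file
```

Because the trailing quoted field is split and rejoined on the same comma, its text passes through unchanged. A file with quoted commas in an earlier column would need a real CSV parser such as Miller.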

Instead of the put expression using sub(), you could use put with the splitnv() function to split the value of the FULLNAME field on spaces and pick out the first resulting string:

mlr --csv \
    put '$FIRSTNAME = splitnv($FULLNAME," ")[1]' then \
    reorder -f COMPANY,FULLNAME,FIRSTNAME file

This gives the same output as shown above.
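For files where FULLNAME is not always in the same position, a plain awk sketch can mirror the splitnv() variant with awk's split(), looking the column up by its header name. The function name add_firstname_by_header is hypothetical, and this is not CSV-aware: it assumes no quoted field containing a comma appears at or before the FULLNAME column.

```shell
# Hypothetical helper: reads CSV on stdin, writes CSV on stdout.
# ASSUMPTION: no quoted commas occur at or before the FULLNAME column.
add_firstname_by_header() {
    awk 'BEGIN { FS = OFS = "," }
         NR == 1 {
             # Find the FULLNAME column from the header line.
             for (i = 1; i <= NF; i++) if ($i == "FULLNAME") col = i
             if (!col) exit 1          # no FULLNAME column: give up
             $col = $col OFS "FIRSTNAME"
             print
             next
         }
         {
             # A single-space separator makes awk split on runs of blanks.
             split($col, parts, " ")
             $col = $col OFS parts[1]  # first word goes right after FULLNAME
             print
         }'
}

# Usage: add_firstname_by_header < file
```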