bst*_*teo 0 regex awk gnu sed gawk
I have data of the following form:
3869|Jennifer Smith
10413 NE 71st Street
Vancouver, WA
98662
360-944-9578
jsmith@yahoo.com|1234567890123456|03-2013|123
--
3875|Joan L Doe
422 1/2 14th Ave E
Seattle, WA
98112
206-322-7666
jldoe@comcast.net|1234-1234-1234-1234|03-2013|123
--
3862|Dana Doe
24235 NE 7th Pl
Sammamish, WA
98074
425 868-2227
jsmith@hotmail.com|1234567890123456|03-2013|123
--
3890|John Smith
10470 SW 67th Ave
Tigard, OR
97223
5032205213
john.smith@gmail.com|1234567890123456|03-2013|123
I need to convert it to:
3869|Jennifer Smith|10413 NE 71st Street|Vancouver, WA|98662|360-944-9578|jsmith@yahoo.com|1234567890123456|03-2013|123
3875|Joan L Doe|422 1/2 14th Ave E|Seattle, WA|98112|206-322-7666|jldoe@comcast.net|1234-1234-1234-1234|03-2013|123
3862|Dana Doe|24235 NE 7th Pl|Sammamish, WA|98074|425 868-2227|jsmith@hotmail.com|1234567890123456|03-2013|123
3890|John Smith|10470 SW 67th Ave|Tigard, OR|97223|5032205213|john.smith@gmail.com|1234567890123456|03-2013|123
Or, better:
3869|Jennifer Smith|10413 NE 71st Street|Vancouver|WA|98662|360-944-9578|jsmith@yahoo.com|1234567890123456|03-2013|123
3875|Joan L Doe|422 1/2 14th Ave E|Seattle|WA|98112|206-322-7666|jldoe@comcast.net|1234-1234-1234-1234|03-2013|123
3862|Dana Doe|24235 NE 7th Pl|Sammamish|WA|98074|425 868-2227|jsmith@hotmail.com|1234567890123456|03-2013|123
3890|John Smith|10470 SW 67th Ave|Tigard|OR|97223|5032205213|john.smith@gmail.com|1234567890123456|03-2013|123
Any idea how to automate this using GNU sed, awk, cut, or perl/python? Thanks!
Using sed:
sed -n ':a;$!N;/--/!s/\n/|/g;ta;P' inputFile
$ sed -n ':a;$!N;/--/!s/\n/|/g;ta;P' temp
3869|Jennifer Smith|10413 NE 71st Street|Vancouver, WA|98662|360-944-9578|jsmith@yahoo.com|1234567890123456|03-2013|123
3875|Joan L Doe|422 1/2 14th Ave E|Seattle, WA|98112|206-322-7666|jldoe@comcast.net|1234-1234-1234-1234|03-2013|123
3862|Dana Doe|24235 NE 7th Pl|Sammamish, WA|98074|425 868-2227|jsmith@hotmail.com|1234567890123456|03-2013|123
3890|John Smith|10470 SW 67th Ave|Tigard, OR|97223|5032205213|john.smith@gmail.com|1234567890123456|03-2013|123
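The output above matches the first requested layout. For the second layout, with city and state in separate fields, one possible sketch in awk follows. It assumes that every record has the city/state pair on its third line in "City, ST" form and that records are separated by a line containing exactly --. The temporary file and the variable names (tmp, out, rec, n) are placeholders chosen for illustration.

```shell
# Sample input: two records from the question, separated by a "--" line.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
3869|Jennifer Smith
10413 NE 71st Street
Vancouver, WA
98662
360-944-9578
jsmith@yahoo.com|1234567890123456|03-2013|123
--
3890|John Smith
10470 SW 67th Ave
Tigard, OR
97223
5032205213
john.smith@gmail.com|1234567890123456|03-2013|123
EOF

out=$(awk '
  # Separator line: emit the finished record and reset the state.
  $0 == "--" { print rec; rec = ""; n = 0; next }
  {
    n++
    # Assumption: line 3 of each record holds City, ST.
    if (n == 3) sub(/, /, "|")
    # Join the lines of one record with a pipe.
    rec = (n == 1 ? $0 : rec "|" $0)
  }
  # The last record has no trailing separator line.
  END { if (n > 0) print rec }
' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

Splitting by line position rather than by searching the whole record for ", " avoids touching a street address that happens to contain a comma.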
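How the script works:
-n suppresses automatic printing, so only P produces output.
:a creates a label named a.
$!N appends the next input line to the pattern space, unless the current line is the last one.
/--/!s/\n/|/g replaces every newline with a pipe, but only if the pattern space does not contain --.
ta branches back to label a if that substitution succeeded.
P prints the pattern space up to the first newline, which is the joined record without the -- separator.

Note the differences between p, P, n, and N:
n prints the current pattern space (unless -n is given) and then replaces it with the next line of input.
N does not print anything; it appends a newline followed by the next line of input to the pattern space.
p prints the entire pattern space.
P prints only the first part of the pattern space, up to the first newline character.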
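The differences between these four commands can be checked on a two-line input. This is only a small sketch; the helper two_lines and the variable names are made up for illustration.

```shell
# Helper that emits the two input lines "a" and "b".
two_lines() { printf 'a\nb\n'; }

# N joins both lines; P prints only the part before the first newline.
upper_P=$(two_lines | sed -n 'N;P')
# Same join, but p prints the whole pattern space (both lines).
lower_p=$(two_lines | sed -n 'N;p')
# n replaces the pattern space with the next line; -n stops it printing the old one.
lower_n=$(two_lines | sed -n 'n;p')
# Without -n, n prints the current line before reading the next; d then drops that next line.
no_flag=$(two_lines | sed 'n;d')

printf 'N;P with -n: %s\n' "$upper_P"
printf 'N;p with -n: %s\n' "$lower_p"
printf 'n;p with -n: %s\n' "$lower_n"
printf 'n;d no -n:   %s\n' "$no_flag"
```

In the answer's script, P is what keeps the -- separator out of the output: by the time it runs, the pattern space holds the joined record, a newline, and the separator, and only the part before the newline is printed.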