Merge 2 lines into one line

sha*_*nuo 5 perl awk grep sed

I have a text file in which each record starts with a 9-digit college code and ends with a 5-digit course code.

512161000 EN5121 K. K. Jorge Institute of Engineering Education and Research, Nashik 61220 Mechanical Engineering [Second Shift] XOPENH 1 116 16978
517261123 EN5172 R. C. Rustom Institute of Technology, Shirpur 61220 Mechanical Engineering [Second Shift] YOPENH 1 100 29555
617561234 EN6175 abc xyz Education Trust, abc xyz College of Engineering,
Pune 61220 Mechanical Engineering [Second Shift] ZOPENH 2 105 25017

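Some entries contain a line break, as in the third record above. I need to merge lines 3 and 4 into a single line, like lines 1 and 2, so that I can easily use commands such as grep and awk.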

Update:

Kevin's answer does not seem to work.

cat todel.txt
112724510 EN1127 Jagadambha Bahuuddeshiya Gramin Vikas Sanstha's Jagdambha College of,
Engineering and Technology, Yavatmal 24510 Computer Engineering LSCO 1 55 93531

cat todel.txt | perl -ne 'chomp; if (/^\d{9}/) { print "\n$_" } else { print "$_\n" }' 
Engineering and Technology, Yavatmal 24510 Computer Engineering LSCO 1 55 93531ege of,

Pet*_*r.O 1

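Regarding the split lines: this sed script assumes that there is at least one space after the leading digits (on the first line of a split record), at least one space before the trailing digits (on the last line of a split record), and that each split record is split only once.

Modified to accept input with either Windows CRLF or *nix LF newlines. Note, however, that the output uses *nix \n.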
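
As a side note, the overwritten-looking output in the question's update is what a terminal typically shows when a line ends in a carriage return, so it may be worth checking the file for CR characters first. A minimal sketch, assuming a POSIX shell and grep; `count_cr_lines` is a hypothetical helper name, not something from the question:

```shell
# count_cr_lines (hypothetical name): print how many lines contain a carriage return.
# Reads the named files, or standard input when no file is given.
count_cr_lines() {
    cr=$(printf '\r')      # a literal CR; command substitution only strips trailing newlines
    grep -c "$cr" "$@"
}
```

For example, if `count_cr_lines todel.txt` prints a non-zero count, the file probably has Windows line endings.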

sed -nr 's/\r?$//   # allow for \r\n (CRLF) newlines
         /^([0-9]{9}) .* ([0-9]{5})$/{p;b}
         /^([0-9]{9}) /{h;b}
         / ([0-9]{5})$/{x;G; s/\n//; p}'

Or, shorter but perhaps less readable:

sed -nr 's/\r?$//; /^([0-9]{9}) /{/ ([0-9]{5})$/{p;b};h;b};/ ([0-9]{5})$/{x;G; s/\n//; p}' 

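I would expect the first script to be faster, because the most frequent test (for a complete line) involves only one regex, whereas the second (shorter) script needs two regex tests for that most frequent case.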

This is the output I get, using GNU sed 4.2.1:

512161000 EN5121 K. K. Jorge Institute of Engineering Education and Research, Nashik 61220 Mechanical Engineering [Second Shift] XOPENH 1 116 16978
517261123 EN5172 R. C. Rustom Institute of Technology, Shirpur 61220 Mechanical Engineering [Second Shift] YOPENH 1 100 29555
617561234 EN6175 abc xyz Education Trust, abc xyz College of Engineering,Pune 61220 Mechanical Engineering [Second Shift] ZOPENH 2 105 25017
112724510 EN1127 Jagadambha Bahuuddeshiya Gramin Vikas Sanstha's Jagdambha College of,Engineering and Technology, Yavatmal 24510 Computer Engineering LSCO 1 55 93531
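
For readers who find awk easier to follow than sed's hold space, the same idea can be sketched in awk. This is only a rough equivalent under the same assumptions as the sed scripts (a space after the leading 9 digits, a space before the trailing 5 digits, at most one split per record). `merge_split_records` is a hypothetical function name, and the digit classes are spelled out rather than written as `{9}` because some older awk implementations may not support interval expressions.

```shell
# merge_split_records (hypothetical name): join records that were split
# across two lines. Reads the named files, or standard input if none given.
merge_split_records() {
    awk '
        BEGIN {
            d = "[0-9]"
            head = "^" d d d d d d d d d " "   # 9 leading digits, then a space
            tail = " " d d d d d "$"           # a space, then 5 trailing digits
        }
        { sub(/\r$/, "") }                     # drop a trailing CR (CRLF input)
        $0 ~ head && $0 ~ tail { print; next } # complete record: print unchanged
        $0 ~ head { held = $0; next }          # first half: remember it
        $0 ~ tail { print held $0; held = "" } # second half: join and print
    ' "$@"
}
```

Running `merge_split_records todel.txt` should print each record on a single line. Like the sed version, it joins the two halves without inserting a space.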