我有一个包含225000行的文件,其中包含一堆相似的行。我希望删除所有类似的行,而只保留每个“类型”的第一行。示例如下。
我想要一个看起来像这样的文件:
./ACT_HERE_REPORT_MEMO_APPROVED_20180510_083000.log.gz
./ACT_HERE_REPORT_MEMO_APPROVED_20180512_083000.log.gz
./ACT_HERE_REPORT_MEMO_APPROVED_20180513_083000.log.gz
./ACT_HERE_REPORT_MEMO_APPROVED_20180515_083000.log.gz
./ACT_HERE_SOMETHING_MEMO_APPROVED_20180326.xls
./ACT_HERE_SOMETHING_MEMO_APPROVED_20180327.xls
./ACT_HERE_SOMETHING_MEMO_APPROVED_20180328.xls
./ACT_HERE_SOMETHING_MEMO_APPROVED_20180329.xls
./ACT_HERE_SOMETHING_MEMO_APPROVED_20180331.xls
./Archive/20150919-084501.SOMETHING
./Archive/20150922-084501.SOMETHING
./Archive/20150923-084500.SOMETHING
./Archive/20150924-084500.SOMETHING
./TEST/TEST.20170310.20170310-181017.txt.gz
./TEST/TEST.20170310.20170310-201023.txt.gz
./TEST/TEST.20170313.20170313-011035.txt.gz
./TEST/TEST.20170313.20170313-024006.txt.gz
./TEST/TEST.20170313.20170313-041018.txt.gz
./TEST/TEST.20180402-011024.log.gz
./TEST/TEST.20180402-011200.log.gz
./TEST/TEST.20180402-061113.log.gz
./TEST/TEST.20180402-081013.log.gz
./TEST/TEST.20180402-101012.log.gz
Run Code Online (Sandbox Code Playgroud)
要这样结束:
./ACT_HERE_REPORT_MEMO_APPROVED_20180510_083000.log.gz
./ACT_HERE_SOMETHING_MEMO_APPROVED_20180326.xls
./Archive/20150919-084501.SOMETHING
./TEST/TEST.20170310.20170310-181017.txt.gz
./TEST/TEST.20180402-011024.log.gz
Run Code Online (Sandbox Code Playgroud) notepad++ ×1