Need to split a large CSV file

Question

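I have a CSV file that needs to be split into smaller files. split -l 20000 test.csv works fine; my problem is that the file contains several kinds of header lines. I want to split roughly every 1000 lines, but the split has to come right after a pay header, and each new file has to start with a cust header.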

cust header,xxx,xxxxxx,xxxxxx
txn header,xxxx,xxx,,xxxx,xxxxx,,xxx
detail header,xxxx,xxxx,xxxxxx,xxxx,xxxx
detail header,xxxxxxxx,xxxxxxxxxxxx,xxx,,
pay header,,,,,xxxxx,xxxxx
cust header,xxx,xxxxxx,xxxxxx
txn header,xxxx,xxx,,xxxx,xxxxx,,xxx
detail header,xxxx,xxxx,xxxxxx,xxxx,xxxx
pay header,,,,,xxxxx,xxxxx
cust header,xxx,xxxxxx,xxxxxx
txn header,xxxx,xxx,,xxxx,xxxxx,,xxx
detail header,xxxx,xxxx,xxxxxx,xxxx,xxxx
pay header,,,,,xxxxx,xxxxx

Answer 1

ilk*_*chu 6

You could do something like this with awk:

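awk -v filename=output -v cut=1000 '
    # nl counts lines seen since the last split; nf numbers the output files
    BEGIN { nl = 0; nf = 1; f = filename "." nf }
    # once at least cut lines have been seen, switch files at the next cust header
    ++nl >= cut && /^cust header,/ {
        close(f); nl = 0; f = filename "." ++nf }
    # every line, including the cust header that triggered the switch, goes to f
    { print > f }' < file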


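It keeps a count of the lines seen. When the count has reached cut (1000 here) and the current line starts with cust header, it closes the current output file and opens a new one, so every file after the first begins with a cust header. Because the split is only triggered on a cust header line, files can run somewhat longer than cut lines. The output files are named output.1, output.2, ... (the prefix comes from the filename variable).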
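To see the behaviour on something small, here is a minimal sketch that runs the same awk logic with a low threshold. The file name sample.csv, the placeholder field values (c1, t1, ...), and cut=5 are assumptions made up for the demonstration; the script writes sample.csv and output.* into the current directory.

```shell
#!/bin/sh
# Hypothetical demo data: three customer blocks of 4 lines each (12 lines total).
for i in 1 2 3; do
    printf 'cust header,c%s\ntxn header,t%s\ndetail header,d%s\npay header,p%s\n' \
        "$i" "$i" "$i" "$i"
done > sample.csv

# Remove leftovers from an earlier run so the line counts are not misleading.
rm -f output.*

# Same logic as above, with cut=5 so the effect is visible on a tiny input.
awk -v filename=output -v cut=5 '
    BEGIN { nf = 1; f = filename "." nf }
    ++nl >= cut && /^cust header,/ { close(f); nl = 0; f = filename "." ++nf }
    { print > f }
' sample.csv

# Show how many lines ended up in each piece.
wc -l output.*
```

With this input, output.1 should hold the first block (4 lines) and output.2 the remaining two blocks (8 lines). The threshold is reached on line 5, which is a cust header, so the split happens there. The next cust header arrives before the counter reaches 5 again, so the last two blocks stay together.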
