How do I copy the first few lines of a large file in Hadoop to a new file?

Question

I have a large file, bigfile.txt, in HDFS. I want to copy its first 100 lines to a new file on HDFS. I tried the following command:

hadoop fs -cat /user/billk/bigfile.txt |head -100 /home/billk/sample.txt

It gave me a "cat: Unable to write to output stream" error. I am on Hadoop 1.

Is there another way to do this? (Note: copying the first 100 lines to either a local file or another file on HDFS is fine.)

Answer 1

Like this:

hadoop fs -cat /user/billk/bigfile.txt | head -100 | hadoop fs -put - /home/billk/sample.txt

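I believe the "cat: Unable to write to output stream" message appears only because head closes the stream once it has read its limit of lines. See this answer about head for HDFS: /sf/answers/1384557191/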
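
The closed-stream behavior can be reproduced without a cluster. The following is a minimal sketch, not the original poster's setup: it assumes `seq` as a local stand-in for `hadoop fs -cat`, and the temporary directory and the file name `sample_local.txt` are hypothetical. It shows that the output is complete even though the producer is cut off early.

```shell
#!/bin/sh
# Stand-in for `hadoop fs -cat`: a producer that emits far more lines than needed.
# head exits after 100 lines and closes the pipe; the producer's next write fails,
# which is the local analogue of "cat: Unable to write to output stream".
tmpdir=$(mktemp -d)
seq 1 1000000 | head -n 100 > "$tmpdir/sample_local.txt"

# The output file is complete even though the producer was stopped early.
line_count=$(wc -l < "$tmpdir/sample_local.txt" | tr -d ' ')
echo "lines written: $line_count"
```

With the real source, the same shape for a local copy would be `hadoop fs -cat /user/billk/bigfile.txt | head -n 100 > /home/billk/sample.txt`. Note that in the command from the question, head is given the path as an input file argument rather than as a redirect target, so it never reads from the pipe at all.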