Parallel grep with a pattern file across multiple files

Question

mas*_*rah 6 parallel-processing bash grep gnu-parallel

I am successfully using this command to search a directory of compressed log files for a list of suspicious IPs kept in ips.txt:

root@yop# find /mylogs/ -exec zgrep -i -f ips.txt {} \; > ips.result.txt

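I now want to use parallel with it to speed up the search. So far I have not been able to find the right arguments. What I mean is using a pattern file (one pattern per line) and writing the matches to a result file.

Is there a parallel guru out there, please?

The closest command I found is: grep-or-anything-else-many-files-with-multiprocessor-power

But I could not get it to work with a pattern file and export the results to a file...

Please help, and thanks to all.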

Answer 1

Ste*_*eve 5

If you just want to run several jobs at once, consider using GNU parallel:

parallel zgrep -i -f ips.txt :::: <(find /mylogs -type f) > results.txt

`$ parallel -V | head -n 1` prints a warning: you are using --tollef; if things are acting weird, use --gnu. I added --gnu and tested. OK, it works with: **parallel --gnu zgrep -i -f ips.txt :::: <(find /mylogs -type f) > results.txt** (2 upvotes)
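Where GNU parallel is not installed, a similar effect can be sketched with `find -print0 | xargs -0 -P`. This is a different tool from the one the answer uses, and `-P` is an extension found in GNU and BSD xargs rather than in POSIX. The temporary directory, the sample log lines, and the job count of 4 are all made up for illustration; in practice you would point it at /mylogs and your own ips.txt.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sandbox so the sketch runs anywhere; replace with /mylogs and your real ips.txt.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/logs"
printf '10.0.0.1 GET /\n192.168.1.5 GET /admin\n' | gzip > "$demo/logs/a.log.gz"
printf '172.16.0.9 POST /login\n10.0.0.2 GET /\n' | gzip > "$demo/logs/b.log.gz"
printf '10.0.0.3 GET /\n' | gzip > "$demo/logs/c.log.gz"
printf '192.168.1.5\n172.16.0.9\n' > "$demo/ips.txt"

# -print0 / -0 keep file names with spaces intact.
# -n 1 hands one file to each zgrep; -P 4 runs up to four zgrep processes at once (assumed value).
# zgrep exits 1 for a file with no match, which makes xargs exit non-zero; that is not an error here.
find "$demo/logs" -type f -print0 |
    xargs -0 -n 1 -P 4 zgrep -i -f "$demo/ips.txt" > "$demo/results.txt" || true

sort "$demo/results.txt"
```

Unlike GNU parallel, xargs does not group the output of each job, so with large result sets lines from different files may interleave in the result file.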

Answer 2

Jos*_*lly 0

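How about looping over the files and putting each one into a background job? As Mark commented, this may not be suitable if you have a large number of log files. It also assumes you are not running anything else in the background.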

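mkdir -p results

# Start one background zgrep per log file. Outputs are numbered so that
# directories under /mylogs/ never have to exist under results/.
i=0
while IFS= read -r -d '' f; do
    i=$((i + 1))
    zgrep -i -f ips.txt "$f" > "results/$i.result" &
done < <(find /mylogs/ -type f -print0)

# Wait for every background job before merging.
wait

cat results/*.result > ip.results.txt
rm -rf results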

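You can use head and/or tail to limit the number of files searched, e.g. only the first 50 files (the unquoted substitution assumes file names contain no whitespace):

for f in $(find /mylogs/ -type f | head -n 50); do...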

Then the next 50:

for f in $(find /mylogs/ -type f | head -n 100 | tail -n 50); do...

And so on.
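The head/tail batching above can also be automated by capping how many background jobs run at once. The following is only a sketch under stated assumptions: the sandbox directory, the sample logs, and the `max_jobs` value are hypothetical, and file names are assumed to contain no newlines.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical sandbox; replace with /mylogs and your real ips.txt.
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir "$demo/logs" "$demo/results"
n=1
while [ "$n" -le 5 ]; do
    printf '10.0.0.%s GET /\n192.168.1.5 GET /page%s\n' "$n" "$n" |
        gzip > "$demo/logs/$n.log.gz"
    n=$((n + 1))
done
printf '192.168.1.5\n' > "$demo/ips.txt"

max_jobs=2   # assumed cap on concurrent zgrep processes
running=0
i=0
find "$demo/logs" -type f > "$demo/files.list"
while IFS= read -r f; do
    i=$((i + 1))
    zgrep -i -f "$demo/ips.txt" "$f" > "$demo/results/$i.result" &
    running=$((running + 1))
    # Once a batch is full, let it drain before starting the next one.
    if [ "$running" -ge "$max_jobs" ]; then
        wait
        running=0
    fi
done < "$demo/files.list"
wait   # catch the final, possibly partial, batch

cat "$demo"/results/*.result > "$demo/ip.results.txt"
sort "$demo/ip.results.txt"
```

Batch-and-wait is simpler than tracking individual jobs, at the cost of idling while the slowest file in each batch finishes.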

Asked:	11 years, 8 months ago
Viewed:	3,726 times
Last active:	11 years, 8 months ago