How do I write this crawler in PHP?

Question

xRo*_*bot 2 php curl web-crawler html-parsing

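I need to create a PHP script.

The idea is simple:

When I send the link of a blog post to this PHP script, the web page is crawled, and the first image, along with the page title, is saved on my server.

Which PHP functions do I need to use for this crawler?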

Answer 1

Nav*_*eed 6

Use PHP Simple HTML DOM Parser:

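// Load the PHP Simple HTML DOM Parser library (adjust the path to your copy)
include 'simple_html_dom.php';

// Create a DOM object from the URL
$html = file_get_html('http://www.example.com/');

// Collect the src attribute of every <img> element
$images = array();
if ($html) {
    foreach ($html->find('img') as $element) {
        $images[] = $element->src;
    }
}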

The $images array now holds the image links from the given web page. You can then store the images you need in the database.
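Since the question asks for the page title and only the first image, here is a minimal sketch of that step using PHP's built-in DOMDocument and cURL extensions instead of the library above. The function names `extractTitleAndFirstImage` and `downloadTo` and the 15-second timeout are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: returns the page title and the src of the first <img> in $html.
function extractTitleAndFirstImage(string $html): array
{
    $result = ['title' => null, 'image' => null];
    if (trim($html) === '') {
        return $result; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is often malformed; keep libxml warnings out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titleNode = $doc->getElementsByTagName('title')->item(0);
    $imgNode   = $doc->getElementsByTagName('img')->item(0);

    if ($titleNode !== null) {
        $result['title'] = trim($titleNode->textContent);
    }
    if ($imgNode !== null && $imgNode->getAttribute('src') !== '') {
        $result['image'] = $imgNode->getAttribute('src');
    }
    return $result;
}

// Hypothetical helper: downloads $url with cURL and writes the response body to $path.
function downloadTo(string $url, string $path): bool
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow redirects
        CURLOPT_TIMEOUT        => 15,   // assumed timeout in seconds
    ]);
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false || $status !== 200) {
        return false;
    }
    return file_put_contents($path, $body) !== false;
}

// Possible usage (performs network requests, so it is left commented out):
// $page = file_get_contents('http://www.example.com/');
// $info = extractTitleAndFirstImage($page === false ? '' : $page);
// if ($info['image'] !== null) {
//     downloadTo($info['image'], __DIR__ . '/first-image');
// }
```

Note that the src value may be a relative URL, in which case it has to be resolved against the page's address before downloading, and the title should be sanitized if it is used as part of a file name.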
