Ram*_*min 4 php dom symfony guzzle domcrawler
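I'm fetching a URL with Guzzle's POST method. It works and returns the page I want. The problem is that when I try to read the value of an input element inside a form on that page, the crawler returns nothing. I don't know why.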
PHP:
<?php
use Symfony\Component\DomCrawler\Crawler;
use Guzzle\Http\Client;
$client = new Client();
$request = $client->get("https://example.com");
$response = $request->send();
$getRequest = $response->getBody();
$cookie = $response->getHeader("Set-Cookie");
$request = $client->post('https://example.com/page_example.php', array(
'Content-Type' => 'application/x-www-form-urlencoded',
'Cookie' => $cookie
), array(
'param1' => 5,
'param2' => 10,
'param3' => 20
));
$response = $request->send();
$pageHTML = $response->getBody();
//fetch orderID
$crawler = new Crawler($pageHTML);
$orderID = $crawler->filter("input[name=orderId]")->attr('value');//there is only one element with this name
echo $orderID; //returns nothing
What should I do?
You don't have to create the crawler yourself:
$crawler = $client->post('https://example.com/page_example.php', array(
'Content-Type' => 'application/x-www-form-urlencoded',
'Cookie' => $cookie
), array(
'param1' => 5,
'param2' => 10,
'param3' => 20
));
$orderID = $crawler->filter("input[name=orderId]")->attr('value');
This assumes your POST is not redirected. If it is, add the following before calling filter():
$this->assertTrue($client->getResponse()->isRedirect());
$crawler = $client->followRedirect();
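If the filter still comes back empty, it can help to rule out the markup itself. The snippet below is only a minimal sketch: it uses PHP's built-in DOMDocument and DOMXPath rather than DomCrawler, and both the `extractInputValue` helper and the sample HTML are made up for illustration. In the real script, `$html` would be the response body as a plain string (with the Guzzle 3 client from the question, something like `(string) $response->getBody()` should give you that, since `getBody()` returns a body object rather than a string by default).

```php
<?php
// Hypothetical helper: returns the value attribute of the first <input>
// with the given name, or null when no such element is found.
function extractInputValue(string $html, string $name): ?string
{
    // loadHTML() rejects an empty string, so bail out early.
    if ($html === '') {
        return null;
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally; real-world HTML is rarely valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Assumes $name is a simple field name without quote characters.
    $nodes = $xpath->query(sprintf('//input[@name="%s"]', $name));
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return $nodes->item(0)->getAttribute('value');
}

// Made-up markup standing in for the body of the POST response.
$html = '<html><body><form>'
      . '<input type="hidden" name="orderId" value="12345">'
      . '</form></body></html>';

$orderId = extractInputValue($html, 'orderId');
var_dump($orderId);
```

If this returns null for the real response body, the input is probably not in the returned HTML at all (for example because the server answered with a redirect or a login page), and no selector will find it.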