Related questions and solutions (0)

How do you parse and process HTML/XML in PHP?

How can one parse HTML/XML and extract information from it?

php xml parsing html-parsing xml-parsing

Rob*_*itt

2019 04-15

2071
Score

28
Answers

400k
Views

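Reliable way to scrape the title, description, and keywords

I am currently scraping websites with cURL. I want to reliably get the title, description, and keywords.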

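// Parse for the title, description and keywords
if (strlen($link_html) > 0)
{
    // get_meta_tags() reads the page at $link and returns <meta name="..."> values keyed by lowercased name
    $tags = get_meta_tags($link);
    $link_keywords = $tags['keywords'] ?? null;     // null when the tag is absent
    $link_description = $tags['description'] ?? null;
}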


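The only problem is that people now use all kinds of meta tags, such as Open Graph: <meta property="og:title" content="The Rock" />. They also vary the casing of tags a lot: <title> <Title> <TITLE> <tiTle>. That makes it very hard to get this information reliably.

I really need some code that extracts these values consistently. If a page supplies a title, keywords, and a description, the code should be able to find them, since these tags seem to be popular right now.

Perhaps there is a way to extract every title into a titles array? The developer scraping the site could then pick the best one to record in their database. The same would apply to keywords and descriptions.

This is not a duplicate. I have searched Stack Overflow and found no approach that collects all "title", "keywords", and "description" type tags into arrays.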
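The "array of candidates" idea above could be sketched roughly as follows, assuming the page HTML is already in a string (for example, the cURL response body). The function name collect_meta_candidates and the choice to treat og: and twitter: prefixed tags as alternatives to the plain ones are assumptions made for illustration, not something from the original post. DOMDocument may misread non-ASCII text unless the page declares its charset.

```php
<?php
// Hypothetical helper (not from the original post): gather every candidate
// title, description and keywords value found in an HTML string.
function collect_meta_candidates(string $html): array
{
    $found = ['title' => [], 'description' => [], 'keywords' => []];
    if (trim($html) === '') {
        return $found; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate real-world markup
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The HTML parser lowercases element names, so <TITLE> and <tiTle> match too.
    foreach ($doc->getElementsByTagName('title') as $node) {
        $found['title'][] = trim($node->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $meta) {
        // Plain meta tags use name="...", Open Graph uses property="...".
        $key = strtolower($meta->getAttribute('name') ?: $meta->getAttribute('property'));
        $content = trim($meta->getAttribute('content'));
        if ($content === '') {
            continue;
        }
        // Strip an "og:" or "twitter:" prefix before matching (an assumption
        // about which prefixed tags count as equivalents).
        $short = preg_replace('/^(og|twitter):/', '', $key);
        if (isset($found[$short])) {
            $found[$short][] = $content;
        }
    }

    // Drop duplicates while keeping document order.
    return array_map(fn($values) => array_values(array_unique($values)), $found);
}
```

Calling collect_meta_candidates($link_html) returns three arrays, so the caller can decide which candidate to store, for instance by preferring the first non-empty entry.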

php curl title

Amy*_*lle

2015 12-21

5
Score

1
Answers

958
Views

PHP get_meta_tags() is not working as I expected

I want to get the meta tags (specifically og:title, og:description, and og:image).

I am using the following code:

$tags = get_meta_tags('https://www.shoutmeloud.com/review-of-hostgator-webhosting-wordpress.html/');
echo "<pre>";
print_r($tags);


It returns the following array:

Array
(
    [viewport] => width=device-width, initial-scale=1
    [description] => Check out HostGator review for year 2017. This review if written after using Hostgator for 5 years. Is Hostgator Good hosting? Find out answer here
    [twitter:card] => summary_large_image
    [twitter:description] => Check out HostGator review for year 2017. This review if written after using Hostgator for 5 years. Is Hostgator Good hosting? Find out answer here
    [twitter:title] => A Blogger Review …

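A likely reason the og: entries are missing from that output is that get_meta_tags() only reads meta elements with a name attribute, while Open Graph tags use property instead. A minimal sketch of reading the property-based tags with DOMDocument and DOMXPath follows; the helper name get_property_meta_tags is hypothetical, and the HTML is assumed to have been fetched separately.

```php
<?php
// Hypothetical helper (not part of PHP): return <meta property="..."> values
// keyed by the property name, e.g. "og:title".
function get_property_meta_tags(string $html): array
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence warnings on messy HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $tags = [];
    $xpath = new DOMXPath($doc);
    // Select only meta elements that carry both attributes.
    foreach ($xpath->query('//meta[@property and @content]') as $meta) {
        $tags[$meta->getAttribute('property')] = $meta->getAttribute('content');
    }
    return $tags;
}

// Usage sketch: fetch the page yourself, then pass the HTML in.
// $html = file_get_contents($url);
// print_r(get_property_meta_tags($html));
```

The result can be merged with the array from get_meta_tags() if both the name-based and the property-based tags are needed in one place.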

php meta function get-meta-tags

Sas*_*234

2020 11-29

2
Score

1
Answers

4285
Views