Fetching Meta Property=og Tag Content from a Web Page with PHP

og stands for the Open Graph Protocol, a set of <meta> tags placed in the <head> of an HTML page:

The Open Graph Protocol enables any web page to become a rich object in a social graph.

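In other words, the protocol lets a web page become a "rich media object".
By adding Meta Property=og tags, you signal that the page content may be referenced by other social networking sites. The protocol is currently used by SNS sites such as Facebook and renren.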
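
To make the markup concrete, here is a minimal sketch that parses a made-up page held in a string, so no network access is needed. The sample HTML, its URLs, and the variable names are assumptions for illustration only; real pages will carry different properties.

```php
<?php
// Hypothetical sample page; titles and URLs are invented for illustration.
$html = <<<'PAGE'
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Example article</title>
<meta property="og:title" content="Example article">
<meta property="og:type" content="article">
<meta property="og:url" content="https://example.com/articles/1">
<meta property="og:image" content="https://example.com/images/1.jpg">
<meta name="description" content="Not an Open Graph tag">
</head>
<body><p>Hello</p></body>
</html>
PAGE;

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$og = array();
foreach ($doc->getElementsByTagName('meta') as $meta) {
    $property = $meta->getAttribute('property');
    // Keep only properties whose name starts with "og:".
    if (strpos($property, 'og:') === 0) {
        $og[$property] = $meta->getAttribute('content');
    }
}

print_r($og);
```

Only the four og: tags end up in $og; the plain description tag is skipped because it has no property attribute.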

PHP code for fetching the Meta Property=og tag content of a web page:

function fatch_og($url){
    // cURL is required to make the HTTP request
    if (!function_exists('curl_init')){
        die('CURL is not installed!');
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow 301/302 redirects to the final page
    curl_setopt($ch, CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13'); // spoof a browser user agent
    $output = curl_exec($ch);
    curl_close($ch);
    // curl_exec() returns false on failure; an empty body has nothing to parse
    if ($output === false || $output === '') {
        return array();
    }
    $doc = new DOMDocument();
    // suppress warnings about HTML5 markup that libxml does not recognize
    @$doc->loadHTML($output);
    $xpath = new DOMXPath($doc);
    // select every <meta> element whose property attribute starts with "og:"
    $query = '//*/meta[starts-with(@property, \'og:\')]';
    $metas = $xpath->query($query);
    $rmetas = array();
    foreach ($metas as $meta) {
        $property = $meta->getAttribute('property');
        $content = $meta->getAttribute('content');
        // a repeated property overwrites the earlier value
        $rmetas[$property] = $content;
    }
    return $rmetas;
}
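
The function above stores one value per property, so a page that repeats a tag (for example several og:image entries, which the Open Graph Protocol allows) keeps only the last one. The following is a minimal sketch of one way to keep every value. parse_og_multi is a hypothetical helper name, and the sample markup is made up; it works on an HTML string rather than a URL so it can be tried offline.

```php
<?php
// Hypothetical helper (not part of the original article): collects og:
// properties from an HTML string, keeping every value of a repeated property.
function parse_og_multi($html) {
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // silence parser warnings
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);        // restore the earlier setting

    $xpath = new DOMXPath($doc);
    $result = array();
    foreach ($xpath->query('//meta[starts-with(@property, "og:")]') as $meta) {
        // Append instead of overwrite, so each property maps to a list of values.
        $result[$meta->getAttribute('property')][] = $meta->getAttribute('content');
    }
    return $result;
}

// Made-up markup with two og:image tags.
$html = '<html><head>'
      . '<meta property="og:title" content="Gallery">'
      . '<meta property="og:image" content="https://example.com/a.jpg">'
      . '<meta property="og:image" content="https://example.com/b.jpg">'
      . '</head><body></body></html>';

$og = parse_og_multi($html);
echo count($og['og:image']), "\n"; // prints 2
```

The trade-off is that every property now maps to an array, even when it appears once, so callers read $og['og:title'][0] rather than $og['og:title'].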
Copyright notice: posted by kkvexl on September 8, 2021 at 3:25 PM.
When reposting, please credit: Fetching Meta Property=og Tag Content from a Web Page with PHP | WP之家
