I want to parse the Google News RSS feed with PHP to get the actual links to the content.
A Google News RSS item link looks like this:
http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGkF58EwDE7aA742GfVP9aE8azmhg&url=http://www.reuters.com/article/2012/01/15/us-obama-mlk-idUSTRE80E0PD20120115
I need just the actual link, i.e. everything after &url=:
http://www.reuters.com/article/2012/01/15/us-obama-mlk-idUSTRE80E0PD20120115
In other words, how would one go about stripping the "non-essential" part of the URL, i.e. everything starting with http://news.google.com and ending with &url= ?
http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGkF58EwDE7aA742GfVP9aE8azmhg&url=
I know a little regex, but this is out of my reach...
Thanks, fellas!
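Regex is not necessarily the best approach here. PHP's built-in parse_url() and parse_str() can pull the url parameter straight out of the query string: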
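// $google_url holds the item link taken from the feed
$query = parse_url($google_url, PHP_URL_QUERY); // just the query string
parse_str($query, $parts);                      // decode it into an array
$url = $parts['url'];                           // the target link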
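If you want something a bit more defensive, here is a minimal sketch of the same idea wrapped in a function. The name extract_target_url() is just something I made up for illustration, and the nullable type hints assume PHP 7.1 or later. It returns null when there is no query string or no url parameter instead of raising a notice.

```php
<?php
// Hypothetical helper (the name is an assumption, not a library function).
function extract_target_url(string $google_url): ?string
{
    // Grab only the query string, i.e. everything after "?".
    $query = parse_url($google_url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string, or the URL could not be parsed
    }

    // Decode "key=value&key=value" pairs into an associative array.
    // parse_str() also URL-decodes the values.
    parse_str($query, $parts);

    // Only accept a plain string value for "url".
    if (isset($parts['url']) && is_string($parts['url'])) {
        return $parts['url'];
    }
    return null;
}

$google_url = 'http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGkF58EwDE7aA742GfVP9aE8azmhg&url=http://www.reuters.com/article/2012/01/15/us-obama-mlk-idUSTRE80E0PD20120115';
echo extract_target_url($google_url), PHP_EOL;
```

One caveat: if the feed leaves the target link unencoded and that link has its own query string containing &, parse_str() will treat what follows the & as separate parameters, so the result gets cut off at the first &. If the target is percent-encoded, it comes back whole and already decoded.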
Here ya go:
$google_url = 'http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGkF58EwDE7aA742GfVP9aE8azmhg&url=http://www.reuters.com/article/2012/01/15/us-obama-mlk-idUSTRE80E0PD20120115';
// Capture everything after "url=" up to the next "&" (or the end of the string)
if (preg_match('/[?&]url=([^&]+)/', $google_url, $matches)) {
    $url = $matches[1];
    echo $url;
}
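Since the question literally asks for "everything after &url=", a plain string search may be enough. This is only a sketch, and it assumes &url= is always the last parameter Google adds, which is true of the sample link but not something I can guarantee for every feed item. The function name after_url_marker() is made up for the example.

```php
<?php
// Hypothetical helper (the name is an assumption, not a library function).
// Assumes "&url=" is the last parameter in the Google link.
function after_url_marker(string $google_url): ?string
{
    $marker = '&url=';

    // Find the first occurrence of the marker.
    $pos = strpos($google_url, $marker);
    if ($pos === false) {
        return null; // marker not present
    }

    // Return the rest of the string, untouched (no URL-decoding is done).
    return substr($google_url, $pos + strlen($marker));
}

$google_url = 'http://news.google.com/news/url?sa=t&fd=R&usg=AFQjCNGkF58EwDE7aA742GfVP9aE8azmhg&url=http://www.reuters.com/article/2012/01/15/us-obama-mlk-idUSTRE80E0PD20120115';
echo after_url_marker($google_url), PHP_EOL;
```

Unlike the [^&]+ pattern, this keeps any & that appears inside the target link. It does not decode anything, so if the link turns out to be percent-encoded you would still need to run the result through urldecode().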