Get all links from an XML sitemap and put them into an array?

I have a sitemap with many URLs. Something like:

<url>
<loc>
http://site.com/
</loc>
<priority>
0.50
</priority>
<changefreq>
daily
</changefreq>
<lastmod>
2011-07-27T06:58:53+00:00
</lastmod>
</url>
<url>
<loc>
http://site.com/link

etc etc....

I need to get all the links in the sitemap, nothing else.

I've tried:

$links = file('sitemap.xml', FILE_IGNORE_NEW_LINES);

foreach($links as $link) {
    echo $link;
}

That echoes all the links and leaves out the <loc>, <priority>, etc. tags, but it still includes the change frequency, lastmod, and so on.

So the output looks like this:

http://site.com/ 11 0.50 12 daily 13 2011-07-27T06:58:53+00:00 14  15  16 http://site.com/page.html 17 0.40 18 daily 19 2011-07-

and so on....

I just need to get the links and put them into an array. Any ideas?

Thank you.

EDIT:

Here is the code I'm using:

$urls = array();  
$xml='sitemap.xml';
$DomDocument = new DOMDocument();
$DomDocument->preserveWhiteSpace = false;
$DomDocument->loadXML("$xml"); // $DOMDocument->load('filename.xml');
$DomNodeList = $DomDocument->getElementsByTagName('from');

foreach($DomNodeList as $url) {
    $urls[] = $url->nodeValue;
}

//display it
echo "<pre>";
print_r($urls);
echo "</pre>";

Which returns the error: Warning: DOMDocument::loadXML() [domdocument.loadxml]: Start tag expected, '<' not found in Entity, line: 1

So I tried to test whether it can even load the XML: I changed the XML file name to an invalid one ($xml='sit___emap.xml';)

I should have got an error saying it couldn't open the file, but instead it came up with the same error as before, when the correct filename was set. So I don't think it's the sitemap.

I couldn't get @AndreyKnupp's example to work. Here's what works for me:

$urls = array();  

$DomDocument = new DOMDocument();
$DomDocument->preserveWhiteSpace = false;
$DomDocument->load('filename.xml');
$DomNodeList = $DomDocument->getElementsByTagName('loc');

foreach($DomNodeList as $url) {
    $urls[] = $url->nodeValue;
}

//display it
echo "<pre>";
print_r($urls);
echo "</pre>";
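
The difference from the code in the question appears to be `load()` versus `loadXML()`: `loadXML()` parses its argument as XML markup, so passing the string 'sitemap.xml' makes it try to parse the filename itself, which would explain the "Start tag expected" warning whether or not the file exists. A minimal sketch of the two calls, assuming a throwaway temp file and made-up URLs in place of the real sitemap:

```php
<?php
// Hypothetical sample data; the temp file stands in for sitemap.xml.
$xmlString = '<urlset><url><loc>http://site.com/</loc></url>'
           . '<url><loc>http://site.com/link</loc></url></urlset>';
$path = tempnam(sys_get_temp_dir(), 'sitemap');
file_put_contents($path, $xmlString);

// load() takes a path (or URL) and reads the file.
$fromFile = new DOMDocument();
$fromFile->load($path);

// loadXML() takes the XML markup itself.
$fromString = new DOMDocument();
$fromString->loadXML($xmlString);

// Passing a filename to loadXML() fails: the text "sitemap.xml" is not XML.
// libxml_use_internal_errors(true) collects the parse error instead of
// emitting a warning.
libxml_use_internal_errors(true);
$wrong = new DOMDocument();
$loadedFromName = $wrong->loadXML('sitemap.xml'); // false
libxml_clear_errors();

// Collect the <loc> values from the document that was read from disk.
$urlsFromFile = array();
foreach ($fromFile->getElementsByTagName('loc') as $loc) {
    $urlsFromFile[] = $loc->nodeValue;
}
unlink($path);
```

If the sitemap is already in a string (for example, fetched with file_get_contents()), `loadXML()` is the right call; if you only have a path, use `load()`.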

Use any XML parser: DOMDocument, SimpleXML, or xml_parse.

You can do this:

<?php
$urls = array();  

$DOMDocument = new DOMDocument();
$DOMDocument->preserveWhiteSpace = false;
$DOMDocument->loadXML($xml); // $DOMDocument->load('filename.xml');
$XPath = new DOMXPath($DOMDocument); // you can use getElementsByTagName

foreach($XPath->query('//url/loc') as $url) {
    // $urls[$url->nodeName] = $url->nodeValue;
    $urls[] = $url->nodeValue;
}

print_r($urls);

The output looks like:

Array
(
     [0] => http://site.com/
)
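
One caveat worth noting: sitemaps that follow the sitemaps.org protocol declare a default namespace on `<urlset>`, and in that case an unprefixed XPath such as `//url/loc` matches nothing. A sketch of a workaround, where the `sm` prefix and the sample XML are assumptions chosen for illustration:

```php
<?php
// Hypothetical sitemap with the standard sitemaps.org default namespace.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
     . '<url><loc>http://site.com/</loc></url>'
     . '<url><loc>http://site.com/link</loc></url>'
     . '</urlset>';

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
// Bind an arbitrary prefix ("sm") to the sitemap namespace URI.
$xpath->registerNamespace('sm', 'http://www.sitemaps.org/schemas/sitemap/0.9');

// Unprefixed names only match elements that are in no namespace.
$plain = $xpath->query('//url/loc');

$urls = array();
foreach ($xpath->query('//sm:url/sm:loc') as $loc) {
    // trim() drops the line breaks around URLs laid out as in the question.
    $urls[] = trim($loc->nodeValue);
}
print_r($urls);
```

If the sitemap has no xmlns declaration, as in the snippet in the question, the plain `//url/loc` query is enough.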

You could also use SimpleXML:

$xml=simplexml_load_file($file);
$links=$xml->xpath('//url/loc');
print_r($links);

Edit: you may need to use strval() when you use these array elements, as each one is still a SimpleXMLElement object rather than a string.
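
A short sketch of that conversion, assuming an inline sample sitemap in place of the real file (the URLs are made up for illustration):

```php
<?php
// Hypothetical inline sitemap, used instead of simplexml_load_file($file).
$xml = simplexml_load_string(
    '<urlset><url><loc>http://site.com/</loc></url>'
    . '<url><loc>http://site.com/link</loc></url></urlset>'
);

$nodes = $xml->xpath('//url/loc');     // array of SimpleXMLElement objects
$links = array_map('strval', $nodes);  // array of plain strings

var_dump($nodes[0] instanceof SimpleXMLElement); // bool(true)
var_dump(is_string($links[0]));                  // bool(true)
```

Casting up front avoids surprises later, for instance with strict comparisons or when the values are used as array keys.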

The easiest way is:

// $url holds the sitemap address
$strXml = @file_get_contents($url);
if (false == $strXml) {
    die('Could not open url. Check your spelling and try again');
}
$txt = "";
// So simple using SimpleXml
$sitemap = @new SimpleXMLElement($strXml);
// $entry (not $url) so the loop does not overwrite the sitemap address
foreach ($sitemap->url as $entry) {
    $txt .= $entry->loc . "\n";
}


I compared the execution time of Levi Morrison's method (DOMDocument) against taoufiqaitali's method (SimpleXML). The results were so amazing that I must share them with you. My sitemap.xml had 11140 links in it (the sitemap of my web gallery).

Method 1 - DOMDocument

$start = microtime(true); // define a variable for checking execution time
$urls = array();  
$DomDocument = new DOMDocument();
$DomDocument->preserveWhiteSpace = false;
$DomDocument->load('sitemap.xml');
$DomNodeList = $DomDocument->getElementsByTagName('loc');
foreach($DomNodeList as $url) {
    $urls[] = $url->nodeValue;
}
echo "<pre>";
print_r($urls);
echo "</pre>";
$time_elapsed_secs = microtime(true) - $start;
echo $time_elapsed_secs . " seconds of execution time"; // show the execution time in seconds

This showed an execution time of 50.7 seconds.

Method 2 - SimpleXML

$start = microtime(true); // define a variable for checking execution time
$urls = array();
$strXml = @file_get_contents('sitemap.xml');
$sitemap = @new SimpleXmlElement($strXml);
foreach($sitemap->url as $url) {
    $urls[] = strval($url->loc);
}
echo "<pre>";
print_r($urls);
echo "</pre>";
$time_elapsed_secs = microtime(true) - $start;
echo $time_elapsed_secs . " seconds of execution time"; // show the execution time in seconds

This showed an execution time of 0.129 seconds.

That is a HUGE difference. The SimpleXML method is almost 400 times faster.