I have this PHP code:
$main_url = "http://www.sports-reference.com/olympics/countries/DEN/summer/1896/";
$main_html=file_get_html($main_url);
$link = $main_html->getElementById('div_sports');
foreach ($link->find('td') as $element) {
    foreach ($element->find('href') as $node) {
        echo $node->item(0)->nodeValue . "\n";
        //$link_clean = $node->getAttribute('href');
        echo $link_clean . "\n";
    }
}
If I print out $element, I get this output:
<td align="left" ><a href="/olympics/countries/DEN/summer/1896/ATH/">Athletics</a></td>
<td align="left" ><a href="/olympics/countries/DEN/summer/1896/FEN/">Fencing</a></td>
<td align="left" ><a href="/olympics/countries/DEN/summer/1896/GYM/">Gymnastics</a></td>
<td align="left" ><a href="/olympics/countries/DEN/summer/1896/SHO/">Shooting</a></td>
<td align="left" ><a href="/olympics/countries/DEN/summer/1896/WLT/">Weightlifting</a></td>
I need to extract this info:
/olympics/countries/DEN/summer/1896/ATH/ /olympics/countries/DEN/summer/1896/FEN/ ..........
and so on. The code above is not working. Can you help me?
href is not a tag; it is an attribute of a tag. So you have to search for the <a> elements and read their href attribute:
foreach ($element->find('a') as $a)
{
    echo $a->href . "\n";
    (...)
}
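If Simple HTML DOM is not available, the same idea can be sketched with PHP's built-in DOM extension instead. This is a different library from the one used in the question, and it assumes the markup looks like the output shown there; the helper name extractHrefs is made up for illustration.

```php
<?php
// Hypothetical helper (not part of any library): collect the href
// attribute of every <a> element found inside a <td>.
function extractHrefs(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them,
    // since real-world HTML is rarely valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $hrefs = [];
    foreach ($doc->getElementsByTagName('td') as $td) {
        foreach ($td->getElementsByTagName('a') as $a) {
            // href is an attribute of the <a> element, not an element itself.
            $hrefs[] = $a->getAttribute('href');
        }
    }
    return $hrefs;
}

// Sample markup copied from the output in the question.
$html = '<table><tr>'
    . '<td align="left" ><a href="/olympics/countries/DEN/summer/1896/ATH/">Athletics</a></td>'
    . '<td align="left" ><a href="/olympics/countries/DEN/summer/1896/FEN/">Fencing</a></td>'
    . '</tr></table>';

foreach (extractHrefs($html) as $href) {
    echo $href . "\n";
}
```

To run it against the live page, the HTML string would have to be fetched first, for example with file_get_contents($main_url), assuming URL wrappers are enabled on the server.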