I have code like this:
<a href='www.link_not_required.com'>
<a href='www.link_not_required.com'>
<a href='www.link_1.com'><img src='image_1.png'></a>
<a href='www.link_2.com'><img src='image_2.png'></a>
<a href='www.link_3.com'><img src='image_3.png'></a>
<a href='www.link_4.com'><img src='image_4.png'></a>
<img src='image_not_required.png'>
<img src='image_not_required.png'>
I want to extract the hrefs of only those anchors that contain images, together with the src of those images. I don't want the links of anchors that do not contain images, or the srcs of images that are not inside anchors.
How do I do this? Can it be done using the Simplehtmldom library?
I'm not sure why you would want to access the contents of an HTML page using PHP, which is a server-side language. You could easily do this using JavaScript or jQuery.
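However, let's say you read the contents of the HTML file/URL using some method (for example file_get_contents, cURL, or readfile) and wish to use the SimpleHTMLDom library. You could do the following:
1. Find all img tags.
2. For each img tag, get its parent element.
Step #1 will give you all img tags, while step #2 will give you the corresponding parent anchor tags. An image that is not wrapped in an anchor has some other element as its parent, so check the parent's tag name before using it. From there you should be able to extract the required attributes (href from the anchor, src from the image).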
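As a rough illustration of step #1 and step #2, here is a minimal sketch. It assumes simple_html_dom.php sits next to the script; the function name extract_image_links is made up for this example and is not part of the library. It only matches images whose direct parent is an anchor.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script.
require_once 'simple_html_dom.php';

// Hypothetical helper (not part of SimpleHTMLDom): returns href/src pairs
// for every image whose direct parent is an anchor.
function extract_image_links(string $markup): array
{
    $pairs = [];
    $html = str_get_html($markup);
    if (!$html) {
        // str_get_html() returns false for empty or oversized input
        return $pairs;
    }

    // Step 1: collect all img tags
    foreach ($html->find('img') as $img) {
        // Step 2: look at the parent of each image
        $parent = $img->parent();
        // Skip images that are not wrapped directly in an anchor
        if ($parent !== null && $parent->tag === 'a') {
            $pairs[] = ['href' => $parent->href, 'src' => $img->src];
        }
    }

    return $pairs;
}
```

You would call it with whatever markup you fetched, e.g. print_r(extract_image_links(file_get_contents('page.html'))); where page.html is just a placeholder file name.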
All of this is covered at http://simplehtmldom.sourceforge.net/manual.htm and I don't think Googling or reading through the manual is that difficult.
It looks something like this:
require_once('simple_html_dom.php');
$str = <<<EOF
<a href='www.link_not_required.com'>
<a href='www.link_not_required.com'>
<a href='www.link_1.com'><img src='image_1.png'></a>
<a href='www.link_2.com'><img src='image_2.png'></a>
<a href='www.link_3.com'><img src='image_3.png'></a>
<a href='www.link_4.com'><img src='image_4.png'></a>
<img src='image_not_required.png'>
<img src='image_not_required.png'>
EOF;
$html = str_get_html($str);
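foreach ($html->find('a') as $a) {
    // find('img', 0) returns null when the anchor contains no image
    $img = $a->find('img', 0);
    if ($img === null) {
        continue; // skip anchors without an image
    }
    echo $a->href . ':' . $img->src . "\n";
}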
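Note that the first two a tags in the sample are never closed, so the parser may treat the later anchors and images as nested inside them, and the results will be mangled.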
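For comparison, if SimpleHTMLDom is not a hard requirement, the same selection can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath). This is a different technique from the one above, not a SimpleHTMLDom feature, and the function name image_links_xpath is invented for this example. The XPath expression only matches images that are direct children of an anchor, and unclosed tags in the input can still change how the tree is built.

```php
<?php
// Hypothetical helper using PHP's built-in DOM extension instead of SimpleHTMLDom.
function image_links_xpath(string $markup): array
{
    $pairs = [];
    if ($markup === '') {
        return $pairs; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect parser warnings about sloppy HTML instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Every img with a src that is a direct child of an a with an href
    foreach ($xpath->query('//a[@href]/img[@src]') as $img) {
        $pairs[] = [
            'href' => $img->parentNode->getAttribute('href'),
            'src'  => $img->getAttribute('src'),
        ];
    }

    return $pairs;
}
```

Anchors without images and images outside anchors never match the expression, so no extra filtering is needed in the loop.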