I'm trying to fetch the source code of an external website so that I can load it and work with it. I need to work with the content of some divs, identified by a class or a specific name.
First I fetch the source code like this:
$url='http://www.example.com/site.html';
$page = file_get_contents($url);
Now I have to search $page for some divs, for example a div with name="test1" or class="test2". I also have to look for some other elements with specific names or classes.
I could use str_replace, explode, etc. to build a long and inefficient way of doing that. Maybe someone can tell me how to do this in a simpler and faster way? Maybe I can load the source code into some kind of array or something else?
Thanks a lot.
For me only file_get_contents works; file_get_html won't work!?
A very quick, basic example of how you might use DOMDocument and DOMXPath to find elements within a page. You will want to read the manual, I suspect, for DOMDocument and DOMXPath, and probably find a good XPath cheatsheet.
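$url='http://www.example.com/site.html';
// Collect libxml warnings instead of printing them; real-world HTML is rarely well-formed
libxml_use_internal_errors( true );
$dom=new DOMDocument;
$dom->loadHTMLFile( $url );
libxml_clear_errors();
// Query the parsed document with XPath
$xp=new DOMXPath( $dom );
// Matches every div whose class attribute contains the substring "test"
$query='//div[ contains( @class,"test" ) ]';
$col=$xp->query( $query );
// query() returns false on an invalid expression, so check before iterating
if( $col && $col->length>0 ){
    foreach( $col as $node ) echo $node->nodeValue;
}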
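If you already have the markup as a string from file_get_contents, as in the question, the same approach should work with loadHTML instead of loadHTMLFile. Below is a minimal sketch of that, targeting the exact name="test1" and class="test2" cases from the question. The sample markup and the helper name findText are made up for illustration, so treat this as a starting point rather than a finished solution.

```php
<?php
// Stand-in for the string returned by file_get_contents($url).
// This sample markup is hypothetical, used only so the sketch runs offline.
$page = <<<HTML
<html><body>
  <div name="test1">first</div>
  <div class="box test2">second</div>
  <div class="test22">not a match</div>
  <span class="test2">third</span>
</body></html>
HTML;

// Collect libxml parse warnings instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($page);
libxml_clear_errors();

$xp = new DOMXPath($dom);

// Hypothetical helper: returns the trimmed text of every node matching $query.
function findText(DOMXPath $xp, string $query): array
{
    $out = [];
    $nodes = $xp->query($query);
    if ($nodes === false) {
        return $out; // invalid XPath expression
    }
    foreach ($nodes as $node) {
        $out[] = trim($node->textContent);
    }
    return $out;
}

// Divs whose name attribute is exactly "test1".
$byName = findText($xp, '//div[@name="test1"]');

// Any element that has "test2" as a whole class token.
// Padding the class list with spaces avoids matching "test22".
$byClass = findText(
    $xp,
    '//*[contains(concat(" ", normalize-space(@class), " "), " test2 ")]'
);

print_r($byName);
print_r($byClass);
```

With the sample markup above, $byName should hold "first" and $byClass should hold "second" and "third". The plain contains( @class,"test" ) query is shorter, but it also matches classes such as "test22" or "contest", which is why the class lookup here pads the attribute with spaces before comparing.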