I have a script using simple_html_dom.
foreach ($urls as $url)
{
    $html = file_get_html($url);
    if ($html->innertext != '') {
        foreach ($html->find('.doc div[style="padding-top:1px;border-bottom:1px solid #eaeaec;padding-bottom:6px;"]') as $b) {
            $b->style = '""';
            echo $b;
        }
    }
    $html->clear();
    unset($html);
}
When I run this script, I get the following error:
Fatal error: Allowed memory size of 444596224 bytes exhausted (tried to allocate 1272 bytes).
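If you still want to parse this file, you can raise memory_limit in your php.ini or set it at runtime with the code below. The limit in your error message is 444596224 bytes (424 MB), so the new value must be higher than that:
ini_set('memory_limit', '512M');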
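A minimal sketch of raising the limit only when the current one is lower, so the call never reduces a limit that is already higher. The names shorthandToBytes and raiseMemoryLimit are hypothetical helpers, and 512M is an assumed target value:

```php
<?php
// Hypothetical helper: convert a php.ini shorthand value such as "128M" to bytes.
function shorthandToBytes($value)
{
    $value = trim($value);
    if ($value === '-1') {
        return -1; // -1 means "no limit"
    }
    $number = (int) $value;                    // leading digits, e.g. 128 for "128M"
    $unit = strtoupper(substr($value, -1));    // last character, e.g. "M"
    switch ($unit) {
        case 'G':
            return $number * 1024 * 1024 * 1024;
        case 'M':
            return $number * 1024 * 1024;
        case 'K':
            return $number * 1024;
        default:
            return $number;                    // plain byte count
    }
}

// Hypothetical helper: raise memory_limit to $wanted only if the current limit is lower.
function raiseMemoryLimit($wanted)
{
    $current = shorthandToBytes(ini_get('memory_limit'));
    if ($current !== -1 && $current < shorthandToBytes($wanted)) {
        ini_set('memory_limit', $wanted);
    }
    return ini_get('memory_limit');
}

echo raiseMemoryLimit('512M'), "\n";
```

Raising the limit only postpones the failure if memory keeps growing on every iteration, so it is worth combining with the cleanup described in the other answers.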
Or optimize your code by releasing each matched element as soon as you are done with it:
$finder = $html->find('.doc div[style="..."]');
foreach ($finder as $index => $b) {
    /* do something here */
    $finder[$index]->clear();
}
$html->clear();
Alternatively, you may be able to use a regular expression to remove or extract what you need.
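A rough sketch of the regex route, assuming the target divs carry exactly the style attribute from the question as their only attribute and contain no nested divs. The sample $html string is made up for illustration; a real page would come from file_get_contents($url):

```php
<?php
// Made-up sample markup standing in for a downloaded page.
$style = 'padding-top:1px;border-bottom:1px solid #eaeaec;padding-bottom:6px;';
$html = '<div class="doc">'
      . '<div style="' . $style . '">First</div>'
      . '<div style="' . $style . '">Second</div>'
      . '</div>';

// Match each div with that exact style and capture its inner content.
// The non-greedy (.*?) stops at the first </div>, so nested divs would break it.
$pattern = '~<div style="' . preg_quote($style, '~') . '">(.*?)</div>~s';
preg_match_all($pattern, $html, $matches);

// Re-emit the matches without the style attribute.
foreach ($matches[1] as $inner) {
    echo '<div>', $inner, "</div>\n";
}
```

This avoids building a DOM tree at all, at the cost of being fragile: any change in attribute order, quoting, or nesting on the page will make the pattern miss.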
To fix the problem, replace this function:
function clear() {
    $this->dom = null;
    $this->nodes = null;
    $this->parent = null;
    $this->children = null;
}
with this:
function clear() {
    unset($this->dom);
    unset($this->nodes);
    unset($this->parent);
    unset($this->children);
}
Q: This script is leaking memory seriously... After it finished running, it's not cleaning up dom object properly from memory..
A: Due to the PHP 5 circular references memory leak, after creating a DOM object you must call $dom->clear() to free memory if you call file_get_dom() more than once.
Example:
$html = file_get_html(...);
// do something...
$html->clear();
unset($html);
Taken from their FAQ found here: http://simplehtmldom.sourceforge.net/manual_faq.htm
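The circular-reference problem the FAQ describes can be reproduced without the library. The sketch below is an illustration only: DemoNode is a hypothetical stand-in, not simple_html_dom's class, and the cycle collector (added in PHP 5.3) is switched off with gc_disable() to mimic the older behaviour the FAQ refers to:

```php
<?php
// Hypothetical stand-in for a parsed DOM node; not simple_html_dom's real class.
class DemoNode
{
    public $parent = null;
    public $children = array();
    public $payload;

    public function __construct($parent = null)
    {
        $this->parent = $parent;
        $this->payload = str_repeat('x', 10000); // make each node noticeably large
        if ($parent !== null) {
            $parent->children[] = $this;         // parent <-> child reference cycle
        }
    }

    // Break the cycles so plain reference counting can free the nodes.
    public function clear()
    {
        foreach ($this->children as $child) {
            $child->clear();
        }
        $this->parent = null;
        $this->children = array();
    }
}

function buildTree()
{
    $root = new DemoNode();
    for ($i = 0; $i < 50; $i++) {
        new DemoNode($root); // kept alive through $root->children
    }
    return $root;
}

// Build and drop 20 trees with the cycle collector off; return memory growth in bytes.
function measureGrowth($callClear)
{
    gc_collect_cycles();
    gc_disable();
    $before = memory_get_usage();
    for ($i = 0; $i < 20; $i++) {
        $tree = buildTree();
        if ($callClear) {
            $tree->clear();
        }
        unset($tree);
    }
    $growth = memory_get_usage() - $before;
    gc_enable();
    gc_collect_cycles();
    return $growth;
}

$leaked = measureGrowth(false);
$cleaned = measureGrowth(true);
echo "without clear(): {$leaked} bytes\n";
echo "with clear():    {$cleaned} bytes\n";
```

Without clear(), unset() only drops one reference to the root while the children still point back at it, so nothing is freed. With clear(), the cycles are broken first and each tree is released as soon as the variable goes away.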