I am developing a PHP application that uses cURL to fetch another web page and extract all of its sentences. Although I am able to extract all the data, I am having difficulty extracting fully formed sentences. I have gone through the related questions, but none of them helped in quite the way I needed. Please advise.
This is the HTML content from which I need to extract a fully formed sentence:
<p><font size="1" color="#C0C0C0">© Copyright <br></font><a href="http://www.dddddd.com" target="_blank"><font size="1" color="#C0C0C0">apple orange Ltd</font></a><font size="1"color="#C0C0C0"><a href="http://sm2.dddd.com/stats.asp?site=sm2ph0t0" target="_top"><img src="http://sm2.dddd.com/meter.asp?site=sm2ph0t0" alt="Site Meter" border=0></a></font></p></td><td valign="top" width="24"></td><!--msnavigation--><td valign="top"><p align="center"><a href="http://www.orangeapple.com" target="_blank"><img border="0" src="asddaf.jpg" alt="Sponsored by Ace Murder Mystery" width="85" height="121"></a><font face="Times New Roman"><b><b><u>Posters</u></b><br><font size="3" color="#008080">To find a large selection of jay joes prints and posters including framing options, please type the word..<font color="#996633"> asdasd </font></a><font color="#996633"> </font> in the box below:<br><b>
As you can see, a lot of irrelevant text might come out. I want to extract only sentences that contain a minimum of 6 words. From the HTML above, I should get "To find a large selection of jay joes prints and posters" as output.
Thanks, Jay
I got this resolved using the following:
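// $doc is the DOMDocument holding the fetched page.
$paras = $doc->getElementsByTagName('p');
for ($l = 0; $l < $paras->length; $l++)
{
    $para = $paras->item($l);
    $paraContent = $para->textContent;
    // trim_text() is my own helper (not shown here).
    $urlDet['para'] .= trim_text($paraContent, 1000);
}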
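For anyone who also needs the six-word minimum from the question, here is a minimal sketch of how the loop above could be extended. It assumes PHP 7 or later with the DOM extension enabled. extract_sentences() and its $minWords parameter are hypothetical names, not part of my original code, and splitting on ., ! or ? is only a rough heuristic for sentence boundaries.

```php
<?php
// Hypothetical helper (illustrative only): returns the sentences found inside
// <p> elements that contain at least $minWords words.
function extract_sentences(string $html, int $minWords = 6): array
{
    // Nothing to parse; also avoids passing an empty string to loadHTML().
    if (trim($html) === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real-world pages are often malformed, so collect parser warnings
    // internally instead of printing them, then restore the old setting.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sentences = [];
    foreach ($doc->getElementsByTagName('p') as $para) {
        // Collapse runs of whitespace so the word count is stable.
        $text = trim((string) preg_replace('/\s+/u', ' ', $para->textContent));

        // Assumption: a sentence ends at ., ! or ? followed by whitespace.
        $candidates = preg_split('/(?<=[.!?])\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
        foreach ($candidates as $candidate) {
            $words = preg_split('/\s+/u', $candidate, -1, PREG_SPLIT_NO_EMPTY);
            if (count($words) >= $minWords) {
                $sentences[] = $candidate;
            }
        }
    }
    return $sentences;
}
```

It could be called on the string returned by cURL, for example `$sentences = extract_sentences($html);`, and the threshold can be changed through the second argument. Text that sits outside any <p> element is ignored, just as in the loop above.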
Thanks to whoever tried to answer...