I wrote a PHP function that sends a POST request with cURL and returns the response as a string. However, the response contains a lot of useless information. I want to manipulate the string so that only the useful part remains, that is, delete everything that comes before the closing </SCRIPT> tag in the string.
Below is a piece of the string I get (there is a lot more information before it). I want my string to contain only the part between <dt><b><font color="maroon">como</font></b> and </table>.
<SCRIPT LANGUAGE='JavaScript1.1'><!--
var objForm = document.theform;
var index = 0;
objForm.text.value="como n\u00e3o amar uma pessoa t\u00e3o linda";
objForm.text.focus();
checkIt(objForm.parser, 'parse');
checkIt(objForm.visual, 'niceline');
function getIndex(elemID, testValue){
for(i=0; i<elemID.length; i++){
if (elemID[i].value == testValue)
return i;
}
return 0;
}
function checkIt(element, value) {
if (element.length==1 || element.type=="checkbox"){
element.checked=1;
element.selected=1;
}
else if (element.length>1){
index = getIndex(element, value);
element[index].selected=1;
element[index].checked=1;
}
}
//-->
</SCRIPT>
<dl>
**<dt><b><font color="maroon">como</font></b>**
<font color="maroon">[como]</font> <rel> <ks> <font color="blue"><b>ADV</b> </font> <font color="darkgreen">@ADVL></font> <font color="darkgreen"><b>@#FS-ADVL</font></b> <font color="darkgreen"><b>@#FS-N<</font></b>
<dt><b><font color="maroon">não</font></b>
<font color="maroon">[não]</font> <font color="blue"><b>ADV</b> </font> <font color="darkgreen">@ADVL></font>
<dt><b><font color="maroon">amar</font></b>
<font color="maroon">[amar]</font> <vt> <font color="blue"><b>V</b> FUT 1/3S SUBJ VFIN </font> <font color="darkgreen">@FMV</font>
<dt><b><font color="maroon">uma</font></b>
<font color="maroon">[um]</font> <arti> <font color="blue"><b>DET</b> F S </font> <font color="darkgreen">@>N</font>
<dt><b><font color="maroon">pessoa</font></b>
<font color="maroon">[pessoa]</font> <H> <font color="blue"><b>N</b> F S </font> <font color="darkgreen">@<ACC</font>
<dt><b><font color="maroon">tão</font></b>
<font color="maroon">[tão]</font> <dem> <quant> <font color="blue"><b>ADV</b> </font> <font color="darkgreen">@>A</font>
<dt><b><font color="maroon">linda</font></b>
<font color="maroon">[lindo]</font> <font color="blue"><b>ADJ</b> F S </font> <font color="darkgreen">@N<</font>
<dt><b><font color="maroon">.</font></b>
</dl>
</td>
<td>
</td>
</tr>
</table>
<br>
<div style="text-align: center;">
<!--
<hr width="60%">
<b style="color: #800; font-size: 75%;">With the most recent Java update, Oracle has decided to set the default Java security settings to block all unsigned applets.
<br/>Until we can fix this on our end by signing the applets, you can lower your security settings from Control Panel -> Java -> Security and set the slider to Medium instead of High.
<br/>If something else isn't working properly, contact <a href="mailto:mail@tinodidriksen.com">Tino Didriksen</a>.</b>
<br/>
-->
<hr width="60%">
<a href="/visl/about/">Copyright 1996-2015</a>
| <a href="/contact.html">Report a Problem / Contact Us</a>
<!-- | <a href="http://beta.visl.sdu.dk/visl/about/spgskema_en.html" title="Please fill out our survey on how you use the VISL site!">Visitor Questionnaire</a> -->
| <a href="/visl/pt/parsing/automatic/parse.php?print=1" rel="nofollow">Printable Version</a>
<!--[if IE]>
<br><br>
Which function can I use to manipulate this string and delete this useless information?
You could use strpos to find the position where you want to cut the string, then use substr to extract the part of the string starting at that position.
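The strpos/substr idea can be sketched roughly as follows. This is only an illustration under assumptions: the helper name extractBetween and the shortened $content string are hypothetical stand-ins, not part of the original post, and the real cURL response would take the place of $content.

```php
<?php
// Hypothetical helper (not from the original post): returns the part of
// $content that starts at $startMarker and ends just before $endMarker,
// or null if either marker is missing.
function extractBetween(string $content, string $startMarker, string $endMarker): ?string
{
    $start = strpos($content, $startMarker);
    if ($start === false) {
        return null; // start marker not found
    }

    // Look for the end marker only after the start marker.
    $end = strpos($content, $endMarker, $start + strlen($startMarker));
    if ($end === false) {
        return null; // end marker not found
    }

    // Keep everything from the start marker up to (not including) the end marker.
    return substr($content, $start, $end - $start);
}

// Shortened stand-in for the string returned by cURL (assumption).
$content = "junk</SCRIPT>\n<dl>\n<dt><b><font color=\"maroon\">como</font></b>\n...\n</dl>\n</table>\n<br>";

$result = extractBetween($content, '<dt><b><font color="maroon">como</font></b>', '</table>');
echo $result;
```

Comparing with === false matters here, because strpos can legitimately return 0 when the marker sits at the very beginning of the string.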
Suppose you put the data from above in $content, then use:
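// Pass '/' as the delimiter so the slash in '</table>' is escaped as well.
$from = preg_quote('<dt><b><font color="maroon">como</font></b>', '/');
$to = preg_quote('</table>', '/');
// The s modifier lets . match newlines, since the wanted text spans several lines.
preg_match('/'.$from.'(.*?)'.$to.'/s', $content, $matches);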
Then $matches[1] will contain the text between the two markers (the markers themselves are not included), provided preg_match found a match.
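For reference, here is a small self-contained sketch of the regular-expression approach that also checks whether a match was found. The shortened $content string is an assumed stand-in for the real response. Two details are worth noting: preg_quote only escapes the / delimiter when it is passed as the second argument, and the s modifier is needed so that . also matches line breaks.

```php
<?php
// Shortened stand-in for the string returned by cURL (assumption).
$content = "x</SCRIPT>\n<dl>\n<dt><b><font color=\"maroon\">como</font></b>\n"
         . "<font color=\"maroon\">[como]</font>\n</dl>\n</table>\n<br>";

// Escape both markers for use inside a /.../ pattern.
$from = preg_quote('<dt><b><font color="maroon">como</font></b>', '/');
$to   = preg_quote('</table>', '/');

// preg_match returns 1 on a match, 0 on no match, false on error.
if (preg_match('/' . $from . '(.*?)' . $to . '/s', $content, $matches) === 1) {
    $inner = $matches[1]; // text between the markers, markers excluded
    $whole = $matches[0]; // full match, markers included
} else {
    $inner = null;
    $whole = null;
}

var_dump($inner);
```

If the opening marker should be kept in the result, $matches[0] can be used instead and the trailing </table> trimmed off, or the capture group can be widened to include the opening marker.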