$string = '<td class="t_ip">85.185.244.101</td><td class="t_port"> <script type="text/javascript"> //<![CDATA[ document.write(HttpSocks^Xinemara^47225); //]]> </script> </td><td class="t_type"> 4 </td>';
$regex = "/<td class=\"t_ip\">\\s*((?:[0-9]{1,3}\\.){3}[0-9]{1,3})(?:.|\n)*<td class=\"t_port\">(?:.|\n)*\^([0-9]{1,5})(?:.|\n)*<td class=\"t_type\">\\s*([0-9])/";
preg_match($regex, $string, $matches);
$newString = $matches[1] . ':' . $matches[2] . ' ' . $matches[3];
print_r($newString);
Regular expression:
$regex = "/<td class=\"t_ip\">\\s*((?:[0-9]{1,3}\\.){3}[0-9]{1,3})(?:.|\n)*<td class=\"t_port\">(?:.|\n)*\^([0-9]{1,5})(?:.|\n)*<td class=\"t_type\">\\s*([0-9])/";
I use it to extract the information in this format:
85.185.244.101:22088 4
But it does not work when the block is repeated several times:
$string = '<td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td><td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td><td class="t_ip">85.185.244.101</td><td class="t_port"><script type="text/javascript"> //<![CDATA[document.write(HttpSocks^Xinemara^47225);//]]></script></td><td class="t_type">4</td>';
What would have to change to make it work?
$regex = "/<td class=\"t_ip\">\\s*?((?:[0-9]{1,3}\\.){3}[0-9]{1,3})(?:.|\n)*?<td class=\"t_port\">(?:.|\n)*?\^([0-9]{1,5})(?:.|\n)*?<td class=\"t_type\">\\s*?([0-9])/";
preg_match_all($regex, $string, $matches);
$data = array();
if ($matches) {
    // $matches[1], [2] and [3] hold the IPs, ports and types of all matches
    for ($i = 0; $i < count($matches[0]); $i++) {
        $data[] = $matches[1][$i] . ':' . $matches[2][$i] . ' ' . $matches[3][$i];
    }
}
print_r($data);
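For what it's worth, the only change in the pattern above is that every quantifier became lazy (`*?` instead of `*`). Here is a minimal sketch of why that matters. The sample data is shortened from the question, the variable names `$greedy` and `$lazy` are mine, and I swapped `(?:.|\n)*` for `.*` with the `s` modifier, which also lets the dot match newlines.

```php
<?php
// Two repeated blocks, shortened from the question's sample data (assumed values).
$string = '<td class="t_ip">1.2.3.4</td><td class="t_port">document.write(A^B^80);</td><td class="t_type">4</td>'
        . '<td class="t_ip">5.6.7.8</td><td class="t_port">document.write(A^B^8080);</td><td class="t_type">5</td>';

// Greedy: each .* first runs to the end of the input and then backtracks,
// so a single match stretches from the first IP to the last port and type.
$greedy = '/<td class="t_ip">\s*((?:[0-9]{1,3}\.){3}[0-9]{1,3}).*<td class="t_port">.*\^([0-9]{1,5}).*<td class="t_type">\s*([0-9])/s';

// Lazy: each .*? stops at the first position where the rest of the pattern matches,
// so every block produces its own match.
$lazy = '/<td class="t_ip">\s*((?:[0-9]{1,3}\.){3}[0-9]{1,3}).*?<td class="t_port">.*?\^([0-9]{1,5}).*?<td class="t_type">\s*([0-9])/s';

// PREG_SET_ORDER groups the captures per match instead of per capture group.
$greedyCount = preg_match_all($greedy, $string, $greedyRows, PREG_SET_ORDER);
$lazyCount   = preg_match_all($lazy, $string, $lazyRows, PREG_SET_ORDER);

$data = array();
foreach ($lazyRows as $row) {
    $data[] = $row[1] . ':' . $row[2] . ' ' . $row[3];
}

echo "greedy matches: $greedyCount, lazy matches: $lazyCount\n";
print_r($data);
```

With the greedy version the first IP ends up paired with the last port and type, which is the behaviour described in the question.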
I'd use a parser rather than a regex; regexes and HTML don't go well together. You could do something like this:
<?php
$string = '<td class="t_ip">85.185.244.101</td><td class="t_port"> <script type="text/javascript"> //<![CDATA[ document.write(HttpSocks^Xinemara^47225); //]]> </script> </td><td class="t_type"> 4 </td>';
$doc = new DOMDocument();
libxml_use_internal_errors(true);
$doc->loadHTML($string);
libxml_use_internal_errors(false);
$cells = $doc->getElementsByTagName('td');
foreach ($cells as $cell) {
    // Only handle cells whose class contains t_ip or t_type
    if (preg_match('/\bt_(ip|type)\b/', $cell->getAttribute('class'), $type)) {
        echo $type[1] . "=" . trim($cell->nodeValue) . "\n";
    }
}
Output:
ip=85.185.244.101
type=4
If you need to validate the IP you could add in something like:
if ($type[1] == 'ip') {
    if (filter_var($cell->nodeValue, FILTER_VALIDATE_IP)) {
        echo 'valid ip' . $cell->nodeValue;
    } else {
        echo 'invalid ip' . $cell->nodeValue;
    }
}
I don't see where 22088 is coming from in the string you provided.
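If you also need the port and have several rows, the same DOM loop can be extended. This is only a sketch under a few assumptions of mine: the port is the number after the last `^` inside the `document.write(...)` call (47225 in the sample), a `t_type` cell always closes a row, and the variable names `$rows` and `$current` are made up for illustration.

```php
<?php
$string = '<td class="t_ip">85.185.244.101</td><td class="t_port"> <script type="text/javascript"> //<![CDATA[ document.write(HttpSocks^Xinemara^47225); //]]> </script> </td><td class="t_type"> 4 </td>';
$string = $string . $string; // simulate two rows

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // the fragment is not a full document, so silence warnings
$doc->loadHTML($string);
libxml_clear_errors();
libxml_use_internal_errors(false);

$rows = array();
$current = array();
foreach ($doc->getElementsByTagName('td') as $cell) {
    if (!preg_match('/\bt_(ip|port|type)\b/', $cell->getAttribute('class'), $type)) {
        continue;
    }
    $text = trim($cell->nodeValue);
    if ($type[1] === 'port') {
        // Assumption: the port is the number right before the closing parenthesis.
        $text = preg_match('/\^([0-9]{1,5})\)/', $text, $m) ? $m[1] : '';
    }
    $current[$type[1]] = $text;

    // Assumption: t_type is the last cell of a row, so emit the row here.
    if ($type[1] === 'type') {
        if (isset($current['ip'], $current['port'])) {
            $rows[] = $current['ip'] . ':' . $current['port'] . ' ' . $current['type'];
        }
        $current = array();
    }
}
print_r($rows);
```

A small regex is still used for the port, but only on the text of one cell, so it no longer has to cope with the surrounding markup.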