I need a little help with this one, as my regex knowledge is a little lacking.
I have a proxy list that I'm trying to parse, separating the IP and port number from each string.
The string being read looks like this (example 1):
121.121.121.121:8081 2.103384 Китай high 05-07-2014 09:25:17
and sometimes looks like this (example 2):
222.222.222.222:8081
When I use this code:
preg_match_all('@[0-9]{1,4}\.[0-9]{1,4}\.[0-9]{1,4}\.@',$ip,$results);
$output = (preg_split('/:/',$results));
$ip = $output['0'];
$port = $output['1'];
It works great when there is just an IP:Port, as in example 2, but in example 1 it also grabs everything past the space, so the port number looks like "8081 2.103384 Китай high 05-07-2014 09:25:17".
Is there a regex pattern I can use to make it stop at a space if it sees one?
With a split, the pattern only describes what you don't want (the delimiter); in this case it is simpler to match the parts you do want.
The following matching expression should work in your case:
if (preg_match('/^(\d[\d.]+):(\d+)\b/', $proxy, $matches)) {
    $ip = $matches[1];
    $port = $matches[2];
}
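As a rough check, the expression above can be run against both sample strings from the question. This is only a sketch: the `parseProxy` helper and the `$lines` array are hypothetical names introduced for illustration, and the nullable return type and short list syntax assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper (not part of the original answer):
// returns [ip, port] as strings, or null when the line does not start with ip:port.
function parseProxy(string $line): ?array
{
    // ^ anchors at the start of the line; the match stops after the port digits,
    // so anything following the space is ignored.
    if (preg_match('/^(\d[\d.]+):(\d+)\b/', $line, $matches)) {
        return [$matches[1], $matches[2]];
    }
    return null;
}

// The two sample strings from the question.
$lines = [
    '121.121.121.121:8081 2.103384 Китай high 05-07-2014 09:25:17',
    '222.222.222.222:8081',
];

foreach ($lines as $line) {
    $parsed = parseProxy($line);
    if ($parsed !== null) {
        [$ip, $port] = $parsed;
        echo "$ip $port\n";
    }
}
```

Note that `\d[\d.]+` does not validate the address; it accepts any run of digits and dots, which is usually enough when the list is already known to contain proxies.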
This regex matches the IP address and the port number:
\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5}\b
From that you could split it easily.
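One way the "match, then split" idea might look is sketched below. The variable names `$pattern` and `$m` are illustrative, `explode` is used in place of a regex split because the delimiter is a fixed character, and the short list syntax assumes PHP 7.1 or later.

```php
<?php
$str = '121.121.121.121:8081 2.103384 Китай high 05-07-2014 09:25:17';

// Same expression as above, wrapped in ~ delimiters.
$pattern = '~\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5}\b~';

if (preg_match($pattern, $str, $m)) {
    // $m[0] holds only the "ip:port" part, so splitting on ':' is now safe.
    [$ip, $port] = explode(':', $m[0], 2);
    echo "$ip\n$port\n";
}
```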
Or you could use the preg_match function with capture groups:
<?php
$str = '121.121.121.121:8081 2.103384 Китай high 05-07-2014 09:25:17';
if (preg_match('~\b([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}):([0-9]{1,5}\b)~', $str, $matches)) {
    $ip = $matches[1];
    $port = $matches[2];
}
echo "$ip\n";
echo "$port\n";
?>
Output:
121.121.121.121
8081
As there is no need to validate IP addresses at this level, there's a shorter way to match them:
(\d+(?(?!:)\.)){4}:\d+
PHP:
preg_match_all('@(\d+(?(?!:)\.)){4}:\d+@', $ip, $results);
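In the pattern above, `(?(?!:)\.)` is a conditional: a dot is required after each number unless the next character is a colon. As written, the repeated group only retains the last octet, so the full `ip:port` text ends up in `$results[0]`. If the IP and port are wanted separately, one possible variant (an assumption, not part of the original answer) is to make the repeated group non-capturing and add two outer groups. The `$list` variable below is a hypothetical multi-line input built from the question's samples.

```php
<?php
// Hypothetical input: the two sample lines from the question joined by a newline.
$list = "121.121.121.121:8081 2.103384 Китай high 05-07-2014 09:25:17\n"
      . "222.222.222.222:8081";

// Group 1 wraps the four dotted numbers, group 2 the port.
// PREG_SET_ORDER yields one array per match: [full match, ip, port].
preg_match_all('@((?:\d+(?(?!:)\.)){4}):(\d+)@', $list, $results, PREG_SET_ORDER);

foreach ($results as $m) {
    echo $m[1], ' ', $m[2], "\n";
}
```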