I'm having problems with regular expressions that I got from regexlib. I am trying to run preg_replace() on some text and want to replace/remove email addresses and URLs (http/https/ftp).
The code that I have is:
$sanitiseRegex = array(
'email' => /'^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$/',
'http' => '/^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*$/',
);
$replace = array(
'xxxxx',
'xxxxx'
);
$sanitisedText = preg_replace($sanitiseRegex, $replace, $text);
However, I am getting the following error: Unknown modifier '/', and $sanitisedText is null.
Can anyone see the problem with what I am doing or why the regex is failing?
Thanks
For a start, your email string is opened incorrectly:
'email' => /'^([a-zA-Z0-9_\-\.
// should be
'email' => '/^([a-zA-Z0-9_\-\.
The other problem is that you are using / both as a character to match and as the delimiter at the start/end of your URL regex, without escaping it inside the pattern. The simplest solution is to use a different character to denote the start/end of the regex, e.g.:
'http' => '@^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*$@'
What is happening is that it sees '^(http|https|ftp)\:' as the regex, then starts looking for options (modifiers). The first character after the 'end' of the regex is another '/', which is an invalid option, hence the error message.
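To make the delimiter issue concrete, here is a minimal sketch that runs the same URL pattern with both delimiters. The sample URL in $text is made up for illustration, and the @ operator is only there to silence the warning from the broken pattern so that the null return value is easy to see.

```php
<?php
// Hypothetical sample input, not taken from the original question.
$text = 'ftp://example.com/files';

// Original pattern with / as the delimiter. PCRE treats the first
// unescaped / after "\:" as the end of the regex, so the second / is
// read as a modifier and compilation fails.
$broken = @preg_replace(
    '/^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*$/',
    'xxxxx',
    $text
);
var_dump($broken); // NULL

// Same pattern with @ as the delimiter, so the slashes inside the
// pattern are ordinary characters.
$fixed = preg_replace(
    '@^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*$@',
    'xxxxx',
    $text
);
var_dump($fixed); // string(5) "xxxxx"
```

Escaping every / inside the pattern as \/ would also work, but switching the delimiter keeps the pattern easier to read.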
EDIT: Something quick that might fix the issue re: not matching. You could try the following instead:
'http' => '@^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?(/[a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~]*)?$@'
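One thing to keep in mind when testing this: because the pattern starts with ^ and ends with $, it only matches when the whole subject string is a URL. The sketch below shows that behaviour with made-up sample strings. The last pattern, with the anchors removed, is an assumption about what replacing URLs inside longer text might look like, not something taken from regexlib.

```php
<?php
// Anchored pattern from the EDIT above.
$anchored = '@^(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?(/[a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~]*)?$@';

// Hypothetical variant without ^ and $ so it can match inside a sentence.
$unanchored = '@(http|https|ftp)\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?(/[a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~]*)?@';

// Hypothetical sample inputs.
$urlOnly  = 'http://example.com/page';
$sentence = 'Visit http://example.com/page today';

// The whole string is a URL, so the anchored pattern matches.
$a = preg_replace($anchored, 'xxxxx', $urlOnly);
var_dump($a); // string(5) "xxxxx"

// The URL is embedded in other text, so the anchored pattern does not match.
$b = preg_replace($anchored, 'xxxxx', $sentence);
var_dump($b === $sentence); // bool(true)

// Without the anchors the embedded URL is replaced.
$c = preg_replace($unanchored, 'xxxxx', $sentence);
var_dump($c); // string(17) "Visit xxxxx today"
```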