I have an email field that may be formatted in a few different ways.
hello@world.com
"hello world" <hello@world.com>
hello world <hello@world.com>
I would like to capture both the hello world string (if it's there) and the email address (if it's there). I have a regular expression that almost works, but not quite.
sed -r 's/"?([^"]+)*"?\s<?([^>]+@[^>]+)>?/["\1","\2"]/' <<< 'Hello World <helloworld@gmail.com>'
Please help?
Update:
This should do what you want:
^(?:"?([^@"]+)"?\s)?<?([^>]+@[^>]+)>?$
This will store the first part, if there is one, in the first capturing group and the email address in the second group.
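As a quick check, here is a minimal Ruby sketch that runs this pattern against the three formats from the question. The constant and helper names (EMAIL_FIELD, parse_email_field) are made up for illustration and are not part of the original answer. Note that in Ruby ^ and $ anchor at line boundaries rather than string boundaries, which is fine for single-line input like this.

```ruby
# Pattern from the answer above: an optional display name (quoted or not)
# followed by whitespace, then the address, optionally in angle brackets.
EMAIL_FIELD = /^(?:"?([^@"]+)"?\s)?<?([^>]+@[^>]+)>?$/

# Hypothetical helper (not from the original post): returns [name, address],
# where name is nil if the field holds only an address, or nil if the
# field does not match at all.
def parse_email_field(field)
  m = EMAIL_FIELD.match(field)
  m && [m[1], m[2]]
end

[
  'hello@world.com',
  '"hello world" <hello@world.com>',
  'hello world <hello@world.com>'
].each { |field| p parse_email_field(field) }
```

This should print [nil, "hello@world.com"] for the bare address and ["hello world", "hello@world.com"] for the other two inputs.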
The regex doesn't look quite right. In any case, a "Backtrack limit was exhausted" error occurs while this regex is executing (you can check this with the preg_last_error function), so you can raise the backtrack limit to let it run to completion:
ini_set('pcre.backtrack_limit', 1000000);
var_dump(preg_replace('~"?([^"]+)*"?\s<?([^>]+@[^>]+)>?~', '["$1","$2"]', 'hello@world.com'));
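outputs the input unchanged, because the pattern requires whitespace before the address and so does not match a bare address: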
string(15) "hello@world.com"
Ruby (1.9+)
$ ruby -e 'p gets.scan(/"?([^"]+)*"?\s<?([^>]+@[^>]+)>?/)' <<< '"Hello World" <helloworld@gmail.com>'
[["Hello World", "helloworld@gmail.com"]]