Parsing an email address with a name (FROM or TO) - not necessarily RFC 2822 compliant

I have an email field that may be formatted in a few different ways.

  1. hello@world.com

  2. "hello world" <hello@world.com>

  3. hello world <hello@world.com>

I would like to capture both the hello world string (if it's there) and the email address (if it's there). I have a regular expression that almost works, but not quite:

sed -r  's/"?([^"]+)*"?\s<?([^>]+@[^>]+)>?/["\1","\2"]/' <<< 'Hello World <helloworld@gmail.com>'

Please help?

Update:

This should do what you want:

^(?:"?([^@"]+)"?\s)?<?([^>]+@[^>]+)>?$

This will store the first part (the name), if there is one, in the first capturing group and the email address in the second group.
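As a quick check, the pattern above can be run against the three formats from the question. This is a minimal Ruby sketch; the helper name `parse_address` and the constant `PATTERN` are made up for illustration and are not part of the answer.

```ruby
# Pattern from the answer above, copied unchanged.
PATTERN = /^(?:"?([^@"]+)"?\s)?<?([^>]+@[^>]+)>?$/

# Hypothetical helper: returns [name, email], or nil when the field does not match.
# name is nil when the field holds only an address.
def parse_address(field)
  m = PATTERN.match(field)
  m && [m[1], m[2]]
end

samples = [
  'hello@world.com',
  '"hello world" <hello@world.com>',
  'hello world <hello@world.com>'
]

# Prints one [name, email] pair per sample.
samples.each { |s| p parse_address(s) }
```

For the bare address the first element is `nil`; for the other two formats it is `hello world`, with the surrounding quotes removed in the quoted case.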

The regex does not look quite right. In any case, a "Backtrack limit was exhausted" error occurs while this regex is executing (you can check it with the preg_last_error function), so you can increase the backtrack limit to make it work:

 ini_set('pcre.backtrack_limit', 1000000);
 var_dump(preg_replace('~"?([^"]+)*"?\s<?([^>]+@[^>]+)>?~', '["$1","$2"]', 'hello@world.com'));

outputs:

 string(15) "hello@world.com"
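The large number of backtracking steps appears to come from the nested quantifier in `([^"]+)*`: a run of non-quote characters can be split across iterations in exponentially many ways, and all of them are tried before the match fails. The Ruby sketch below uses an assumed variant that is not part of the original answer: it replaces that group with `([^"]*)`, which accepts the same text with a single quantifier. The pattern still requires whitespace before the address, so a bare address does not match either way, which is consistent with the unchanged string in the output above.

```ruby
# Assumed variant of the question's pattern: ([^"]+)* replaced by ([^"]*),
# so the name part is matched by a single quantifier instead of a nested one.
SIMPLIFIED = /"?([^"]*)"?\s<?([^>]+@[^>]+)>?/

# A bare address contains no whitespace, so \s can never match: prints nil.
p SIMPLIFIED.match('hello@world.com')

# With a name in front of the address, both groups are filled.
m = SIMPLIFIED.match('hello world <hello@world.com>')
p [m[1], m[2]]
```

If bare addresses must be handled too, the name part has to be optional as a whole, as in the anchored pattern with the `(?: ... )?` group.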

Ruby (1.9+)

$ ruby -e 'p gets.scan(/"?([^"]+)*"?\s<?([^>]+@[^>]+)>?/)' <<< '"Hello World" <helloworld@gmail.com>'
[["Hello World", "helloworld@gmail.com"]]