PHP: parsing domain names entered in a textarea (separated by spaces, commas, or newlines)

For my users I need to present a screen where they can enter multiple domain names in a textarea. They can put the domain names on separate lines, or separate them with spaces or commas (maybe even semicolons - I don't know!)

I need to parse the input and identify the individual domain names with their extension (which will be .com; anything else can be ignored).

User input can look like:

asdf.com

qwer.com

AND/OR

wqer.com, gwew.com

AND/OR

ertert.com gdfgdf.com

No one should enter a third-level domain like www.abczone.com, but if they do I'm only interested in extracting the abczone.com part. (I can use a separate regex to verify/extract that from each entry.)

This will do it:

(\b[a-zA-Z][a-zA-Z0-9-]*)(?=\.com\b)

"Find all sequences of a letter followed by any number of letters, digits, or hyphens, followed by .com and then a word boundary." The .com part sits in a lookahead, so it is checked but not included in the match.

(You need the trailing \b to avoid picking up bim, i.e. bim.com, from bim.command.com.)

Python test case because I don't have a PHP test environment to hand:

import re

# Sample input mixing newline, comma, and space separators
DATA = "asdf.com\nx-123.com, gwew.com bim.command.com 123.com, x_x.com"
print(re.findall(r'(\b[a-zA-Z][a-zA-Z0-9-]*)(?=\.com\b)', DATA))
# Prints ['asdf', 'x-123', 'gwew', 'command']
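
To see what the trailing \b buys you, here is a small comparison sketch. The variable names are just for illustration, and the second pattern is the same regex with the trailing \b removed purely for contrast:

```python
import re

DATA = "bim.command.com"

# Pattern from the answer above, with the trailing \b inside the lookahead.
with_boundary = r'(\b[a-zA-Z][a-zA-Z0-9-]*)(?=\.com\b)'
# Same pattern minus the trailing \b, for comparison only.
without_boundary = r'(\b[a-zA-Z][a-zA-Z0-9-]*)(?=\.com)'

strict = re.findall(with_boundary, DATA)
loose = re.findall(without_boundary, DATA)

print(strict)  # ['command']
print(loose)   # ['bim', 'command']
```

Without the boundary, the lookahead is satisfied by the ".com" at the start of ".command", so bim is reported as if bim.com had been entered.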

Here it is. You can use the i modifier and drop the uppercase A-Z ranges if you want to:

\b([a-zA-Z][0-9a-zA-Z\-]{1,62})\.com\b
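
A rough sketch of how this second pattern might be used, in Python again rather than PHP. The function name extract_com_domains and the sample string are made up for illustration, and lowercasing and de-duplicating the results are my own assumptions, not something the question asked for. Note that {1,62} makes the name 2 to 63 characters long in total, so a single-letter name such as x.com is not matched.

```python
import re

# Second pattern from above; re.IGNORECASE stands in for the i modifier,
# so the uppercase A-Z ranges are dropped.
COM_DOMAIN = re.compile(r'\b([a-z][0-9a-z-]{1,62})\.com\b', re.IGNORECASE)

def extract_com_domains(text):
    """Return the unique .com domains found in text, lowercased, in input order."""
    seen = []
    for label in COM_DOMAIN.findall(text):
        domain = label.lower() + ".com"
        if domain not in seen:  # assumption: duplicates are not wanted
            seen.append(domain)
    return seen

# Hypothetical textarea content: newline, comma, semicolon, and space separators
sample = "asdf.com\nqwer.com, GWEW.com; www.abczone.com asdf.com example.org"
print(extract_com_domains(sample))
# ['asdf.com', 'qwer.com', 'gwew.com', 'abczone.com']
```

Because the match has to start at a word boundary and be followed directly by .com, www.abczone.com yields abczone.com without a second pass. In PHP the same pattern should work with preg_match_all and the i modifier.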