Can't use OR (|) in a PHP regular expression

I'm a newbie here, and I've run into a weird problem using regex in PHP.

$result = "some very long long string with different kind of links";

$regex='/<.*?href.*?="(.*?net.*?)"/'; //this is the regex rule

preg_match_all($regex,$result,$parts);

With this code I'm trying to get the links from the result string, but it only gives me the links that contain .net. I also want the links that contain .com, so I tried this:

    $regex='/<.*?href.*?="(.*?net|com.*?)"/';

But it shows nothing.

Sorry for my bad English.

Thanks in advance.

Update 1 :

Now I'm using this:

$regex='/<.*?href.*?="(.*?)"/';

This rule grabs all the links from the string, but it is not perfect, because it also grabs other substrings like "javascript".

Your regex gets interpreted as .*?net or com.*?. You'll want (.*?(net|com).*?).

The | character applies to everything within the capturing group, so (.*?net|com.*?) will match either .*?net or com.*?. I think what you want is (.*?(net|com).*?).

If you do not want the extra capturing group, you can use (.*?(?:net|com).*?).

You could also use (.*?net.*?|.*?com.*?), but this is not recommended because of the unnecessary repetition.
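
To see the difference concretely, here is a small sketch that runs both patterns against the same input. The sample HTML in $html is made up for illustration and is not from the question; only the two patterns come from the discussion above.

```php
<?php
// Hypothetical sample input, one link per line.
$html = '<a href="http://example.net/page">one</a>' . "\n"
      . '<a href="http://example.com/page">two</a>' . "\n"
      . '<a href="http://example.org/page">three</a>';

// Alternation splits the whole group: ".*?net" OR "com.*?".
// The first branch needs "net" right before the closing quote,
// the second needs "com" right after the opening quote.
$broken = '/<.*?href.*?="(.*?net|com.*?)"/';

// Alternation confined to the inner non-capturing group.
$fixed = '/<.*?href.*?="(.*?(?:net|com).*?)"/';

preg_match_all($broken, $html, $brokenParts);
preg_match_all($fixed, $html, $fixedParts);

print_r($brokenParts[1]); // empty array
print_r($fixedParts[1]);  // the .net and .com URLs, not the .org one
```

With this input the first pattern finds nothing, which matches the behaviour described in the question, while the second returns the two expected URLs.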

Try this:

$regex='/<.*?href.*?="(.*?\.(?:net|com)\b.*?)"/i';

or better:

$regex='/<a .*?href\s*+=\s*+"\K.*?\.(?:net|com)\b[^"]*+/i';
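
As a rough usage sketch for the second pattern: because of \K, the URL ends up in the overall match ($parts[0]) rather than in a capture group. The sample input is hypothetical, with one link per line.

```php
<?php
// Hypothetical sample input for illustration only.
$html = '<a class="x" href = "http://example.com/a?b=1">c</a>' . "\n"
      . '<A HREF="https://EXAMPLE.NET">n</A>' . "\n"
      . '<a href="http://example.org/">o</a>';

$regex = '/<a .*?href\s*+=\s*+"\K.*?\.(?:net|com)\b[^"]*+/i';

preg_match_all($regex, $html, $parts);

// \K discards everything matched before it, so $parts[0]
// holds just the URLs and there is no $parts[1].
print_r($parts[0]);
```

Here $parts[0] contains the .com and .NET links; the .org link is skipped, and the /i flag lets the uppercase tag and domain match.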

The <.*?href part is a problem. It will match from the first < on the current line to the first href, regardless of whether they belong to the same tag.
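
A quick way to see this overreach, using a made-up one-line input where a <b> tag precedes the link. The second pattern uses [^<>]* instead of .*? so the match cannot leave the tag it started in.

```php
<?php
// Hypothetical one-line input: a <b> tag followed by a link.
$html = '<b>bold</b> <a href="http://example.net/">link</a>';

// Lazy dot: starts at the first "<" and runs until the first "href".
preg_match('/<.*?href/', $html, $loose);

// Negated class: cannot cross "<" or ">", so it stays inside one tag.
preg_match('/<[^<>]*href/', $html, $strict);

echo $loose[0], "\n";   // <b>bold</b> <a href
echo $strict[0], "\n";  // <a href
```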

Generally, it's unwise to try and parse HTML with regexes; if you absolutely insist on doing that, at least be a bit more specific (but still not perfect):

$regex='/<[^<>]*href[^<>=]*="([^"]*(?:net|com)[^"]*)"/';
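
For comparison, here is a minimal sketch of the non-regex route, assuming PHP's bundled DOM extension is available. The sample HTML, the $links variable, and the choice to filter on the host name are all assumptions for illustration, not something the question specified.

```php
<?php
// Hypothetical sample input for illustration only.
$html = '<p><a href="http://example.net/a">n</a> '
      . '<a href="javascript:void(0)">js</a> '
      . '<a href="https://www.example.com/b">c</a> '
      . '<a href="http://example.org/">o</a></p>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // collect warnings about sloppy markup instead of printing them
$doc->loadHTML($html);
libxml_clear_errors();

$links = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $href = $a->getAttribute('href');
    $host = parse_url($href, PHP_URL_HOST);
    // Keep only links whose host ends in .net or .com.
    if (is_string($host) && preg_match('/\.(?:net|com)$/i', $host)) {
        $links[] = $href;
    }
}

print_r($links);
```

The parser handles the tag and attribute boundaries, so the regex only has to check the host name. Pseudo-links such as javascript:void(0) have no host and drop out on their own.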