Here is a sample PHP array that explains my question well:
$array = array('1' => 'Cookie Monster (<i>eats cookies</i>)',
'2' => 'Tiger (eats meat)',
'3' => 'Muzzy (eats <u>clocks</u>)',
'4' => 'Cow (eats grass)');
All I need is to return only the values from this array that don't contain any tag enclosed in parentheses:
- Tiger (eats meat)
- Cow (eats grass)
For this I'm going to use the following code:
$array_no_tags = preg_grep("/[A-Za-z]\s\(^((?!<(.*?)(\h*).*?>(.*?)<\/\1>).)*$\)/", $array);
foreach ($array_no_tags as $a_n_t) {echo "- ".$a_n_t."<br />";}
My assumption is that [A-Za-z] matches whoever it is, \s is a space, \( is the opening parenthesis, ^((?! is the start of the tag denial statement, <(.*?)(\h*).*?>(.*?)<\/\1> is the tag itself, ).)*$ is the end of the tag denial statement, and \) is the closing parenthesis.
Nothing works: print_r($array_no_tags); returns an empty array.
You could use the following expression to match strings with HTML tags inside of parentheses:
/\([^)]*<(\w+)>[^<>]*<\/\1>[^)]*\)/
Then pass the PREG_GREP_INVERT flag as the third argument in order to return only the items that don't match. (The backslash before the 1 is doubled below because the pattern sits in a double-quoted PHP string.)
$array_no_tags = preg_grep("/\([^)]*<(\w+)>[^<>]*<\/\\1>[^)]*\)/", $array, PREG_GREP_INVERT);
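To make this easy to try, here is a minimal self-contained sketch that runs the expression against the sample array from the question. It assumes a single-quoted pattern string (so the back reference can be written as a plain \1) and passes the PREG_GREP_INVERT constant itself as the flags argument; the variable name $pattern is only for illustration.

```php
<?php
// Sample data copied from the question.
$array = array(
    '1' => 'Cookie Monster (<i>eats cookies</i>)',
    '2' => 'Tiger (eats meat)',
    '3' => 'Muzzy (eats <u>clocks</u>)',
    '4' => 'Cow (eats grass)',
);

// Single-quoted string: \1 reaches the regex engine unchanged.
$pattern = '/\([^)]*<(\w+)>[^<>]*<\/\1>[^)]*\)/';

// PREG_GREP_INVERT returns the entries that do NOT match the pattern.
// The original array keys are preserved.
$array_no_tags = preg_grep($pattern, $array, PREG_GREP_INVERT);

foreach ($array_no_tags as $a_n_t) {
    echo "- " . $a_n_t . "\n";
}
```

With the sample data this prints the Tiger and Cow entries, and the result keeps their original keys (2 and 4).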
Explanation:
\( - Match the literal ( character
[^)]* - Negated character class to match zero or more non-) characters
<(\w+)> - Capturing group one that matches the opening element's tag name
[^<>]* - Negated character class to match zero or more non-<> characters
<\/\1> - Back reference to capturing group one to match the closing tag
[^)]* - Negated character class to match zero or more non-) characters
\) - Match the literal ) character
If you don't care about the parentheses around the element tag, then you could also just use the following simplified expression:
/<(\w+)>[^<>]+<\/\1>/
And likewise, you would use:
$array_no_tags = preg_grep("/<(\w+)>[^<>]+<\/\\1>/", $array, PREG_GREP_INVERT);
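The practical difference between the two expressions only shows up when a tag sits outside the parentheses. The sketch below compares them on a small list; the '<b>Bear</b> (eats fish)' entry is a made-up example that is not in the question, and the variable names are illustrative.

```php
<?php
$samples = array(
    'Tiger (eats meat)',
    'Muzzy (eats <u>clocks</u>)',
    '<b>Bear</b> (eats fish)', // hypothetical entry: tag outside the parentheses
);

// Tag must be inside a pair of parentheses.
$inside_parens = '/\([^)]*<(\w+)>[^<>]*<\/\1>[^)]*\)/';
// Tag may appear anywhere in the string.
$anywhere = '/<(\w+)>[^<>]+<\/\1>/';

// Both calls return the entries that do NOT match.
$kept_strict = preg_grep($inside_parens, $samples, PREG_GREP_INVERT);
$kept_simple = preg_grep($anywhere, $samples, PREG_GREP_INVERT);

print_r($kept_strict); // keeps Tiger and Bear
print_r($kept_simple); // keeps Tiger only
```

So the simplified expression is the stricter filter: it drops any entry containing a matched tag pair, wherever it appears.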
Your pattern looks a bit overcomplicated. I thought a simple pattern inside a negative lookahead, one that checks there is no <x inside ( ), could be sufficient.
$array_no_tags = preg_grep("/^(?!.*?\([^)<]*<\w)/", $array);
So the negative lookahead (?! prevents a match if there is an opening parenthesis \(, followed by [^)<]* (any number of characters that are not ) or <), followed by <\w (a less-than sign that's followed by a word character).
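As a quick check, here is a small sketch of the lookahead pattern run against the question's array. No PREG_GREP_INVERT is needed, because the lookahead itself rejects the unwanted strings. The fifth entry, 'Robot (eats <bolts)', is a hypothetical addition to show that this pattern does not require a closing tag: any < followed by a word character inside the parentheses is enough to exclude the item.

```php
<?php
$array = array(
    '1' => 'Cookie Monster (<i>eats cookies</i>)',
    '2' => 'Tiger (eats meat)',
    '3' => 'Muzzy (eats <u>clocks</u>)',
    '4' => 'Cow (eats grass)',
    '5' => 'Robot (eats <bolts)', // hypothetical entry: unclosed "tag"
);

// Matches only strings where no "(" is followed, before any ")" or "<",
// by a "<" plus a word character.
$pattern = '/^(?!.*?\([^)<]*<\w)/';

// Plain preg_grep: the matching entries are the ones we want to keep.
$array_no_tags = preg_grep($pattern, $array);

foreach ($array_no_tags as $a_n_t) {
    echo "- " . $a_n_t . "\n";
}
```

This prints the Tiger and Cow entries. Whether excluding the unclosed case is desirable depends on how strictly "tag" should be interpreted.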
Bear in mind that there are nice regex tools like regex101 available for testing patterns.