I've had a good look around for a question that asked this before; alas, my search for a PHP preg_match
question returned no results (maybe my searching skills fell short, which I suppose is fitting for a Regex question!).
Consider the text below:
The quick __("brown ") fox jumps __('over the') lazy __("dog")
I currently need to 'scan' for the method __('') shown above, allowing for optional spacing and either kind of quotation mark (' or "). My best attempt after numerous 'iterations':
(__\("(.*?)"\))|(__\('(.*?)'\))
Or at its simplest form:
__\((.*?)\)
To break this down:
__ matches the literal double underscore.
\( matches the opening bracket, followed by a quotation mark, " or '. Thus, \(\"
(.*?) is a non-greedy match of all characters.
" and \) match the closing quotation mark and the last bracket.
| between the two expressions matches either/or.
However, this only gets partial matches, and spaces are throwing off the search entirely. Apologies if this has been asked before; please link me if so!
When the searched method string uses single quotes it will end up in another capture group than if it has double quotes. So in fact, your regular expression works (except for the spaces, see further down), but you'd have to look at a different index in your result array:
$input = 'The quick __("brown ") fox jumps __(\'over the\') lazy __("dog")';
// using your regular expression:
$res = preg_match_all("/(__\(\"(.*?)\"\))|(__\('(.*?)'\))/", $input, $matches);
print_r ($matches);
Note that you need preg_match_all instead of preg_match to get all matches.
Output:
Array
(
[0] => Array
(
[0] => __("brown ")
[1] => __('over the')
[2] => __("dog")
)
[1] => Array
(
[0] => __("brown ")
[1] =>
[2] => __("dog")
)
[2] => Array
(
[0] => brown
[1] =>
[2] => dog
)
[3] => Array
(
[0] =>
[1] => __('over the')
[2] =>
)
[4] => Array
(
[0] =>
[1] => over the
[2] =>
)
)
So, the result array has 5 elements, the first one representing the complete match, and all the others correspond to the 4 capture groups you have in your regular expression. As the capture groups for single quotes are not those of the double quotes, you'll find the matches at different places.
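If you would rather keep the original two-branch pattern, here is a minimal sketch of one way to cope with the split groups: ask for the matches with PREG_SET_ORDER and take whichever of group 2 or group 4 is non-empty. The variable name $texts is only illustrative.

```php
<?php
$input = 'The quick __("brown ") fox jumps __(\'over the\') lazy __("dog")';

// Same two-branch pattern as above: groups 1/2 are the double-quote branch,
// groups 3/4 are the single-quote branch.
$pattern = "/(__\(\"(.*?)\"\))|(__\('(.*?)'\))/";

// PREG_SET_ORDER yields one array per match instead of one array per group.
preg_match_all($pattern, $input, $matches, PREG_SET_ORDER);

$texts = [];
foreach ($matches as $m) {
    // Prefer group 4 (single quotes) when it holds text, else fall back to group 2.
    $texts[] = ($m[4] ?? '') !== '' ? $m[4] : $m[2];
}

print_r($texts);
```

This works, but you still have to remember which index belongs to which branch, which is why a single pair of groups is easier to live with.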
To "solve" this, you could use a back reference in your regular expression, which would look back to see which was the opening quote (single or double) and require the same to be repeated at the end:
$res = preg_match_all("/__\(([\"'])(.*?)\\1\)/", $input, $matches);
Note the back reference \1 (the backslash had to be escaped with another one). This refers back to the first capture group, where we have ["'] (again an escape was necessary) to match both kinds of quotes.
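To see the back reference at work, here is a small sketch with made-up sample strings. It writes the pattern in a single-quoted PHP string, so \1 does not need a doubled backslash, and it checks that a call with mismatched quotes is rejected.

```php
<?php
// Single-quoted PHP string: \1 reaches the regex engine as written.
// Only the single quote inside the character class needs a PHP escape.
$pattern = '/__\((["\'])(.*?)\1\)/';

$samples = [
    '__("dog")',   // matching double quotes
    "__('dog')",   // matching single quotes
    '__("dog\')',  // opens with " but closes with ': should not match
];

$results = [];
foreach ($samples as $sample) {
    // Group 1 holds the opening quote, group 2 the quoted text.
    $results[$sample] = preg_match($pattern, $sample, $m) === 1 ? $m[2] : null;
}

var_dump($results);
```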
You also wanted to deal with spaces. On your PHP Live Regex you used a test string that had such spaces between the brackets and quotes. To deal with these so they still match the method strings correctly, the regular expression should get two additional \s*:
$res = preg_match_all("/__\(\s*([\"'])(.*?)\\1\s*\)/", $input, $matches);
Now the output is:
Array
(
[0] => Array
(
[0] => __("brown ")
[1] => __('over the')
[2] => __("dog")
)
[1] => Array
(
[0] => "
[1] => '
[2] => "
)
[2] => Array
(
[0] => brown
[1] => over the
[2] => dog
)
)
... and the text captured by the groups is now nicely arranged.
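Wrapped up as a function, the final pattern could be used roughly like this. The function name extractMarkedStrings is hypothetical, and the test string has extra spaces added to exercise the \s* parts.

```php
<?php
/**
 * Hypothetical helper: returns the text of every __('...') or __("...") call.
 */
function extractMarkedStrings(string $text): array
{
    // \s* tolerates spaces after "(" and before ")";
    // \1 requires the closing quote to be the same as the opening one.
    preg_match_all('/__\(\s*(["\'])(.*?)\1\s*\)/', $text, $matches);

    // Group 2 holds the quoted text for both quote styles.
    return $matches[2];
}

$input = 'The quick __( "brown ") fox jumps __(\'over the\' ) lazy __("dog")';
print_r(extractMarkedStrings($input));
```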
See this code run on eval.in and PHP Live Regex.
How about this:
(__(\('[^']+'\)|\("[^"]+"\)))
Instead of the non-greedy ., use any character but the quotes: [^'] or [^"]
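One practical difference between the two approaches shows up when the quoted text spans several lines. The sketch below uses a simplified double-quote-only pattern and a made-up input to compare them: by default . does not match a newline, while a negated class does.

```php
<?php
// Made-up input: the quoted text spans two lines.
$input = "__(\"multi\nline\")";

// "." does not match a newline unless the s modifier is set, so this finds nothing.
$dotMatched = preg_match('/__\("(.*?)"\)/', $input);

// A negated class matches any character except the quote, newlines included.
$classMatched = preg_match('/__\("([^"]+)"\)/', $input, $m);

var_dump($dotMatched, $classMatched, $m[1]);
```

Also note that + requires at least one character, so an empty call such as __("") is not matched by the negated-class version.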
When working with stuff like this, don't forget about escaping:
<?php
ob_start();
?>
The quick __("brown ") fox jumps __( 'over the' ) lazy __("dog").
And __("everyone says \"hi\"").
<?php
$content = ob_get_clean();
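// (?: ... ) groups the alternation so both branches require the leading __( and the trailing ).
$re = <<<RE
/__ \(
\s*
(?:
" ( (?: \\\\. | [^"])+ ) "
|
' ( (?: \\\\. | [^'])+ ) '
)
\s*
\)
/x
RE;
preg_match_all($re, $content, $matches, PREG_SET_ORDER);
// The last element of each match is the quoted text, whichever branch matched.
foreach($matches as $match)
echo end($match), "\n";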
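The captured text still contains the backslashes. If you want the plain string, something along these lines might do, assuming the escapes are simple backslash escapes that stripslashes can undo. The pattern is a compact variant of the one above, written as a nowdoc so the backslashes need no extra doubling, and the sample content is made up.

```php
<?php
// Nowdoc (<<<'RE') keeps backslashes literal, so the regex is written as-is.
// Each branch accepts either an escaped character (\\.) or anything that is
// neither the quote nor a backslash.
$re = <<<'RE'
/__\(\s*(?:"((?:\\.|[^"\\])*)"|'((?:\\.|[^'\\])*)')\s*\)/
RE;

// Made-up sample with an escaped quote in each style.
$content = 'And __("everyone says \"hi\"") or __(\'it\\\'s\').';

preg_match_all($re, $content, $matches, PREG_SET_ORDER);

$decoded = [];
foreach ($matches as $match) {
    // Group 2 is the single-quote branch, group 1 the double-quote branch.
    $raw = ($match[2] ?? '') !== '' ? $match[2] : $match[1];
    // stripslashes() removes the escaping backslashes from the captured text.
    $decoded[] = stripslashes($raw);
}

print_r($decoded);
```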
Enclose double and single quotes with square brackets as a character class:
$str = 'The quick __( "brown ") fox jumps __(\'over the\') lazy __("dog")';
preg_match_all("/__\(\s*([\"']).*?\\1\s*\)/ium", $str, $matches);
echo '<pre>';
var_dump($matches[0]);
// the output:
array (size=3)
0 => string '__( "brown ")'
1 => string '__('over the')'
2 => string '__("dog")'
And here is an example with the same solution on phpliveregex.com:
http://www.phpliveregex.com/p/exF (section preg_match_all)