PHP regex: strtolower that preserves the case of text inside quotes

I am looking for a way to lowercase a string without changing the case of the text inside quotes.

The string:

SELECT * FROM UtilisateurApplicatif WHERE idUtilisateurApplicatif <> "-1" AND Identification = "TOTO" AND MotDePasse = "TotoTUTU" AND Actif = 1

The result I want:

select * from utilisateurapplicatif where idutilisateurapplicatif <> "-1" and identification = "TOTO" and motdepasse = "TotoTUTU" and actif = 1

You can do this with preg_replace_callback, which lets you apply a function to each match:

$subject = <<<'LOD'
SELECT * FROM UtilisateurApplicatif
WHERE idUtilisateurApplicatif <> "-1"
AND Identification = "TOTO"
AND MotDePasse = "Toto\"TUTU" AND Actif = 1
LOD;

$pattern = <<<'LOD'
~
(?(DEFINE) 
    (?<DQuotedContent>
        (?> [^"\\]++ | (?:\\{2})++ | \\. )*
    )
)
" \g<DQuotedContent> " \K | [A-Z]++
~x
LOD;

$result = preg_replace_callback($pattern,
    function ($match) { return strtolower($match[0]); },
    $subject);
print_r($result);
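As a quick check against the single-line query from the question, the same pattern can be wrapped in a small function. This is only a sketch: the name lowerOutsideQuotes is made up for illustration, and falling back to the input when PCRE reports an error is an assumption, not part of the answer above.

```php
<?php
// Hypothetical helper: lowercases ASCII uppercase letters that are
// outside double-quoted parts of $sql.
function lowerOutsideQuotes(string $sql): string
{
    // Same pattern as above, written on one line (x modifier ignores the spaces).
    $pattern = <<<'LOD'
~" (?> [^"\\]++ | (?:\\{2})++ | \\. )* " \K | [A-Z]++~x
LOD;
    $result = preg_replace_callback(
        $pattern,
        function ($match) { return strtolower($match[0]); },
        $sql
    );
    // preg_replace_callback returns null on a PCRE error; keep the input then.
    return $result ?? $sql;
}

$query = 'SELECT * FROM UtilisateurApplicatif WHERE idUtilisateurApplicatif <> "-1" '
       . 'AND Identification = "TOTO" AND MotDePasse = "TotoTUTU" AND Actif = 1';

echo lowerOutsideQuotes($query), "\n";
```

Note that only the ASCII letters A-Z are matched, so accented uppercase letters outside quotes would be left unchanged.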

Pattern explanation:

The idea of the pattern is to match the quoted parts first and then drop them from the match result, so that strtolower is never applied to them.

First I define a subpattern (DQuotedContent) that describes all the possible content between double quotes, i.e.:

  • any characters other than a double quote or a backslash: [^"\\]
  • any even number of backslashes: (?:\\{2})++ (these cannot escape anything)
  • escaped characters: \\. (so an escaped double quote cannot close the quoted string)
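To see the three alternatives at work, here is a small sketch that anchors the subpattern between \A and \z and tests a few candidate strings. The function name isDoubleQuoted is hypothetical and exists only for this illustration.

```php
<?php
// Hypothetical helper: true when $s is exactly one well-formed
// double-quoted string according to the subpattern described above.
function isDoubleQuoted(string $s): bool
{
    // Nowdoc keeps backslashes literal, so the regex reads as PCRE sees it.
    $pattern = <<<'LOD'
~\A " (?> [^"\\]++ | (?:\\{2})++ | \\. )* " \z~x
LOD;
    return preg_match($pattern, $s) === 1;
}

var_dump(isDoubleQuoted('"TOTO"'));        // plain content: bool(true)
var_dump(isDoubleQuoted('"Toto\"TUTU"'));  // escaped quote stays inside: bool(true)
var_dump(isDoubleQuoted('"Toto\\\\"'));    // two backslashes, then the closing quote: bool(true)
var_dump(isDoubleQuoted('"Toto\\"'));      // one backslash escapes the last quote: bool(false)
```

In the last case the single backslash escapes the final quote, so the string is never closed and the match fails.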

The main part of the pattern is now easy to write:

" \g<DQuotedContent> "      # quoted part
\K                          # discard everything matched so far from the match result
|                           # OR
[A-Z]++                     # uppercase letters

Note that \K is very useful here, since it removes the quoted part from the match. The callback function therefore doesn't need to know what has been matched in order to apply strtolower.
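To make the effect of \K visible, the following sketch lists what the callback would actually receive for a short, made-up subject: a quoted part shows up as an empty string, so lowercasing it changes nothing.

```php
<?php
$pattern = <<<'LOD'
~" (?> [^"\\]++ | (?:\\{2})++ | \\. )* " \K | [A-Z]++~x
LOD;

// Made-up subject, chosen only for this illustration.
$subject = 'AND Name = "TOTO"';

// $matches[0] holds the whole-match strings, i.e. what $match[0]
// would be in the callback: 'AND', 'N', and '' for the quoted part.
preg_match_all($pattern, $subject, $matches);
var_dump($matches[0]);

$lowered = preg_replace_callback(
    $pattern,
    function ($match) { return strtolower($match[0]); },
    $subject
);
echo $lowered, "\n"; // and name = "TOTO"
```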

Note: I wrote the pattern with the nowdoc syntax, a DEFINE section, a named subpattern, and the extended (free-spacing) mode (the x modifier) to make it more readable, but you can use the same pattern in a more compact form:

$pattern = '~"(?>[^"\\\]++|(?:\\\{2})++|\\\.)*"\K|[A-Z]++~';

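Unlike in the nowdoc syntax, backslashes have to be escaped for PHP here: in a single-quoted string, each literal backslash of the regex is written \\, so the regex \\ becomes \\\ (or, more explicitly, \\\\).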
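The escaping can be verified by comparing the strings PHP actually builds. In this sketch the variable names are arbitrary, and $explicit is an extra spelling with four backslashes that is not in the answer above.

```php
<?php
// Compact form from above: \\\ in a single-quoted string yields \\ in the regex.
$compact = '~"(?>[^"\\\]++|(?:\\\{2})++|\\\.)*"\K|[A-Z]++~';

// Fully explicit form: every regex backslash is written as \\.
$explicit = '~"(?>[^"\\\\]++|(?:\\\\{2})++|\\\\.)*"\K|[A-Z]++~';

// Nowdoc: backslashes are taken literally, so this is the regex as PCRE sees it.
$nowdoc = <<<'LOD'
~"(?>[^"\\]++|(?:\\{2})++|\\.)*"\K|[A-Z]++~
LOD;

var_dump($compact === $nowdoc);   // bool(true)
var_dump($explicit === $nowdoc);  // bool(true)
```

The three-backslash spelling works because a backslash in a single-quoted string is only special before another backslash or a single quote; the four-backslash spelling is longer but easier to check by eye.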