Regex: find all old-style PHP open tags

I'm trying to find and replace all old style PHP open tags: <? and <?=. I've tried several things:

Find all <? strings and replace them with <?php and ignore XML

sudo grep -ri "<?x[^m]" --include \*.php /var/www/

This returns no results, so all tags that open with <?x are XML opening tags and should be ignored.

Then I did the same for tags that start with <?p but are not <?php

sudo grep -ri "<?p[^h]" --include \*.php /var/www/

This returned one page, which I edited manually, so it no longer returns any results. I can therefore be sure that all tags starting with <?p are <?php, and likewise that all tags starting with <?x are <?xml.

sudo grep -ri "<?[^xp]" --include \*.php /var/www/

Find more opening tags that should not be replaced

From here on I can run the above command and see what turns up: spaces, tabs, newlines, = and { (which can be ignored). I thought that \s would take care of whitespace, but I still get many results back.

Trying this still produces endless lists of results with tabs in them:

sudo grep -ri "<?[^xp =}\t\n\s]" --include \*.php /var/www/

So in the end this is not useful; I can't scan thousands of lines. What is wrong with this expression? If a tag such as <?jsp exists somewhere and shouldn't be replaced, I want to find out, exclude it, get a shorter list back, and repeat until the list is empty. That way I can be sure I'm not going to change tags that shouldn't be changed.

Update: ^M

If I open the results in Vim, I see ^M, which is a carriage return character (the CR of a Windows-style CRLF line ending). It can be excluded by typing a literal carriage return into the grep string on the command line, at the position where ^M appears in the code below: press Ctrl+V, then Ctrl+M. This reduces the results to 1000 lines.

sudo grep -ri "<?[^xp =}\t\n\s^M]" --include \*.php /var/www/
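For comparison, here is a minimal sketch of the same search written with the POSIX character class [:space:] instead of backslash escapes. Inside a bracket expression, \t, \n and \s are not escapes (they are taken as the literal characters backslash, t, n and s), while [:space:] covers space, tab, carriage return and newline. The sample file and its contents are made up for illustration; the real search would point at /var/www/ as above.

```shell
# Sketch only: a throwaway file stands in for the real tree under /var/www/.
tmp=$(mktemp -d)
printf '<?\r\n<?\techo 1; ?>\n<?jsp x ?>\n<?php echo 2; ?>\n<?xml version="1.0"?>\n<?= $a ?>\n' > "$tmp/sample.php"

# In a basic regular expression the ? is literal, so <? matches the tag itself.
# [:space:] excludes space, tab, CR and LF in one go; x, p and = are excluded
# by hand, as in the original expression.
matches=$(grep -n '<?[^xp=[:space:]]' "$tmp/sample.php")
echo "$matches"
rm -r "$tmp"
```

On this sample only the <?jsp line is reported; the lines where <? is followed by a carriage return, a tab, =, php or xml are skipped.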

Replace the old tags

If this expression works, I want to run a sed command and use it to replace the old opening tags.

  • <? should become <?php (with ending space)
  • <?= should become <?php echo (with ending space)

This would result in one or more commands like these, first replacing <?, then <?=.

sudo find /var/www/ -type f -name "*.php" -exec sed -i 's/<?[^xp=]/<?php /g' {} \;
sudo find /var/www/ -type f -name "*.php" -exec sed -i 's/<?=/<?php echo /g' {} \;

Questions

  1. To get the search (grep) and replace (sed) working, I need to know how to exclude all whitespace. In Vim I see a ^M character which needs to be excluded.
  2. If my approach is wrong, please let me know. All suggestions are welcome.

I just did a small Perl test here with a few files... seems to work fine. Wouldn't this do the trick for you?

shopt -s globstar # turn on **
perl -p -e 's/<\?=/<?php echo /g; s/<\?(?!php|xml)/<?php /gi' so-test/**/*.php
  • Change so-test for the folder you want to test on.
  • Add the -i.bak option before -e to edit the files in place and keep a .bak backup of each original.
  • Use -i alone (without the .bak) to edit the files in place without backups. Without -i, the result is printed to the console rather than written to the files. Good for testing!