Porting a grep function from Python to PHP

I have a Python script that I need to port to PHP. It recursively searches a given directory and builds a string based on regex searches. The first function I am trying to port is below. It takes a regex and a base directory, recursively searches all files in that directory for the regex, and returns a list of the matching strings.

import os
import re

def grep(regex, base_dir):
    matches = []
    for path, dirs, files in os.walk(base_dir):
        for filename in files:
            fullpath = os.path.join(path, filename)
            with open(fullpath, 'r') as f:
                content = f.read()
            # Append every match found in this file
            matches.extend(re.findall(regex, content))
    return matches

I never use PHP except for basic GET parameter manipulation. I grabbed some directory-walking code from the web, and I am struggling to make it work like the Python function above because I am not familiar with the PHP API.

function findFiles($dir = '.', $pattern = '/./'){
  $prefix = $dir . '/';
  $dir = dir($dir);
  while (false !== ($file = $dir->read())){
    if ($file === '.' || $file === '..') continue;
    $file = $prefix . $file;
    if (is_dir($file)) findFiles($file, $pattern);
    if (preg_match($pattern, $file)){
      echo $file . "\n";
    }
  }
}

Here is my solution:

<?php 

class FileGrep {
    private $dirs;      // Scanned directories list
    private $files;     // Found files list
    private $matches;   // Matches list

    function __construct() {
        $this->dirs = array();
        $this->files = array();
        $this->matches = array();
    }

    function findFiles($path, $recursive = TRUE) {
        $this->dirs[] = realpath($path);
        foreach (scandir($path) as $file) {
            if (($file != '.') && ($file != '..')) {
                $entry = "{$path}/{$file}";
                $fullname = realpath($entry);
                // is_link() must test the unresolved entry, because realpath() resolves symlinks
                if (is_dir($fullname) && !is_link($entry) && $recursive) {
                    if (!in_array($fullname, $this->dirs)) {
                        $this->findFiles($fullname, $recursive);
                    }
                } else if (is_file($fullname)){
                    $this->files[] = $fullname;
                }
            }
        }
        return($this->files);
    }

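    function searchFiles($pattern) {
        $this->matches = array();
        foreach ($this->files as $file) {
            $contents = file_get_contents($file);
            // Compare with false explicitly so empty files or a file containing "0" are not treated as errors
            if ($contents !== false) {
                // preg_match_all() collects every match, like Python's re.findall()
                if (preg_match_all($pattern, $contents, $found) > 0) {
                    //echo $file."\n";
                    $this->matches = array_merge($this->matches, $found[0]);
                }
            }
        }
        return($this->matches);
    }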
}


// Usage example:

$fg = new FileGrep();
$files = $fg->findFiles('.');               // List all the files in current directory and its subdirectories
$matches = $fg->searchFiles('/open/');      // Search for the "open" string in all those files

?>
<html>
    <body>
        <pre><?php print_r($matches) ?></pre>
    </body>
</html>

Be aware that:

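  • It reads each file entirely into memory to search for the pattern, so it may require a lot of memory for large files (check the "memory_limit" setting in your php.ini file).
  • It does not treat file contents as Unicode by default. For UTF-8 files, add the "u" modifier to the pattern (for example '/open/u') so that the preg functions handle the pattern and the contents as UTF-8; files in other encodings would need converting first, for example with "mb_convert_encoding". Note that "mb_ereg_match" only tests for a match at the start of the string, so it is not a drop-in replacement for "preg_match".
  • It doesn't follow symbolic links.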
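
As a rough sketch of how the memory caveat could be avoided, the function below reads files one line at a time instead of loading them whole. The name grepLines is hypothetical and not part of the class above. The sketch assumes PHP 7 or later and uses the SPL directory iterators, which do not descend into symlinked directories unless FilesystemIterator::FOLLOW_SYMLINKS is passed. Because it works line by line, it only finds matches that fit on a single line.

```php
<?php
// Hypothetical helper (an illustration, not part of the FileGrep class):
// returns every match of $pattern found in any file under $baseDir,
// reading each file one line at a time to keep memory usage low.
function grepLines(string $pattern, string $baseDir): array
{
    $matches = [];

    // Walk the directory tree; SKIP_DOTS leaves out the "." and ".." entries.
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($baseDir, FilesystemIterator::SKIP_DOTS)
    );

    foreach ($iterator as $fileInfo) {
        if (!$fileInfo->isFile() || !$fileInfo->isReadable()) {
            continue;
        }

        $handle = fopen($fileInfo->getPathname(), 'r');
        if ($handle === false) {
            continue;
        }

        // fgets() returns false at end of file, so only one line is held at a time.
        while (($line = fgets($handle)) !== false) {
            // $found[0] holds the full text of every match on this line.
            if (preg_match_all($pattern, $line, $found) > 0) {
                array_push($matches, ...$found[0]);
            }
        }
        fclose($handle);
    }

    return $matches;
}
```

It would be called like the Python original, for example $matches = grepLines('/open/', '.');. For UTF-8 files the "u" modifier can be added to the pattern, as in grepLines('/open/u', '.').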

In conclusion, this is not the most efficient solution, but it should work.