Extending DirWalker

Extending the DirWalker class to process files in the order you find them.

By Bob Ray  |  March 22, 2022  |  5 min read

In the previous article, I introduced the DirWalker class, which recursively traverses directories and lets you process or display the files found. In this article, I’ll show you how easy it is to extend DirWalker in order to process the files as they are found, which is a little faster and uses less memory.

The downside of doing it this way is that you don’t have a chance to sort the files or process them in a particular order. Sometimes you don’t care, but if you’re producing a report and would like to show the file names in some order, it’s easier to just use the DirWalker class as is and call getFileArray() when the search is finished, then sort the list before generating the output.
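Here's a minimal sketch of the "collect first, sort later" approach. It's self-contained and doesn't call DirWalker at all: the $files array is hypothetical sample data standing in for whatever getFileArray() returns, and the real return value may be shaped differently (for example, keyed by file name).

```php
<?php
/* Hypothetical sample data: a flat list of paths.
   The real getFileArray() result may be structured differently. */
$files = array(
    'core/model/modx/modx.class.php',
    'core/model/modx/modUser.class.php',
    'core/components/example/example.class.php',
);

/* Natural, case-insensitive sort so "modUser" and "modx"
   land in the order a human reader would expect */
sort($files, SORT_NATURAL | SORT_FLAG_CASE);

/* Generate the report from the sorted list */
foreach ($files as $file) {
    echo $file, "\n";
}
```

Sorting like this only works because the whole list exists before any output is generated, which is exactly what you give up when you process files as they are found.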

Getting DirWalker

DirWalker is a single class file. You can see it at GitHub.

You can also install it in MODX through Package Manager (though the class does not require MODX) or get it at the MODX Repository. If you install the package, you’ll also get several files showing examples of how to use DirWalker to produce reports containing information gleaned from the MODX codebase.

Extending the DirWalker Class

Extending a class means that you can add new class variables (properties) and new methods to the class, but still have all the methods of the parent class. The time to do this is when you don’t want to modify the parent class, but want to use it for a particular purpose.
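If you haven't extended a class before, here's a minimal sketch of the idea. The Greeter and CountingGreeter classes are made up for illustration and have nothing to do with DirWalker. The child class adds one property and two methods, and still has the parent's greet() method without any changes to the parent.

```php
<?php
/* Stand-in parent class for illustration only; not the real DirWalker */
class Greeter {
    protected $name;

    public function __construct($name) {
        $this->name = $name;
    }

    public function greet() {
        return 'Hello, ' . $this->name;
    }
}

/* The child adds a property and two methods;
   greet() is inherited unchanged from the parent */
class CountingGreeter extends Greeter {
    protected $count = 0;

    public function greetAndCount() {
        $this->count++;
        return $this->greet() . ' (' . $this->count . ')';
    }

    public function getCount() {
        return $this->count;
    }
}

$g = new CountingGreeter('MODX');
echo $g->greet(), "\n";         /* inherited method */
echo $g->greetAndCount(), "\n"; /* new method that uses the inherited one */
```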

We’ll use the simple example from the previous post, but extend the DirWalker class in order to process the files as they are found. This simple Snippet will report all class files that contain a call to $modx->invokeEvent().

The example recursively traverses the MODX core directory and all its descendants. It collects all class files that have .class in their filenames. It skips the cache and packages directories, excludes minified and aggregated files, and skips Git files and directories. The code assumes that you have installed the DirWalker package through Package Manager.

The example produces a report showing the names of all files that fire a MODX System Event. Since all files that do this call MODX’s invokeEvent() method, they’re easy to identify.

The Code

/* require_once avoids a fatal error if the class file is already loaded */
require_once MODX_CORE_PATH . 'components/dirwalker/model/dirwalker/dirwalker.class.php';

class MyDirWalker extends DirWalker {
    /* We add this class variable to hold the output */
    protected $output = '### Files containing invokeEvent';

    /* We override this method to process the files as found.
       This function is called by dirWalk() whenever it finds
       a file that fits the criteria. */
    protected function processFile($dir, $fileName) {
        /* Note that $dir is just the directory with no
           trailing slash so we have to create the full path*/
        $fullPath  = $dir . '/' . $fileName;

        /* get the file's content; skip files that can't be read */
        $content = file_get_contents($fullPath);
        if ($content === false) {
            return;
        }

        /* See if it contains a call to invokeEvent() */
        if (strpos($content, 'invokeEvent') !== false) {
            /* found one. Remove the first part of the path
               and add it to $this->output */
            $shortPath = str_replace(MODX_CORE_PATH, 'core/', $fullPath);
            $this->output .= "\n" . $shortPath;
        }
    }
    /* public function to get the output */
    public function getOutput() {
        return $this->output;
    }
    /* In case we want to run another search */
    public function resetOutput() {
        $this->output = '';
    }
}

$searchStart = MODX_CORE_PATH;
$output = '';
$dw = new MyDirWalker($modx);
$dw->setIncludes('.class');
$dw->setExcludes('-all,-min,.git');
$dw->setExcludeDirs('cache,.git,packages');
$dw->dirWalk($searchStart, true);

/* Get the output */
$output = $dw->getOutput();

/* Echo the output if running from the command line,
   otherwise, return it */

if (php_sapi_name() === 'cli') {
    echo $output;
} else {
    return $output;
}

We’ve extended the DirWalker class to modify what it does. Most of the methods are the same. The only one we’ve overridden is the processFile() method. Our processFile() method will be called internally by DirWalker’s dirWalk() method. We added a class variable $output to hold the output and a method to get its value. We could have used the existing $this->files class variable for the output and called $dw->getFiles() to get the result, but $this->files is an array and we just needed a string variable.
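To make the "called internally" part more concrete, here's a stripped-down sketch of the pattern. SimpleWalker is a hypothetical stand-in that I wrote for this illustration, not the real DirWalker code, and it uses PHP's built-in directory iterators rather than whatever DirWalker does internally. The point is only that the parent's walk method calls processFile() for every file, so overriding that one method changes what happens to each file.

```php
<?php
/* SimpleWalker is a hypothetical stand-in, NOT the real DirWalker */
class SimpleWalker {
    protected $files = array();

    public function walk($dir) {
        $iterator = new RecursiveIteratorIterator(
            new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
        );
        foreach ($iterator as $file) {
            if ($file->isFile()) {
                /* The hook: child classes can override this */
                $this->processFile($file->getPath(), $file->getFilename());
            }
        }
    }

    /* Default behavior: just remember the file */
    protected function processFile($dir, $fileName) {
        $this->files[] = $dir . '/' . $fileName;
    }

    public function getFiles() {
        return $this->files;
    }
}

/* The child overrides only processFile() and counts .php files */
class PhpCounter extends SimpleWalker {
    protected $count = 0;

    protected function processFile($dir, $fileName) {
        if (substr($fileName, -4) === '.php') {
            $this->count++;
        }
    }

    public function getCount() {
        return $this->count;
    }
}

/* Demo on a throwaway directory */
$tmp = sys_get_temp_dir() . '/walker_demo_' . uniqid();
mkdir($tmp . '/sub', 0777, true);
file_put_contents($tmp . '/a.php', 'x');
file_put_contents($tmp . '/sub/b.php', 'x');
file_put_contents($tmp . '/sub/notes.txt', 'x');

$counter = new PhpCounter();
$counter->walk($tmp);
echo $counter->getCount(), "\n";
```

Because the override never adds anything to $this->files, that array stays empty in the child. That's the memory saving: nothing is stored unless you decide to store it.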

If you run this code (wrap the output in pre tags), you’ll see that many of the event invocations are in classes in the core/model/modx/processors directory (core/src/Revolution/Processors in MODX3), but not all of them. If you have Discuss installed, you’ll see that it fires some custom events of its own. Since we didn’t exclude the core/components directory (though we could have), the code above reports any events that are invoked by installed MODX add-on components in addition to those fired by MODX itself. If we searched the manager directory, we’d see some more events.

If we wanted to exclude the core/components directory, we’d just change the setExcludeDirs line to this:

$dw->setExcludeDirs('cache,.git,packages,components');

Summing Up

The example above is very simple, but it wouldn’t be too difficult to modify it to report the actual events that are invoked, and with a little preg_match_all() action in our processFile() method, we could even report the variables that are sent in each event invocation. That would be a really useful report.
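Here's a rough sketch of what that preg_match_all() step might look like. The extractEventNames() function is a hypothetical helper of my own, not part of DirWalker, and the regular expression assumes the event name is passed as a quoted string literal. Calls that pass the event name in a variable won't be caught.

```php
<?php
/* Hypothetical helper: returns the event names passed as string
   literals to invokeEvent() in a chunk of PHP source code */
function extractEventNames($content) {
    $pattern = '/->invokeEvent\(\s*[\'"]([A-Za-z0-9_]+)[\'"]/';

    /* preg_match_all() returns 0 (no matches) or false (error) */
    if (!preg_match_all($pattern, $content, $matches)) {
        return array();
    }

    /* $matches[1] holds the captured names; drop the duplicates */
    return array_values(array_unique($matches[1]));
}

/* Sample content standing in for a file read by processFile() */
$sample = '$this->modx->invokeEvent("OnDocFormSave", $params);' . "\n"
        . '$modx->invokeEvent(\'OnSiteRefresh\');' . "\n"
        . '$modx->invokeEvent("OnDocFormSave", $other);';

print_r(extractEventNames($sample));
```

Inside processFile(), you could call a helper like this on $content and append the names under each file's path instead of listing just the path.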


Bob Ray is the author of MODX: The Official Guide and dozens of MODX Extras including QuickEmail, NewsPublisher, SiteCheck, GoRevo, Personalize, EZfaq, MyComponent and many more. His website is Bob’s Guides. It includes not only a plethora of MODX tutorials but also some really great bread recipes.