How expensive is it to construct a DOMXPath?

When writing a parser for a complex XML document, I wonder whether it is OK to construct DOMXPath instances whenever they are needed:

function parseData($d) {
    $xpath = new DOMXPath($d);
    // ...
}

function parseMoreData($d) {
    $xpath = new DOMXPath($d);
    // ...
}

$d = new DOMDocument();
$d->loadXML($xml);
parseData($d);
parseMoreData($d);

The alternative would be to create one DOMXPath instance at the beginning and then reuse it everywhere in the parser:

function parseData($d, $xpath) {
    // ...
}

function parseMoreData($d, $xpath) {
    // ...
}

$d = new DOMDocument();
$d->loadXML($xml);
$xpath = new DOMXPath($d);
parseData($d, $xpath);
parseMoreData($d, $xpath);

It is perfectly fine to create DOMXPath instances when you need them; these built-in PHP classes carry little overhead. DOMDocument and DOMXPath in particular are thin wrappers around features of libxml.

What matters more is how many (different) documents you have opened, not how many DOMXPath objects you have created.

Also, instead of passing a DOMDocument into the parse function(s):

function parseData(DOMDocument $doc) 
{
    $xpath = new DOMXPath($doc);

    // ...
}

you can pass the DOMXPath, which carries the document as well:

function parseData(DOMXPath $xpath)
{
    $doc = $xpath->document;

    // ...
}

So your alternative example with two function parameters is technically not necessary at all. Since that detail is your only concern, this last suggestion should solve your "issue".

Keep in mind that you normally only need to care about performance once you run into a real problem. Here the code simply gets better when the object to operate on is injected instead of being created inside the function. That is called dependency injection, and it is a much better way to write code: functions should ask for what they need (read: take it as a parameter) rather than create it themselves. They should concentrate on their job (here, parsing the data) instead of first instantiating a DOMXPath.
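A minimal sketch of that approach follows. The sample XML and the functions parseTitles() and countBooks() are made up for illustration and are not part of the question; only DOMDocument, DOMXPath, query() and evaluate() come from PHP itself.

```php
<?php
// Hypothetical sample document; the element names are invented for this sketch.
$xml = '<library><book><title>A</title></book><book><title>B</title></book></library>';

// Each function asks for the DOMXPath it needs instead of creating one.
function parseTitles(DOMXPath $xpath)
{
    $titles = array();
    foreach ($xpath->query('//book/title') as $node) {
        $titles[] = $node->textContent;
    }
    return $titles;
}

function countBooks(DOMXPath $xpath)
{
    // XPath count() yields a float, so cast it to an integer.
    return (int) $xpath->evaluate('count(//book)');
}

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc); // created once, then injected everywhere

$titles = parseTitles($xpath);
$bookCount = countBooks($xpath);
```

Neither function needs the DOMDocument as a separate parameter; if one did, it could still reach it through $xpath->document.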


The question of O(1) (Big O notation).

How expensive is new DOMXPath($doc)? Is it O(1)? In the comments I answered this with a yes, based on my experience and understanding.

I have now also looked at the source in lxr. When a new DOMXPath is created (the constructor in PHP's C code), it merely wraps a structure from libxml (the C library underlying the DOM extension): xmlXPathContext.

All of this code looks pretty straightforward: it only reads and sets single values (no lists of varying size or the like), so I would now say that yes, creating a DOMXPath is definitely O(1).
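One rough way to check this empirically is to time the constructor against documents of very different sizes. The sketch below is a micro-benchmark under stated assumptions, not a measurement from this answer: buildDoc() and timeConstruction() are hypothetical helpers, the node counts and iteration count are arbitrary, and absolute timings will vary with the machine and PHP version. If construction is O(1), the time per instance should stay roughly flat as the document grows.

```php
<?php
// Hypothetical helper: build a document with $n empty <item> children.
function buildDoc($n)
{
    $doc = new DOMDocument();
    $doc->loadXML('<root>' . str_repeat('<item/>', $n) . '</root>');
    return $doc;
}

// Hypothetical helper: average seconds spent per `new DOMXPath($doc)`.
function timeConstruction(DOMDocument $doc, $iterations = 10000)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $xpath = new DOMXPath($doc);
    }
    return (microtime(true) - $start) / $iterations;
}

foreach (array(10, 1000, 100000) as $n) {
    $doc = buildDoc($n);
    printf("%7d nodes: %.3f microseconds per DOMXPath\n", $n, timeConstruction($doc) * 1e6);
}
```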

Your alternative is of course more efficient because it does not build a DOMXPath more than once. But keep in mind that the only object you need here is the DOMXPath. As you can see, the DOMXPath constructor depends only on the DOMDocument instance you give it. So if your functions take only a DOMXPath as a parameter, the result is the same.

You can access the DOMXPath's document through $xpath->document.

Beyond that, it is up to you: the gain is not very significant, only 252 bytes.

As for time, it only becomes significant with big files, where parse time outweighs any other processing and is paid every time you create a DOMXPath. One solution is to use a factory pattern:
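If you want to check the memory cost on your own setup, a sketch like the following gives a rough figure. It is an illustration, not the measurement behind the number above: the sample document and the instance count are arbitrary, and memory_get_usage() reports memory from PHP's own allocator, so anything libxml allocates internally may not be counted.

```php
<?php
$doc = new DOMDocument();
$doc->loadXML('<root><item/></root>');

$count = 1000;
// Pre-allocate the slots so that array growth is not part of the measurement.
$instances = array_fill(0, $count, null);

$before = memory_get_usage();
for ($i = 0; $i < $count; $i++) {
    $instances[$i] = new DOMXPath($doc);
}
$after = memory_get_usage();

$bytesPerInstance = ($after - $before) / $count;
printf("about %d bytes per DOMXPath\n", $bytesPerInstance);
```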

class XPathFactory
{
    // Cache of DOMXPath instances, keyed by document identity, prefix and namespace URI.
    private static $instances = array();

    public static function getXPath(DOMDocument $doc, $prefix, $namespaceUri)
    {
        $key = spl_object_hash($doc) . '|' . $prefix . '|' . $namespaceUri;
        if (!isset(self::$instances[$key])) {
            $xpath = new DOMXPath($doc);
            // registerNamespace() needs both the prefix and the namespace URI.
            $xpath->registerNamespace($prefix, $namespaceUri);
            self::$instances[$key] = $xpath;
        }
        return self::$instances[$key];
    }
}

Then in your functions you just have to call:

$xpath = XPathFactory::getXPath($doc, $prefix, $namespaceUri);

and you will get the right DOMXPath without creating more instances than necessary.