I have an XML structure in which certain elements have been marked by attributes, like so:
<a>
<b1>
<c1 special="yes" />
</b1>
<b2>
<c2 />
</b2>
</a>
I would like to locate the paths (or "breadcrumbs") for all elements matched by the attributes. In the above example:
//*[@special="yes"]
Result:
/a/b1/c1
I don't care about the values at all, just the list of paths to all "special" elements would suffice.
Edit: I forgot to mention that I am looking for a solution in PHP, as there is probably no solution provided by XPath mechanisms alone.
Thanks.
You can use the ancestor-axis to fetch this path.
The expression below returns the path for the current element. More on this solution in my answer to a similar question where XPath 2.0 was fine. If you prepend //*[@special="yes"]/ to it, it will return the paths for all "special" elements.
string-join(
(
'',
(
.//ancestor-or-self::*/name(),
concat("@", .//ancestor-or-self::attribute()/name())
)
),
'/'
)
You can remove all newlines if you prefer, but it's easier to understand when nicely wrapped.
Sadly, PHP does not support XPath 2.0 out of the box, so you will have to do the looping and concatenation in PHP, but you can still use the ancestor axis.
Building upon @Rolando Isidoro's solution, this makes the "main" loop of his code both more elegant and more efficient (although the improvement is minor and probably only noticeable in very large documents with a very deep structure):
foreach ($nodes as $node) {
    $breadcrumbs[$nodeCount] = array();
    // Returns all nodes on ancestor path in document order
    foreach ($node->xpath('ancestor-or-self::*') as $axisStep) {
        // So all we need to do is append the name at the end of the array
        $breadcrumbs[$nodeCount][] = $axisStep->getName();
    }
    $nodeCount++;
}
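The fragment above relies on $nodes, $breadcrumbs and $nodeCount being set up elsewhere. As a self-contained illustration, here is a minimal sketch of the same idea wrapped in a function. The helper name breadcrumbPaths and the leading slash in the result are my own assumptions, chosen to match the /a/b1/c1 form asked for in the question; they are not part of the original code.

```php
<?php
// Hypothetical helper (name is an assumption): returns one "/a/b/c"-style
// path per element matched by $expr.
function breadcrumbPaths(SimpleXMLElement $root, string $expr): array
{
    $paths = [];
    // xpath() may return false or null on failure, so fall back to an empty list
    foreach ($root->xpath($expr) ?: [] as $node) {
        $names = [];
        // ancestor-or-self::* yields the root element first and $node last
        foreach ($node->xpath('ancestor-or-self::*') as $step) {
            $names[] = $step->getName();
        }
        $paths[] = '/' . implode('/', $names);
    }
    return $paths;
}

$xml = simplexml_load_string(
    '<a><b1><c1 special="yes"/></b1><b2><c2/><c3 special="yes"/></b2></a>'
);
print_r(breadcrumbPaths($xml, '//*[@special="yes"]'));
```

Building the name list per match and joining it once avoids the separate counter and the second loop, at the cost of not keeping the intermediate arrays around.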
I wrote this quick snippet, adding some nodes to your DOM example, to show a solution that handles paths to multiple "special" elements, as you mentioned.
<?php
$breadcrumbs = array();
$paths = array();
$dom = <<<DOM
<a>
<b1>
<c1 special="yes" />
</b1>
<b2>
<c2 />
<c3 special="yes" />
</b2>
<b3 />
<b4>
<c1 />
<c2 />
<c3 />
<c4 special="yes" />
</b4>
</a>
DOM;
$sxe = new SimpleXMLElement($dom);
$nodes = $sxe->xpath('//*[@special="yes"]');
$nodeCount = 0;
foreach ($nodes as $node) {
    $breadcrumbs[$nodeCount] = array($node->getName());
    while ($node = $node->xpath("parent::*")) {
        if (!empty($node[0])) {
            $node = $node[0];
            array_unshift($breadcrumbs[$nodeCount], $node->getName());
        } else {
            break;
        }
    }
    $nodeCount++;
}
foreach ($breadcrumbs as $breadcrumb) {
    $paths[] = join('/', $breadcrumb);
}
print_r($paths);
Array
(
[0] => a/b1/c1
[1] => a/b2/c3
[2] => a/b4/c4
)
Final note: Depending on what you want to do with the paths, a simpler solution might be possible.
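One candidate for such a simpler solution, if switching from SimpleXML to the DOM extension is acceptable (that switch is my assumption, not something the code above does), is DOMNode::getNodePath(), which asks libxml for a location path to a node. A minimal sketch:

```php
<?php
$dom = new DOMDocument();
$dom->loadXML(
    '<a><b1><c1 special="yes"/></b1><b2><c2/><c3 special="yes"/></b2></a>'
);

$xpath = new DOMXPath($dom);
$paths = [];
foreach ($xpath->query('//*[@special="yes"]') as $element) {
    // getNodePath() returns an XPath location path for this node
    $paths[] = $element->getNodePath();
}
print_r($paths);
```

Be aware that when an element has siblings with the same name, the returned path carries a positional predicate (something like /a/b[2]/c), so it is not always a plain list of names. If you need names only, the loop-based approaches are the safer choice.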
In XPath you can use the ancestor-or-self axis to select the current node and all its ancestors at once. For example, the following XPath query
//c1[@special='yes']/ancestor-or-self::node()
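returns a node list containing a, b1 and c1, plus the document root node, because node() matches it as well. Use ancestor-or-self::* instead if you want elements only.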
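To see the difference between the node() and * node tests on this axis, here is a small sketch using DOMXPath (my choice for the illustration, since the DOM extension can represent the document node; the variable names are made up):

```php
<?php
$dom = new DOMDocument();
$dom->loadXML('<a><b1><c1 special="yes"/></b1><b2><c2/></b2></a>');
$xpath = new DOMXPath($dom);

// Collect the nodeName of every node in a result list
$names = function (DOMNodeList $list): array {
    $out = [];
    foreach ($list as $node) {
        $out[] = $node->nodeName;
    }
    return $out;
};

// node() matches any node kind on the axis, including the document node
$withNodeTest = $names($xpath->query("//c1[@special='yes']/ancestor-or-self::node()"));
// * matches elements only
$elementsOnly = $names($xpath->query("//c1[@special='yes']/ancestor-or-self::*"));

print_r($withNodeTest);
print_r($elementsOnly);
```

The first list should contain an extra "#document" entry, which is usually not wanted in a breadcrumb, so the element-only form is the one to build paths from.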
You are probably looking for the ancestor-or-self XPath axis, which lets you get all ancestors of an element, including the element itself. First, select the endpoint of your breadcrumb (the page or document it points to):
$document = $xml->xpath('//*[@special="yes"]')[0]; # <c1 special="yes"/>
You can then get its breadcrumb with that XPath axis:
$parents = $document->xpath('ancestor-or-self::*'); # a > b1 > c1
A full usage example:
<?php
/**
* Get “breadcrumbs” for elements matched by an Xpath expression (in PHP)
* @link http://stackoverflow.com/a/16749372/367456
*/
$buffer = <<<BUFFER
<a>
<b1>
<c1 special="yes" />
</b1>
<b2>
<c2 />
</b2>
</a>
BUFFER;
$xml = simplexml_load_string($buffer);
$document = $xml->xpath('//*[@special="yes"]')[0];
echo $document->asXML(), "\n";
$parents = $document->xpath('ancestor-or-self::*');
$getName = function(SimpleXMLElement $element) {
    return $element->getName();
};
echo implode(' > ', array_map($getName, $parents)), "\n";
Output:
<c1 special="yes"/>
a > b1 > c1