I have the following code:
<?php
$doc = new DOMDocument;
$doc->loadHTML('<html>
<head>
<title>bar , this is an example</title>
</head>
<body>
<h1>latest news</h1>
foo <strong>bar</strong>
<i>foobar</i>
</body>
</html>');
$xpath = new DOMXPath($doc);
foreach($xpath->query('//*[contains(child::text(),"bar")]') as $e) {
echo $e->tagName, "\n";
}
It prints:
title
strong
i
This code finds any HTML element whose text contains "bar", but it also matches words that merely include "bar", such as "foobar". I want to change the query so that it matches only the whole word "bar", without any prefix or suffix.
I think this could be solved by changing the query to look for every "bar" that has no letter before or after it, or that has a space before or after it.
This code comes from an answer by VolkerK to a past question here.
Thanks.
To match elements whose text is exactly "bar", you can use the following XPath query:
$xpath->query("//*[text()='bar']");
or
$xpath->query("//*[.='bar']");
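As a quick check, here is a minimal sketch that runs both exact-match queries against the sample document from the question. The helper name tagsMatching is hypothetical and only exists to keep the example short; it is not part of the DOM extension.

```php
<?php
// Sample document from the question.
$html = '<html>
<head>
<title>bar , this is an example</title>
</head>
<body>
<h1>latest news</h1>
foo <strong>bar</strong>
<i>foobar</i>
</body>
</html>';

// Hypothetical helper: returns the tag names of all elements matched by $query.
function tagsMatching(string $html, string $query): array
{
    $doc = new DOMDocument;
    $doc->loadHTML($html);
    $tags = [];
    foreach ((new DOMXPath($doc))->query($query) as $e) {
        $tags[] = $e->tagName;
    }
    return $tags;
}

// Both queries only accept text that is exactly "bar".
echo implode(', ', tagsMatching($html, "//*[text()='bar']")), "\n"; // strong
echo implode(', ', tagsMatching($html, "//*[.='bar']")), "\n";      // strong
```

Neither query matches the title, because its text is longer than just "bar".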
Note that using "//" will slow things down, and more so the bigger your XML file is.
If you are looking for "bar" as a whole word inside longer text with XPath 1.0, you'll have to use a combination of functions, because there are no regular expressions in XPath 1.0.
$xpath->query("//*[
. = 'bar' or
starts-with(., 'bar ') or
contains(., ' bar ') or
(' bar' = substring(., string-length(.) - string-length(' bar') + 1))
]");
Basically, this locates strings that are exactly 'bar', start with 'bar ' (notice the trailing space), contain ' bar ' (notice the spaces before and after), or end with ' bar' (notice the leading space). Without those spaces, the starts-with and ends-with tests would also match "barfoo" and "foobar". Note that ends-with is an XPath 2.0 function, so I substituted code that emulates it, taken from a previous Stack Overflow answer.
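A more compact variant, offered as a sketch rather than the definitive query: pad the text with a space on each side so that the separate start, middle and end cases collapse into a single contains(). This version assumes you only want to test an element's own text nodes, as the child::text() in the question does; with "." an ancestor can match too, because its string value includes the text of all its descendants. The helper name wholeWordTags is hypothetical.

```php
<?php
// Sample document from the question.
$html = '<html>
<head>
<title>bar , this is an example</title>
</head>
<body>
<h1>latest news</h1>
foo <strong>bar</strong>
<i>foobar</i>
</body>
</html>';

// Hypothetical helper: tag names of elements that have a text node
// containing "bar" as a whitespace-delimited word.
function wholeWordTags(string $html): array
{
    $doc = new DOMDocument;
    $doc->loadHTML($html);
    $xpath = new DOMXPath($doc);

    // normalize-space() trims and collapses whitespace (including newlines);
    // concat() then adds one space on each side, so a word at the very start
    // or end of the text is also surrounded by spaces.
    $query = "//*[text()[contains(concat(' ', normalize-space(.), ' '), ' bar ')]]";

    $tags = [];
    foreach ($xpath->query($query) as $e) {
        $tags[] = $e->tagName;
    }
    return $tags;
}

echo implode("\n", wholeWordTags($html)), "\n";
// title
// strong
```

The i element with "foobar" is no longer reported, because after padding its text is ' foobar ', which does not contain ' bar '.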
If contains(., ' bar ') is not enough, because other punctuation may follow 'bar', as in "one bar, over" or "This bar. That bar.", you could try this contains instead:
contains(translate(., '.,[]', '    '), ' bar ') or
That translates each of the characters '.,[]' to a space. The third argument needs one space per character, because translate removes any character that has no counterpart there. So "one bar, over" becomes "one bar  over", which matches ' bar ' as expected.
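The punctuation handling can be tried out on plain strings. The sketch below is an illustration under a few assumptions: the function name hasWordBar is hypothetical, the word is hard-coded, and only the four characters '.,[]' count as punctuation. It uses one space per punctuation character in translate(), since XPath 1.0 deletes any character that has no counterpart in the third argument, and it pads the text with spaces so a leading or trailing "bar" is found as well.

```php
<?php
// Hypothetical helper: does $text contain "bar" as a whole word,
// treating '.', ',', '[' and ']' like spaces?
function hasWordBar(string $text): bool
{
    // Wrap the text in a one-element document so XPath can inspect it.
    $doc = new DOMDocument;
    $node = $doc->createElement('t');
    $node->appendChild($doc->createTextNode($text));
    $doc->appendChild($node);

    $xpath = new DOMXPath($doc);

    // translate(): four punctuation characters map to four spaces.
    // concat(): pads both ends so the word may sit at the start or end.
    $expr = "boolean(/t[contains("
          . "concat(' ', translate(., '.,[]', '    '), ' '), ' bar ')])";

    // evaluate() returns a PHP bool for a boolean() expression.
    return $xpath->evaluate($expr);
}

var_dump(hasWordBar('one bar, over'));       // bool(true)
var_dump(hasWordBar('This bar. That bar.')); // bool(true)
var_dump(hasWordBar('foobar'));              // bool(false)
```

Other separators (for example ';' or '!') would have to be added to the second argument of translate(), with a matching extra space in the third.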