I have tried many different combinations to get this to work; the results vary from "no results" to various errors. What I am trying to do is find all links on a web page whose href contains a given fragment of a word or number. For example, this works:
$nodes = $xpath->query('//a[contains(@href, \'sweet\')]/@href');
It finds all hrefs that contain "sweet" anywhere in the href. The problem is that it's case sensitive, and almost all of the URLs contain PHP query strings with usernames that can mix upper and lower case. This is one of my many failed attempts at making the query case insensitive:
$nodes = $xpath->query('//a[contains(translate(\'ABCDEFGHIJKLMNOPQRSTUVWXYZ\',\'abcdefghijklmnopqrstuvwxyz\'),\'@href\', \'sweet\')]/@href');
I think I am on the right track but have the syntax wrong?
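In your attempt, @href is quoted (which makes it a string literal) and is passed to contains() rather than to translate(). translate() takes three arguments: the string to convert, the characters to replace, and their replacements. Please try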
$nodes = $xpath->query('//a[contains(translate(@href,
\'ABCDEFGHIJKLMNOPQRSTUVWXYZ\',
\'abcdefghijklmnopqrstuvwxyz\'
),
\'sweet\'
)
]/@href');
instead.
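To see the corrected expression at work, here is a minimal, self-contained sketch. The sample markup, the usernames, and the variable names are invented for illustration; only the XPath expression comes from the answer above.

```php
<?php
// Hypothetical sample markup; the hrefs and usernames are made up for this demo.
$html = <<<HTML
<html><body>
  <a href="profile.php?user=SweetTooth">one</a>
  <a href="profile.php?user=sweetpea">two</a>
  <a href="profile.php?user=bitter">three</a>
  <a href="profile.php?user=xSWEETx">four</a>
</body></html>
HTML;

libxml_use_internal_errors(true); // keep HTML parser warnings out of the output

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// translate() maps each uppercase ASCII letter in @href to its lowercase
// counterpart, so contains() is tested against a lowercased copy of the value.
$query = "//a[contains(translate(@href, "
       . "'ABCDEFGHIJKLMNOPQRSTUVWXYZ', "
       . "'abcdefghijklmnopqrstuvwxyz'), 'sweet')]/@href";

$hrefs = [];
foreach ($xpath->query($query) as $attr) {
    $hrefs[] = $attr->nodeValue; // the original casing is returned, not the lowercased copy
}

print_r($hrefs);
```

Note that the search term itself ('sweet') has to be written in lowercase, and that this only folds the 26 ASCII letters listed in the two alphabets.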
Using fn:contains with fn:translate is the wrong approach, as evidenced by how complicated it makes this simple task.
If you have an XPath 2.0 processor (note that PHP's built-in DOMXPath supports only XPath 1.0), you could instead use fn:matches, for example:
$nodes = $xpath->query("//a[matches(@href, 'sweet', 'i')]/@href");
Note that the second argument to fn:matches is a regular expression and the third is a set of flags that control how it is evaluated; here we have specified i, which makes the comparison case-insensitive. Arguably your query could also be simplified to:
$nodes = $xpath->query("//a/@href[matches(., 'sweet', 'i')]");
If you are stuck on XPath 1.0, you could use an or expression with two fn:contains expressions, although this matches only the exact casings you list (it would miss "Sweet"), for example:
$nodes = $xpath->query("//a/@href[contains(., 'sweet') or contains(., 'SWEET')]");
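As a rough check of how far the or expression goes, the sketch below runs it next to a translate()-based query on three invented hrefs. The helper name collectHrefs and the sample values are assumptions made for this example only.

```php
<?php
libxml_use_internal_errors(true); // keep HTML parser warnings out of the output

// Hypothetical sample links covering lowercase, uppercase, and mixed case.
$doc = new DOMDocument();
$doc->loadHTML(
    '<a href="?user=sweet1">a</a>'
    . '<a href="?user=SWEET2">b</a>'
    . '<a href="?user=Sweet3">c</a>'
);
$xpath = new DOMXPath($doc);

// Illustrative helper (not part of any library): run a query and
// return the string values of the matched attribute nodes.
function collectHrefs(DOMXPath $xpath, string $query): array
{
    $values = [];
    foreach ($xpath->query($query) as $node) {
        $values[] = $node->nodeValue;
    }
    return $values;
}

// Two explicit casings joined with "or".
$orMatches = collectHrefs(
    $xpath,
    "//a/@href[contains(., 'sweet') or contains(., 'SWEET')]"
);

// Lowercase the attribute value first, then test once.
$translateMatches = collectHrefs(
    $xpath,
    "//a/@href[contains(translate(., 'ABCDEFGHIJKLMNOPQRSTUVWXYZ', "
    . "'abcdefghijklmnopqrstuvwxyz'), 'sweet')]"
);

print_r($orMatches);
print_r($translateMatches);
```

With this sample data the or query finds the first two links only, while the translate() query also finds the mixed-case "Sweet3".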
Also, XPath accepts either single or double quotes around string literals. To make your code more readable I have wrapped the PHP string in double quotes and used single quotes inside the XPath expression, so that nothing needs to be escaped.