In MySQL, I have a WHERE clause that uses LIKE to match a keyword to a string.
WHERE title LIKE CONCAT('%',?,'%')
This almost works, but I am running into a problem where titles with multiple words return erroneous matches. For example, the keyword "press" matches the title "depression sucks", which should not happen.
What is the best way to ensure that the keyword matches only if it is at the start of a word?
Thanks (in advance) for your help
Word boundary markers [[:<:]] and [[:>:]] are your friend.
mysql> SELECT 'a word a' REGEXP '[[:<:]]word[[:>:]]'; -> 1
mysql> SELECT 'a xword a' REGEXP '[[:<:]]word[[:>:]]'; -> 0
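The two examples above match "word" only as a whole word. For the case in the question (keyword at the start of a word only), a minimal sketch would keep just the leading marker and build the pattern from the placeholder. It assumes the keyword contains no regex metacharacters, and the table name `movies` is hypothetical. Note that [[:<:]] and [[:>:]] apply to MySQL versions before 8.0.4; from 8.0.4 on, the ICU-based regex engine uses \b instead.

```sql
-- Quick checks with the keyword 'press' (MySQL before 8.0.4).
SELECT 'depression sucks' REGEXP CONCAT('[[:<:]]', 'press');  -- 0: 'press' is inside a word
SELECT 'press release'    REGEXP CONCAT('[[:<:]]', 'press');  -- 1: starts the title
SELECT 'under pressure'   REGEXP CONCAT('[[:<:]]', 'press');  -- 1: starts the second word

-- The same idea as a prepared-statement query; `movies` is a hypothetical table.
SELECT title FROM movies WHERE title REGEXP CONCAT('[[:<:]]', ?);

-- MySQL 8.0.4 and later: \b is the word boundary, written '\\b' in a string literal.
SELECT title FROM movies WHERE title REGEXP CONCAT('\\b', ?);
```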
WHERE title LIKE CONCAT(?, '%')
The % is the wildcard character, so with one in front and one in back the pattern matches any title that contains the keyword. If you remove the first %, it matches only titles that start with the keyword.
I'm not familiar with MySQL, but I have used REGEXP_LIKE in Oracle, and MySQL seems to support something similar:
http://dev.mysql.com/doc/refman/5.1/en/regexp.html
Then just write the appropriate regex.
WHERE title LIKE CONCAT('% ', ?, '%') OR title LIKE CONCAT(?, '%')
This way, the query will find the keyword if it is at the beginning of a word or at the beginning of the title. So, if the keyword is 'cat', it will find the movies:
catwoman
dogs and cats
but not a movie like:
concatenation
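As a rough check of the two-pattern approach, the sketch below runs the same condition over a few literal titles. The user variable @kw stands in for the placeholder, and the last title is an added example (not from the answer) to show one limitation: only a space counts as a word separator here, so a keyword that follows a hyphen or other punctuation is not found.

```sql
SET @kw = 'cat';

SELECT t.title,
       (t.title LIKE CONCAT('% ', @kw, '%')      -- keyword starts a later word
        OR t.title LIKE CONCAT(@kw, '%')) AS matched  -- keyword starts the title
FROM (
    SELECT 'catwoman' AS title
    UNION ALL SELECT 'dogs and cats'
    UNION ALL SELECT 'concatenation'
    UNION ALL SELECT 'the black-cat bar'
) AS t;
-- catwoman          -> 1
-- dogs and cats     -> 1
-- concatenation     -> 0
-- the black-cat bar -> 0 (hyphen, not a space, precedes 'cat')
```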
I think you're looking for MATCH rather than LIKE. With proper indexing, MATCH is also much faster.
Here are the basics: http://dev.mysql.com/doc/refman/5.1/en/fulltext-search.html
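A minimal sketch of what that could look like, assuming a hypothetical table named `movies` and an index name `ft_title` chosen for illustration. In MySQL 5.1, FULLTEXT indexes require a MyISAM table (InnoDB gained them in 5.6), and words shorter than the minimum indexed length (ft_min_word_len, 4 by default) as well as stopwords are not indexed, so short keywords may return nothing.

```sql
-- Hypothetical table `movies`; the column `title` comes from the question.
ALTER TABLE movies ADD FULLTEXT INDEX ft_title (title);

-- Boolean mode: the trailing * turns 'press' into a prefix, so it matches
-- words that start with it ('press', 'pressure') but not 'depression'.
SELECT title
FROM movies
WHERE MATCH(title) AGAINST ('press*' IN BOOLEAN MODE);
```

With a bound parameter, the * would have to be appended to the keyword before it reaches AGAINST, for example in application code.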