I have a search where visitors can search for an article's title. It's working great until they try to search for an utf-8 character.
Here's the problem: You search for "új" and you get results from title which contains "új", but not "Új".
In the database I have the uft-8 characters stored as html entities, so "új" looks like: "ú ;j" and "Új" looks like "Ú ;".
I have a function, which converts these utf8 characters to html entities. E.g.:
$text = str_replace("Ú","Ú",$text);
$text = str_replace("ú","ú",$text);
What I want to do is, if someone searches for ú also show results for Ú, but because they are different characters in the database (Ú ; and ú ;), therefore strtolower (or mb_strtolower) doesn't work, it only works for normal characters like E and e or U and u.
What should I do? Would it be better to store the utf-8 characters as utf-8 in the database?
Since you want this data for search optimization purposes what I would do is:
Add a slug field to the database, that contains the tittle as a lowercase non UTF8.
When searching conver the user input query to the same slug using that library
For more information you could take a look here you will need to write a migration script to convert all the already existing titles to slug strings and add that colum to the database, but that should be more or less trivial.