In Arabic, some characters are commonly written in more than one way. For example, the character ا
can be written as any of these: (أ,إ,ا)
and the character ه
can be written as either of these: (ه, ة)
and so on.
To clarify: if the user searches for the keyword ايمان
I want to return all results matching ايمان or أيمان or إيمان
, and if they search for إيمان
I want to return the same three words, and if they search for أيمان
I also want to return the same three words.
In other words, if the user's search contains any one of (أ,إ,ا),
I want to return all words that contain any of them in that position.
I found the answer:
First, I replace all similar characters in the search keyword with a single character in PHP, like this:
$keyword = str_replace('أ', 'ا', $keyword);
$keyword = str_replace('إ', 'ا', $keyword);
$keyword = str_replace('ى', 'ي', $keyword);
$keyword = str_replace('ة', 'ه', $keyword);
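The four replacements above can also be collected into one helper, which keeps the mapping in a single place. This is only a sketch: the function name normalizeArabic is hypothetical and not part of the original code, and the mapping covers just the four letters listed above.

```php
<?php
// Hypothetical helper (the name is an assumption, not from the original post).
// Maps each variant letter to the base letter used for comparison.
function normalizeArabic(string $text): string
{
    // strtr with an array replaces every key with its value in one pass.
    return strtr($text, [
        'أ' => 'ا',
        'إ' => 'ا',
        'ى' => 'ي',
        'ة' => 'ه',
    ]);
}

// All three spellings of the example word normalize to the same string.
$keyword = normalizeArabic('إيمان');
```

With this, $keyword holds ايمان whichever of the three spellings the user typed.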
Then I apply the same replacements to the column values stored in the database, inside the WHERE clause, like this:
// The keyword is passed as a bound parameter (one per LIKE) instead of being
// concatenated into the SQL string, which avoids SQL injection.
->whereRaw(" (REPLACE(REPLACE(REPLACE(REPLACE(page_title_ar, 'ة', 'ه'), 'أ', 'ا'), 'إ', 'ا'), 'ى', 'ي') like ? OR "
. "REPLACE(REPLACE(REPLACE(REPLACE(page_summery_ar, 'ة', 'ه'), 'أ', 'ا'), 'إ', 'ا'), 'ى', 'ي') like ? OR "
. "REPLACE(REPLACE(REPLACE(REPLACE(page_content_ar, 'ة', 'ه'), 'أ', 'ا'), 'إ', 'ا'), 'ى', 'ي') like ? OR "
. "REPLACE(REPLACE(REPLACE(REPLACE(page_title_en, 'ة', 'ه'), 'أ', 'ا'), 'إ', 'ا'), 'ى', 'ي') like ? OR "
. "REPLACE(REPLACE(REPLACE(REPLACE(page_summery_en, 'ة', 'ه'), 'أ', 'ا'), 'إ', 'ا'), 'ى', 'ي') like ? OR "
. "REPLACE(REPLACE(REPLACE(REPLACE(page_content_en, 'ة', 'ه'), 'أ', 'ا'), 'إ', 'ا'), 'ى', 'ي') like ?"
. " ) and deleted <> 1", array_fill(0, 6, '%' . $keyword . '%'))
and that solved the problem.
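Since the same nested REPLACE expression is repeated for every column, it could be generated rather than typed out. The following is a minimal sketch, assuming the column names are trusted identifiers from your own code (never user input); the function name normalizedColumnSql and the $columns list are hypothetical.

```php
<?php
// Hypothetical helper (not from the original post): wraps a column name in
// the same four nested REPLACE calls used in the query above.
function normalizedColumnSql(string $column): string
{
    $expr = $column;
    // Applied in the same order as the hand-written SQL: innermost first.
    foreach (['ة' => 'ه', 'أ' => 'ا', 'إ' => 'ا', 'ى' => 'ي'] as $from => $to) {
        $expr = "REPLACE($expr, '$from', '$to')";
    }
    return $expr;
}

// Assumed column list, taken from the query above.
$columns = ['page_title_ar', 'page_summery_ar', 'page_content_ar'];

// One "<normalized column> like ?" condition per column, joined with OR.
$conditions = [];
foreach ($columns as $column) {
    $conditions[] = normalizedColumnSql($column) . ' like ?';
}
$where = '(' . implode(' OR ', $conditions) . ')';
```

The resulting $where string contains one ? placeholder per column, so the normalized keyword would need to be supplied once for each of them as a bound parameter.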
Try combining the conditions with OR.
For example: SELECT * FROM TABLE WHERE col LIKE 'app%' OR col LIKE 'cpp%' OR ....
or: SELECT * FROM TABLE WHERE col LIKE '%app%' OR col LIKE '%cpp%' OR ....
This post (MySQL diacritic insensitive search (Arabic)) covers searching Arabic-language text in a diacritic-insensitive way. It seems that when you use the utf8_unicode_ci collation, بسم and بِسْمِ are considered equal. But it is not so for the three words in your example.
I'm ignorant of Arabic, I am sorry. Is it possible this is a bug in the collation? Is it possible another collation is required for Arabic in your situation? In the meantime, you could compare substrings of your words if need be.