I have the following code, which replaces all of å, ø, æ, etc. with just _.
$string = strtolower($string);
$regexp = '/( |å|ø|æ|Å|Ø|Æ|Ã¥|Ã¸|Ã¦|Ã…|Ã˜|Ã†)/iU';
$replace_char = '_';
$data = preg_replace($regexp, $replace_char, $string);
Now I want to change them according to the following rules.
Replace:
space to _
å, Å, Ã¥ and Ã… to a,
ø, Ø, Ã¸ and Ã˜ to o,
æ, Æ, Ã¦ and Ã† to e.
Can I use str_replace with arrays to do this? If so, how?
Or do I have to repeat the same regex three times?
Could anyone suggest a better way to write this code?
--EDIT--
Please ignore the encoding for the moment. I am NOT asking for advice about encoding here.
I asked about the encoding problem separately, in "Norwegian characters problem".
I would use strtr, to which you can pass a mapping:
$mapping = array(
'å' => 'a', 'Å' => 'a', 'Ã¥' => 'a', 'Ã…' => 'a',
'ø' => 'o', 'Ø' => 'o', 'Ã¸' => 'o', 'Ã˜' => 'o',
'æ' => 'e', 'Æ' => 'e', 'Ã¦' => 'e', 'Ã†' => 'e'
);
$str = strtr($str, $mapping);
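As a rough, runnable sketch of the same idea, the space rule from the question can go into the same mapping. The function name slugify_norwegian is made up for illustration, and the snippet assumes the file is saved as UTF-8 and the input is valid UTF-8 (so the mis-encoded keys are left out here).

```php
<?php
// Hypothetical helper: maps space and the Norwegian letters in one strtr() call.
function slugify_norwegian(string $str): string
{
    $mapping = array(
        ' ' => '_',
        'å' => 'a', 'Å' => 'a',
        'ø' => 'o', 'Ø' => 'o',
        'æ' => 'e', 'Æ' => 'e',
    );
    // With an array argument, strtr() tries the longest keys first and
    // does not re-scan text it has already replaced.
    $str = strtr($str, $mapping);
    // Lowercase afterwards; this only affects the remaining ASCII letters.
    return strtolower($str);
}

echo slugify_norwegian('Blåbær Øl'), "\n"; // blaber_ol
```

Lowercasing after the mapping, rather than before, keeps the multi-byte keys intact while they are being matched.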
But you should fix your encoding issue first, because then you could use transliteration with iconv:
$str = iconv("UTF-8", "ISO-8859-1//TRANSLIT", $str);
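One caveat worth keeping in mind: å, ø and æ all exist in ISO-8859-1, so with that target they are converted to single bytes rather than to plain a/o/e. A hedged sketch of a variant that targets ASCII instead is shown below. The sample string is an assumption, and what //TRANSLIT substitutes depends on the iconv implementation and locale, so no expected output is given.

```php
<?php
$str = 'Blåbær Øl'; // sample input; assumes this file is saved as UTF-8

// iconv() returns false on failure, so check before using the result.
$ascii = iconv('UTF-8', 'ASCII//TRANSLIT', $str);
if ($ascii === false) {
    $ascii = $str; // fall back to the untouched input
}

// Apply the space rule and lowercase the (now ASCII) result.
$slug = strtolower(str_replace(' ', '_', $ascii));
echo $slug, "\n";
```

If the exact replacement letters matter (for example æ to e rather than ae), an explicit mapping is more predictable than transliteration.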
Like Gumbo said, you have some trouble with encoding. Leaving that fix to you, the general idea would be:
$data = preg_replace('/[ åøæÅØÆ]/iu', '_', mb_strtolower($string, 'utf-8'));
Note the mb_ variant of strtolower, in case you want to work with Unicode.
Edit: And stakx's suggestion also makes sense, but it changes the logic.
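If each character needs its own replacement rather than a single _, one way to keep it to one regex pass is preg_replace_callback with a lookup table. This is only a sketch under assumptions: the $map array and the sample input are made up for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Illustrative lookup table: matched character => replacement.
$map = array(
    ' ' => '_',
    'å' => 'a', 'Å' => 'a',
    'ø' => 'o', 'Ø' => 'o',
    'æ' => 'e', 'Æ' => 'e',
);

$string = 'Rød Grøt';

// The /u modifier makes the character class match whole UTF-8 characters.
$data = preg_replace_callback(
    '/[ åøæÅØÆ]/u',
    function ($m) use ($map) {
        return $map[$m[0]];
    },
    $string
);
$data = strtolower($data);

echo $data, "\n"; // rod_grot
```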
An alternative solution using mappings is str_replace. I used a minimal set of your mappings as an example. Each value of $search maps to the value at the corresponding index in $replace.
$search = array(' ', 'å', 'ø', 'æ', 'Å', 'Ø','Æ','Ã¥');
$replace = array('_', 'a', 'o', 'e', 'a', 'o', 'e', 'a');
$string = str_replace($search, $replace, mb_strtolower($string, 'utf-8'));
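For a quick check of the array form, here is a small self-contained sketch. The sample input is made up and the file is assumed to be saved as UTF-8. It lowercases after replacing, because lowercasing first would also alter mis-encoded sequences such as Ã¥ (the Ã becomes ã), and those search entries would then never match.

```php
<?php
$search  = array(' ', 'å', 'ø', 'æ', 'Å', 'Ø', 'Æ');
$replace = array('_', 'a', 'o', 'e', 'a', 'o', 'e');

$string = 'Æble På Øen'; // sample input for illustration

// str_replace() applies the pairs in order: $search[0] => $replace[0], and so on.
$result = strtolower(str_replace($search, $replace, $string));

echo $result, "\n"; // eble_pa_oen
```

Because the pairs are applied one after another, a later pair can touch text produced by an earlier one. That is harmless with this mapping, but worth keeping in mind if the replacements ever contain characters that also appear in $search.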