I am trying to convert standard ASCII letters to their full-width Japanese equivalents. For example:
Game
becomes Ｇａｍｅ
I searched for an answer and I found this question with a good answer that I've quoted below:
$str = "Ｇａｍｅ some other text by ヴィックサ";
$str = preg_replace_callback(
    "/[\x{ff01}-\x{ff5e}]/u",
    function($c) {
        // convert the 3-byte UTF-8 sequence to its code point
        $code = ((ord($c[0][0])&0xf)<<12)|((ord($c[0][1])&0x3f)<<6)|(ord($c[0][2])&0x3f);
        // the full-width forms sit 0xfee0 above their ASCII counterparts
        return chr($code-0xfee0);
    },
    $str);
But I want to go in the opposite direction. I tried changing the (-) sign to (+) in the return statement, but that didn't work.
This is simple with PHP's mb_convert_kana function; see http://php.net/manual/en/function.mb-convert-kana.php. At a minimum you want the R mode, which converts "han-kaku" (half-width) letters to "zen-kaku" (full-width) letters.
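As a rough illustration of how the mode letters differ, here is a minimal sketch. It assumes the mbstring extension is loaded and that the source file is saved as UTF-8; the sample string and variable names are made up for the example.

```php
<?php
// Assumption: mbstring is available and this file is UTF-8 encoded.
$str = 'Game2';

// "R": half-width letters -> full-width letters; digits are left alone.
$lettersOnly = mb_convert_kana($str, 'R', 'UTF-8');

// "A": half-width letters and digits -> full-width.
$alnum = mb_convert_kana($str, 'A', 'UTF-8');

// "r": the reverse direction, full-width letters -> half-width letters.
$roundTrip = mb_convert_kana($lettersOnly, 'r', 'UTF-8');

echo $lettersOnly, "\n", $alnum, "\n", $roundTrip, "\n";
```

Passing the encoding explicitly avoids depending on the configured internal encoding.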
The first problem is the pattern. "/[\x{ff01}-\x{ff5e}]/u" matches full-width characters. To go the other way you have to match the half-width characters instead, so I changed it to "/[\x{0021}-\x{007e}]/u". The Unicode table for that range is here: http://jrgraphix.net/r/Unicode/0020-007F
The second problem is one of encoding, I think. The quoted code decodes a UTF-8 sequence into its code point and then subtracts an offset to get an ASCII code. chr() only ever returns a single byte, and no single byte can represent a full-width letter, so for this direction you have to build the character from its Unicode code point instead.
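To make the chr() limitation concrete, here is a small sketch. It assumes PHP 7.0 or later (for the "\u{...}" escape); the variable names are only for illustration.

```php
<?php
// 0x47 ('G') + 0xFEE0 = 0xFF27, the code point of full-width G.
$code = ord('G') + 0xFEE0;

// chr() reduces its argument modulo 256, so only the low byte 0x27 survives.
$wrong = chr($code);

// The real character is a three-byte UTF-8 sequence.
$right = "\u{FF27}";

echo bin2hex($wrong), "\n"; // 27
echo bin2hex($right), "\n"; // efbca7
```

So simply flipping the sign in the quoted callback produces an apostrophe-like byte rather than a full-width letter.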
I use ord() first to get the ASCII code of the character and add 65248 (0xFEE0) to it. Then I convert that number to hex, put "\u" in front of it, and wrap the result in double quotes so that json_decode() turns the escape into the actual character.
$str = "Game some other text by ヴィックサ";
$str = preg_replace_callback(
    "/[\x{0021}-\x{007e}]/u",
    function($c) {
        // build a JSON string such as "\uff27" and let json_decode() turn it into UTF-8
        return json_decode('"'.('\\u'.dechex(ord($c[0])+65248)).'"');
    },
    $str);
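If json_decode() feels like a detour, the same arithmetic can be sketched with mb_chr(), which builds a character directly from a code point. This assumes PHP 7.2 or later with mbstring; toFullWidth is a hypothetical helper name, not something from the quoted answer.

```php
<?php
// Hypothetical helper: shifts printable ASCII (U+0021..U+007E) up by 0xFEE0.
// Assumption: PHP 7.2+ with the mbstring extension, input encoded as UTF-8.
function toFullWidth(string $str): string
{
    $result = preg_replace_callback(
        '/[\x{21}-\x{7e}]/u',
        function (array $m): string {
            // ord() is safe here because every match is a single ASCII byte.
            return mb_chr(ord($m[0]) + 0xFEE0, 'UTF-8');
        },
        $str
    );

    // preg_replace_callback() returns null on error (e.g. invalid UTF-8);
    // fall back to the untouched input in that case.
    return $result ?? $str;
}

echo toFullWidth('Game some other text by ヴィックサ'), "\n";
```

Spaces (U+0020) and anything outside the ASCII range, such as the katakana, pass through unchanged.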
I couldn't use mb_convert_kana() myself. I don't know why, but I think it's because I was working with Korean strings, not Japanese ones.
I'm not good at English but I hope this explanation helps you.
There's an easier way to do it:
$str = "Game";
// Becomes "Ｇａｍｅ"
$wideStr = mb_convert_kana($str, "R");