I have the following problem when using tFPDF for generating PDF files in PHP. I receive UTF-8 strings in all kinds of languages (Japanese, Chinese, Arabic, Hindi, etc.) which I need to print correctly in the PDF.
However, since there is no OTF/TTF font (at least none I am aware of) which supports all alphabets at once, there are always some letters which are not printed correctly in the PDF. For example, doing:
$pdf->AddFont('AsianFonts', '', 'mplus-2c-regular.ttf', true);
$pdf->AddFont('Arabic', '', 'aealarabiya.ttf', true);
adds an "AsianFonts" font which supports Japanese, Chinese, etc. characters, and a separate "Arabic" font for Arabic, Hindi, etc.
However, I cannot do:
$pdf->SetFont('AsianFonts', '', 10);
$pdf->Cell(0, 10, $string_to_be_printed);
because $string_to_be_printed can contain any UTF-8 characters, and in this case the Arabic letters won't be rendered correctly (they will be replaced by dummy squares).
My idea was to somehow detect the language of the input string (based on the unicode?) and based on that I would set the specific font. Something like:
$language = detect_language($string);
switch ($language) {
    case 'asian':
        $pdf->SetFont('AsianFonts', '', 10);
        break;
    case 'arabic':
        $pdf->SetFont('Arabic', '', 10);
        break;
}
$pdf->Cell(0, 10, $string);
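For reference, here is a minimal sketch of what the hypothetical detect_language() helper above could look like, using PCRE Unicode script properties (the /u modifier is required). The function name and the returned labels ('asian', 'arabic') are my own assumptions to match the switch above; they are not part of tFPDF. Note that this really detects the *script*, not the language, and it follows the grouping used above (Devanagari for Hindi is put in the 'arabic' branch only because that branch selects the font intended for those strings).

```php
<?php
// Sketch of a script detector for choosing a font per string.
// Assumption: one dominant script per string; mixed-script strings
// would need per-character handling instead.
function detect_language(string $string): string
{
    // Han covers Chinese characters (and Japanese kanji);
    // Hiragana/Katakana cover Japanese kana.
    if (preg_match('/[\p{Han}\p{Hiragana}\p{Katakana}]/u', $string)) {
        return 'asian';
    }
    // Arabic script, plus Devanagari (Hindi) per the grouping above.
    if (preg_match('/[\p{Arabic}\p{Devanagari}]/u', $string)) {
        return 'arabic';
    }
    return 'latin'; // fall back to a default font
}
```

A caveat: for a string mixing scripts (e.g. a Japanese sentence quoting an Arabic word), a single SetFont() call can never be enough; you would have to split the string into script runs and switch fonts between Cell() calls.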
My question is: does anyone have similar experience with this problem? Is there maybe some better solution (like "one font to rule them all" - the standard DejaVu and Arial do not seem to work)? What is the best way to detect a language (or script) from a UTF-8 string? Thanks.
OFF-TOPIC: How do ''standard editors'' (e.g. Notepad++) manage to display any character without problems? Even in the command line of a PuTTY console, all characters are shown correctly.