I am writing some PHP code on Windows 2003, using XAMPP Portable (copied to D:). The problem:
    $path = 'D:\ebooks';
    $all_file = scandir($path);
    foreach ($all_file as $file) {
        if (is_dir("$path/$file") && $file != '.' && $file != '..') {
            echo $file . "<br />\n";
        }
    }
When I call the script from a browser, I don't see any of the directories within D:\ebooks whose names contain Unicode characters (I tested with Vietnamese, Japanese, Chinese, and Czech).
But if I remove the is_dir("$path/$file") check, the directories are displayed with many strange characters and many ??? characters.
How can I solve the problem?
Unfortunately, there are a lot of bugs related to PHP's access to the Windows filesystem. While Windows stores filenames as UTF-16, PHP's internals use the much older ANSI APIs. So it's best to work only with filenames that are in the ASCII range, or to switch to a different operating system.
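A small sketch of what "stick to ASCII" could look like in practice. The helper names is_ascii_name() and looks_lossy() are made up for this example, and the second check rests on an assumption: that the "???" in the question comes from characters that could not be mapped to the ANSI code page and were replaced with '?'. Since '?' is not allowed in a real Windows filename, seeing one in scandir() output would then mean the name was damaged and cannot be passed back to is_dir() or fopen().

```php
<?php
// Hypothetical helpers for illustration; neither is part of PHP itself.

// True when every byte of the name is 7-bit ASCII.
function is_ascii_name($name)
{
    return preg_match('/[^\x00-\x7F]/', $name) === 0;
}

// True when the name contains '?'. Windows does not allow '?' in
// filenames, so (assumption) it marks a character lost in conversion.
function looks_lossy($name)
{
    return strpos($name, '?') !== false;
}

// Sample names as they might come back from scandir().
$names = array('php', 'Ti?ng Vi?t', 'notes.txt');
foreach ($names as $name) {
    $status = looks_lossy($name) ? 'damaged name, do not use as a path' : 'usable';
    echo $name, ': ', $status, "\n";
}
```

This only detects the problem; it does not recover the real name.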
This might help with Czech, but it won't help with CJK (I am still looking for a solution to that myself). The (shaky) rationale: UTF-8 is fine as long as the characters can be mapped to the Windows character set.
PHP is unpleasantly clear about this:
scandir() and readdir() do not support Unicode filenames
http://bugs.php.net/bug.php?id=34574
«PHP does not support unicode operations until PHP6.»
written in September 2005.
You could also try to grab the DOS short-name equivalent by running system('dir /X'), parsing the output for short and long names, and working from there: display the long names, but use the short names for actual access (stat, timestamps, and of course opening files).
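A rough sketch of the parsing half of that idea. Everything here is an assumption rather than a tested recipe: the function name parse_dir_x_listing() is made up, the regular expression expects the English-locale layout of dir /X (date, time, <DIR> or size, optional short name, long name), and it assumes 8.3 name generation is enabled on the volume, so that any name containing a space has a short name in front of it. The listing text is assumed to be UTF-8 already; getting it out of cmd.exe in that form is a separate problem that this sketch does not handle.

```php
<?php
// Hypothetical helper: parse the text of a `dir /X` listing into entries.
// Returns a list of arrays with keys 'long', 'short' and 'is_dir'.
function parse_dir_x_listing($listing)
{
    $entries = array();
    foreach (preg_split('/\R/', $listing) as $line) {
        $line = rtrim($line);
        // date, time (optional AM/PM), <DIR> or size, then the name columns
        $pattern = '/^\d{2}\/\d{2}\/\d{4}\s+\d{1,2}:\d{2}(?:\s[AP]M)?\s+(<DIR>|[\d,.]+)\s+(.+)$/u';
        if (!preg_match($pattern, $line, $m)) {
            continue; // header, footer or blank line
        }
        $rest = $m[2];
        if ($rest === '.' || $rest === '..') {
            continue;
        }
        // If the first token looks like an 8.3 name and more text follows,
        // treat it as the short name; otherwise the name is its own short name.
        $parts = preg_split('/\s+/', $rest, 2);
        if (count($parts) === 2 && preg_match('/^[^\s.]{1,8}(\.[^\s.]{1,3})?$/', $parts[0])) {
            $short = $parts[0];
            $long = $parts[1];
        } else {
            $short = $rest;
            $long = $rest;
        }
        $entries[] = array('long' => $long, 'short' => $short, 'is_dir' => $m[1] === '<DIR>');
    }
    return $entries;
}

// Made-up sample listing in the assumed layout.
$sample = <<<'TXT'
 Directory of D:\ebooks

01/02/2010  10:00 AM    <DIR>                       .
01/02/2010  10:00 AM    <DIR>                       ..
01/02/2010  10:00 AM    <DIR>          TINGVI~1     Tiếng Việt
01/02/2010  10:00 AM    <DIR>                       php
01/02/2010  10:00 AM             1,234 NOTES~1.TXT  notes on php.txt
               1 File(s)          1,234 bytes
TXT;

foreach (parse_dir_x_listing($sample) as $entry) {
    // Show the long name to the user, but build paths from the short name.
    echo $entry['long'], ' => ', $entry['short'], "\n";
}
```

With a mapping like this, the script would echo the 'long' value and pass "$path/" followed by the 'short' value to is_dir(), filemtime(), fopen() and so on.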