Fixing file encoding when downloading a file from Linux to Windows in PHP

OK, I have an issue. I have a Linux web server (RHEL 4 with Apache 2) that houses an application. Part of this application is a set of PHP scripts. I created a script that accepts some form variables and then downloads a file to the user. Here is the code:

header('Content-Description: File Transfer');
header('Content-Type: application/octet-stream');
header('Content-Disposition: attachment; filename='.$destFileName);
header('Expires: 0');
header('Cache-Control: must-revalidate, post-check=0, pre-check=0');
header('Pragma: public');
header('Content-Length: ' . filesize($fullPath));
ob_clean();
flush();
readfile($fullPath);

This all works fine, and the file gets downloaded. But there is a problem. These files are ALWAYS downloaded from the Linux box to a Windows machine, and the problem is the encoding. When you look at the file on the Linux box, all the text is aligned and all the columns look fine (the files are just flat text files). But when the file is downloaded onto the Windows box and opened in Notepad, it is all fouled up and nothing is aligned. You also see weird characters (the ones that look like a box, which is just the generic representation of an unknown character). When this file is imported into another program, it does not work.

However, when I open the file in WordPad, all the text looks correct. If I save it from WordPad, it imports correctly and looks correct in Notepad.

I don't have much knowledge of file encoding, so any information on how I can encode the file before sending it to the user for download would be great.

I did try replacing the readfile($fullPath); with:

$handle = @fopen($fullPath, "r");
if ($handle) {
    while (!feof($handle)) {
        $buffer = fgets($handle);
        $buffer = str_replace('\n', '\r\n', $buffer);
        echo $buffer;
    }
    fclose($handle);
}

Thanks!

There's an issue with the following line:

$buffer = str_replace('\n', '\r\n', $buffer);

You'd need to use double quotes there. "\n" is a newline. '\n' is the literal character sequence backslash-n:

# php -r "var_dump('\n', \"\n\");"
string(2) "\n"
string(1) "
"
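With double quotes in place, the conversion itself could look something like the sketch below. The helper name toCrlf() is made up for illustration. It first collapses any existing CRLF (or lone CR) to LF so that a file which already has Windows line endings does not end up with doubled carriage returns.

```php
<?php
// toCrlf() is a hypothetical helper name, not a PHP built-in.
// It normalizes every line ending in $text to Windows-style CRLF.
function toCrlf($text)
{
    // Double-quoted strings make PHP interpret \r and \n as control characters.
    // Step 1: collapse CRLF and lone CR to LF so nothing gets doubled.
    $unix = str_replace(array("\r\n", "\r"), "\n", $text);
    // Step 2: expand every LF to CRLF.
    return str_replace("\n", "\r\n", $unix);
}

// In the download script from the question, this would stand in for readfile():
//   $body = toCrlf(file_get_contents($fullPath));
//   header('Content-Length: ' . strlen($body)); // size differs after conversion
//   echo $body;
```

Note that the converted text is longer than the file on disk, so a Content-Length header based on filesize($fullPath) would no longer match. Computing it from the converted string, as in the commented usage, avoids a truncated download.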

There are Unix utilities, 'unix2dos' and 'dos2unix', that might help. You could call them from PHP via a system call.
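A rough sketch of the system-call route, assuming unix2dos is installed and on the web server's PATH, and that exec() is not disabled in php.ini. The function name crlfCopyViaUnix2dos() is made up for illustration. It converts a temporary copy in place, so the original file on the server stays untouched.

```php
<?php
// crlfCopyViaUnix2dos() is a hypothetical helper name.
// Returns the path of a CRLF-converted temporary copy of $path, or null on failure.
function crlfCopyViaUnix2dos($path)
{
    $tmp = tempnam(sys_get_temp_dir(), 'crlf');
    if ($tmp === false) {
        return null;
    }
    if (!copy($path, $tmp)) {
        unlink($tmp);
        return null;
    }
    // escapeshellarg() protects against spaces and shell metacharacters in the path.
    // With a single file argument, unix2dos converts that file in place.
    $output = array();
    $status = 1;
    exec('unix2dos ' . escapeshellarg($tmp) . ' 2>&1', $output, $status);
    if ($status !== 0) {
        // Non-zero exit status: the tool is missing or the conversion failed.
        unlink($tmp);
        return null;
    }
    return $tmp;
}
```

The caller would then send the temporary file with readfile(), take Content-Length from filesize() of that copy, and unlink() it afterwards.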

Or, I'm sure there is a php version of the same thing.

But I'm not a php guy.

EDIT: I did not know that about PHP quoting. Still, you may need to decide on a standard encoding if multiple languages will be used, so the rest of this post is still valid.

Windows generally expects text files to be in the system's local "ANSI" code page (cp1252, for example), of which plain ASCII is a subset.

It might be easiest just to encode it all in UTF-8, and then tell Notepad to read the file as a UTF-8 document. (There is an encoding drop-down in the File->Open dialog.)

I do not see a way to specify the encoding from the command line, and I am not sure that Notepad will detect it automatically.
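One thing that does help Notepad detect the encoding is a UTF-8 byte order mark at the very start of the file. A minimal sketch of that idea in PHP follows. The helper name toUtf8WithBom() is made up, and the source encoding ISO-8859-1 is only an assumption, since the question never says what encoding the files on the Linux box actually use. Note that a BOM can confuse programs that do not expect one, so it is worth testing against the importing program.

```php
<?php
// toUtf8WithBom() is a hypothetical helper name.
// $from is the ASSUMED encoding of the source text; ISO-8859-1 is a guess,
// so check what the files on the server really use before relying on it.
function toUtf8WithBom($text, $from = 'ISO-8859-1')
{
    // iconv() returns false if the conversion fails (for example, invalid input).
    $utf8 = iconv($from, 'UTF-8', $text);
    if ($utf8 === false) {
        return false;
    }
    // EF BB BF is the UTF-8 byte order mark that Notepad uses to detect UTF-8.
    return "\xEF\xBB\xBF" . $utf8;
}
```

As with any rewriting of the file body, Content-Length would need to come from strlen() of the converted string rather than from filesize() of the original.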